Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full-precision floating-point arithmetic t ...
Deep neural network inference accelerators are deployed at scale to accommodate online services, but face low average load because of service demand variability, leading to poor resource utilization. Unfortunately, reclaiming inference idle cycles is diffi ...
The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full-precision floating-point arithmetic t ...
In this paper we present a reconfigurable device which significantly improves the execution time of the most computational intensive functions of three of the most widely used face recognition algorithms; those tasks multiply very large dense matrices. The ...
Ieee Service Center, 445 Hoes Lane, Po Box 1331, Piscataway, Nj 08855-1331 Usa2009
,
This paper presents Cyme, a C++ library aiming at abstracting the usage of SIMD instructions while maximizing the usage of the underlying hardware. Unlike similar efforts such as Boost.simd or VC, Cyme provides generic high level containers to the users wh ...
This paper presents software implementation speed records for modular multiplication arithmetic on the synergistic processing elements of the Cell broadband engine (Cell) architecture. The focus is on moduli which are of special interest in elliptic curve ...
Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, Ny 10010 Usa2010
This paper presents software implementation speed records for modular multiplication arithmetic on the synergistic processing elements of the Cell broadband engine (Cell) architecture. The focus is on moduli which are of special interest in elliptic curve ...
As feature sizes decrease, power dissipation and heat generation density exponentially increase. Thus, temperature gradients in Multiprocessor Systems on Chip (MPSoCs) can seriously impact system performance and reliability. Thermal balancing policies base ...
This paper presents a low-latency algorithm designed for parallel computer architectures to compute the scalar multiplication of elliptic curve points based on approaches from cryptographic side-channel analysis. A graphics processing unit implementation u ...
A modular parallel architecture for a MMSE-DFE coefficient computation processor is presented. The architecture is based on QR factorization of a channel-and-noise-dependent data matrix and is implemented using CORDIC processors within a systolic array arc ...