Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Fine-grain data parallelism is increasingly common in mainstream processors in the form of long vectors and on-chip GPUs. This paper develops compiler and runtime support to exploit such data parallelism for non-numeric, non-graphic, irregular parallel tas ...
The multicore revolution and the ever-increasing complexity of computing systems is dramatically changing system design, analysis and programming of computing platforms. Future architectures will feature hundreds to thousands of simple processors and on-ch ...
Until recently, the ever-increasing demand of computing power has been met on one hand by increasing the operating frequency of processors and on the other hand by designing architectures capable of exploiting parallelism at the instruction level through h ...
Classical list scheduling is a very popular and efficient technique for scheduling jobs in parallel platforms. However, with the increasing number of processors, the cost for managing a single centralized list becomes prohibitive. The objective of this wor ...
Nowadays processing systems are asked to support increasing complex and demanding high-performance applications, especially in the signal processing and video processing domains. The design of these systems are becoming extremely challenging because of sev ...
This paper presents a real-time processing platform for high definition stereo video. The system is capable to process stereo-video streams at resolutions up to 1920x1080 at 30 frames per second (1080p30). In the hybrid FPGA-GPU-CPU system, a high-density ...
This paper describes a methodology for the optimization of portable parallel signal processing applications specified by dataflow programs. The use of dataflow as a programming model for signal processing applications targeting parallel platforms provides ...
Ieee Service Center, 445 Hoes Lane, Po Box 1331, Piscataway, Nj 08855-1331 Usa2011
Nowadays, the design flow of complex signal processing embedded systems starts with a specification of the application by means of a large and sequential program (usually in C/C++). As we are entering in the multicore era, sequential programs are no longer ...
A technique to speed up Montgomery multiplication targeted at the Synergistic Processor Elements (SPE) of the Cell Broadband Engine is proposed. The technique consists of splitting a number into four consecutive parts. These parts are placed one by one in ...
Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, Ny 10010 Usa2010
Every wave solver serving the computational study of waves meets a trade-off of two figures of merit—its computational speed and its accuracy. The use of Discontinuous Galerkin (DG) methods on graphical processing units (GPUs) significantly lowers the cost ...