Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing
Nowadays, processing systems are asked to support increasingly complex and demanding high-performance applications, especially in the signal processing and video processing domains. The design of these systems is becoming extremely challenging because of sev ...
Nowadays, the design flow of complex signal processing embedded systems starts with a specification of the application by means of a large, sequential program (usually in C/C++). As we enter the multicore era, sequential programs are no longer ...
Until recently, the ever-increasing demand for computing power has been met on the one hand by increasing the operating frequency of processors and on the other hand by designing architectures capable of exploiting parallelism at the instruction level through h ...
The multicore revolution and the ever-increasing complexity of computing systems are dramatically changing the design, analysis and programming of computing platforms. Future architectures will feature hundreds to thousands of simple processors and on-ch ...
Every wave solver serving the computational study of waves faces a trade-off between two figures of merit: its computational speed and its accuracy. The use of Discontinuous Galerkin (DG) methods on graphical processing units (GPUs) significantly lowers the cost ...
Fine-grain data parallelism is increasingly common in mainstream processors in the form of long vectors and on-chip GPUs. This paper develops compiler and runtime support to exploit such data parallelism for non-numeric, non-graphic, irregular parallel tas ...
Classical list scheduling is a very popular and efficient technique for scheduling jobs on parallel platforms. However, as the number of processors increases, the cost of managing a single centralized list becomes prohibitive. The objective of this wor ...
This paper presents a real-time processing platform for high-definition stereo video. The system is capable of processing stereo video streams at resolutions up to 1920x1080 at 30 frames per second (1080p30). In the hybrid FPGA-GPU-CPU system, a high-density ...
A technique to speed up Montgomery multiplication targeted at the Synergistic Processor Elements (SPE) of the Cell Broadband Engine is proposed. The technique consists of splitting a number into four consecutive parts. These parts are placed one by one in ...
Springer-Verlag New York, Ms Ingrid Cunningham, 175 Fifth Ave, New York, NY 10010, USA, 2010
This paper describes a methodology for the optimization of portable parallel signal processing applications specified by dataflow programs. The use of dataflow as a programming model for signal processing applications targeting parallel platforms provides ...
IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331, USA, 2011