Automatic Generation of Efficient Accelerators for Reconfigurable Hardware

Christos Kozyrakis, Yunqi Zhang
2016
Conference paper

Abstract

Acceleration in the form of customized datapaths offers large performance and energy improvements over general-purpose processors. Reconfigurable fabrics such as FPGAs are gaining popularity for implementing application-specific accelerators, which increases the importance of good high-level FPGA design tools. However, current tools for targeting FPGAs offer inadequate support for high-level programming, resource estimation, and rapid, automatic design space exploration. We describe a design framework that addresses these challenges. We introduce a new representation of hardware using parameterized templates that captures locality and parallelism information at multiple levels of nesting. This representation is designed to be automatically generated from high-level languages based on parallel patterns. We describe a hybrid area estimation technique that uses template-level models and design-level artificial neural networks to account for effects from hardware place-and-route tools, including routing overheads, register and block RAM duplication, and LUT packing. Our runtime estimation accounts for off-chip memory accesses. We use our estimation capabilities to rapidly explore a large space of designs across tile sizes, parallelization factors, and optional coarse-grained pipelining, all at multiple loop levels. We show that our estimates average 4.8% error for logic resources and 6.1% error for runtimes, and are 279 to 6533 times faster than a commercial high-level synthesis tool. We compare the best-performing designs to optimized CPU code running on a server-grade 6-core processor and show speedups of up to 16.7x.

Official source

https://infoscience.epfl.ch/record/224724?ln=fr

EdgeAI-Aware Design of In-Memory Computing Architectures

Contemporary Logic Synthesis: with an Application to AQFP Circuit Optimization

Exploring High-Performance and Energy-Efficient Architectures for Edge AI-Enabled Applications
