Automatic Generation of Efficient Accelerators for Reconfigurable Hardware

Christos Kozyrakis, Yunqi Zhang
2016
Conference paper

Abstract

Acceleration in the form of customized datapaths offers large performance and energy improvements over general-purpose processors. Reconfigurable fabrics such as FPGAs are gaining popularity for implementing application-specific accelerators, which increases the importance of good high-level FPGA design tools. However, current tools for targeting FPGAs offer inadequate support for high-level programming, resource estimation, and rapid, automatic design space exploration. We describe a design framework that addresses these challenges. We introduce a new representation of hardware using parameterized templates that captures locality and parallelism information at multiple levels of nesting. This representation is designed to be automatically generated from high-level languages based on parallel patterns. We describe a hybrid area estimation technique that uses template-level models and design-level artificial neural networks to account for effects from hardware place-and-route tools, including routing overheads, register and block RAM duplication, and LUT packing. Our runtime estimation accounts for off-chip memory accesses. We use our estimation capabilities to rapidly explore a large space of designs across tile sizes, parallelization factors, and optional coarse-grained pipelining, all at multiple loop levels. We show that our estimates average 4.8% error for logic resources and 6.1% error for runtimes, and are 279 to 6533 times faster than a commercial high-level synthesis tool. We compare the best-performing designs to optimized CPU code running on a server-grade 6-core processor and show speedups of up to 16.7x.
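The exploration loop described in the abstract can be illustrated with a minimal Python sketch. It enumerates tile sizes, parallelization factors, and an optional coarse-grained pipelining toggle for a single loop level, discards points that exceed a logic budget, and keeps the point with the lowest estimated runtime. Every constant, formula, and function name below is a hypothetical placeholder chosen for illustration; the paper's actual estimators (template-level models combined with neural networks) are not reproduced here.

```python
from itertools import product
from math import ceil

# All values below are hypothetical toy numbers, not taken from the paper.
N = 4096                       # problem size in elements
LUT_BUDGET = 20000             # available logic resources
LUTS_PER_LANE = 500            # toy cost of one parallel compute lane
LUTS_PER_TILE_WORD = 2         # toy cost of buffering one tile element
PIPELINE_OVERHEAD_LUTS = 1000  # toy control cost of coarse-grained pipelining
LOAD_CYCLES_PER_WORD = 1       # toy off-chip transfer cost per element


def estimate_area(tile, par, pipelined):
    """Toy area model: compute lanes plus on-chip tile buffer."""
    area = LUTS_PER_LANE * par + LUTS_PER_TILE_WORD * tile
    if pipelined:
        # Overlapping load and compute needs a second (double) buffer.
        area += LUTS_PER_TILE_WORD * tile + PIPELINE_OVERHEAD_LUTS
    return area


def estimate_runtime(tile, par, pipelined):
    """Toy cycle model that includes off-chip tile loads."""
    num_tiles = ceil(N / tile)
    load = tile * LOAD_CYCLES_PER_WORD
    compute = ceil(tile / par)
    if pipelined:
        # Fill the pipeline once, then the slower stage sets the pace.
        return load + compute + (num_tiles - 1) * max(load, compute)
    return num_tiles * (load + compute)


def explore(tiles, pars):
    """Return (cycles, area, tile, par, pipelined) for the fastest legal point."""
    best = None
    for tile, par, pipelined in product(tiles, pars, (False, True)):
        if par > tile:
            continue  # more lanes than tile elements is wasteful
        area = estimate_area(tile, par, pipelined)
        if area > LUT_BUDGET:
            continue  # does not fit on the device
        candidate = (estimate_runtime(tile, par, pipelined), area,
                     tile, par, pipelined)
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    print(explore(tiles=[64, 256, 1024], pars=[1, 4, 16, 64]))
```

Because each estimate is a handful of arithmetic operations rather than a synthesis run, an exhaustive sweep like this stays cheap even when the space grows, which is the property the abstract relies on for rapid exploration.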

Official source

https://infoscience.epfl.ch/record/224724?ln=en
