Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks. Usually heterogeneity in the context of computing referred to different instruction-set architectures (ISA), where the main processor has one and other processors have another - usually a very different - architecture (maybe more than one), not just a different microarchitecture (floating point number processing is a special case of this - not usually referred to as heterogeneous). In the past heterogeneous computing meant different ISAs had to be handled differently, while in a modern example, Heterogeneous System Architecture (HSA) systems eliminate the difference (for the user) while using multiple processor types (typically CPUs and GPUs), usually on the same integrated circuit, to provide the best of both worlds: general GPU processing (apart from the GPU's well-known 3D graphics rendering capabilities, it can also perform mathematically intensive computations on very large data-sets), while CPUs can run the operating system and perform traditional serial tasks. The level of heterogeneity in modern computing systems is gradually increasing as further scaling of fabrication technologies allows for formerly discrete components to become integrated parts of a system-on-chip, or SoC. For example, many new processors now include built-in logic for interfacing with other devices (SATA, PCI, Ethernet, USB, RFID, radios, UARTs, and memory controllers), as well as programmable functional units and hardware accelerators (GPUs, cryptography co-processors, programmable network processors, A/V encoders/decoders, etc.). Recent findings show that a heterogeneous-ISA chip multiprocessor that exploits diversity offered by multiple ISAs can outperform the best same-ISA homogeneous architecture by as much as 21% with 23% energy savings and a reduction of 32% in Energy Delay Product (EDP).

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (5)
CS-470: Advanced computer architecture
The course studies techniques to exploit Instruction-Level Parallelism (ILP) statically and dynamically. It also addresses some aspects of the design of domain-specific accelerators. Finally, it explo
CS-476: Embedded system design
Hardware-software co-design is a well known concept in embedded system design.It is also a concept required in designing FPGA-accelerators in data-centers.This course teaches how to transform algorith
MATH-454: Parallel and high-performance computing
This course provides insight into a broad variety of High Performance Computing (HPC) concepts and the majority of modern HPC architectures. Moreover, the student will learn to have a feeling about wh
Show more
Related lectures (25)
Parallelism: Programming and Performance
Explores parallelism in programming, emphasizing trade-offs between programmability and performance, and introduces shared memory parallel programming using OpenMP.
GPU Advanced: Memory Management and Optimization
Explores advanced GPU memory management, CUDA programming, and data-parallel computing for optimizing performance.
Multi Masters Systems
Explores Multi Masters Systems, discussing architectures with multiple processors, shared memory, mutual exclusion, and hardware accelerators.
Show more
Related publications (151)

DBFS: Dynamic Bitwidth-Frequency Scaling for Efficient Software-defined SIMD

Giovanni Ansaloni, Alexandre Sébastien Julien Levisse, Pengbo Yu, Flavio Ponzina

Machine learning algorithms such as Convolutional Neural Networks (CNNs) are characterized by high robustness towards quantization, supporting small-bitwidth fixed-point arithmetic at inference time with little to no degradation in accuracy. In turn, small ...
2024

HEEPocrates: An Ultra-Low-Power RISC-V Microcontroller for Edge-Computing Healthcare Applications

David Atienza Alonso, Alexandre Sébastien Julien Levisse, Miguel Peon Quiros, Simone Machetti, Pasquale Davide Schiavone

The field of edge computing in healthcare has seen remarkable growth due to the increasing demand for real-time processing of data in applications. However, challenges persist due to limitations in healthcare devices' performance and power efficiency. To o ...
Europractice2024

Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads

David Atienza Alonso, Marina Zapater Sancho, Luis Maria Costero Valero, Darong Huang, Qunyou Liu

The increasing demand for computing power and the emergence of heterogeneous computing architectures have driven the exploration of innovative techniques to address current limitations in both the compute and memory subsystems. One such solution is the use ...
2024
Show more
Related concepts (16)
Multiprocessor system on a chip
A multiprocessor system on a chip ( (), ˌɛmˌpiː'sɒk or ˌɛmˌpiːˌɛsˌoʊˈsiː ) is a system on a chip (SoC) which includes multiple microprocessors. As such, it is a multi-core system on a chip. MPSoCs are usually targeted for embedded applications. It is used by platforms that contain multiple, usually heterogeneous, processing elements with specific functionalities reflecting the need of the expected application domain, a memory hierarchy and I/O components.
Vision processing unit
A vision processing unit (VPU) is (as of 2023) an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks. Vision processing units are distinct from video processing units (which are specialised for video encoding and decoding) in their suitability for running machine vision algorithms such as CNN (convolutional neural networks), SIFT (scale-invariant feature transform) and similar.
Manycore processor
Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores (from a few tens of cores to thousands or more). Manycore processors are used extensively in embedded computers and high-performance computing. Manycore processors are distinct from multi-core processors in being optimized from the outset for a higher degree of explicit parallelism, and for higher throughput (or lower power consumption) at the expense of latency and lower single-thread performance.
Show more

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.