Modern data management systems aim to provide both cutting-edge functionality and hardware efficiency. With the advent of AI-driven data processing and the post-Moore's Law era, traditional memory-bound, scale-up data management operations face scalability challenges. Offloading complex analytical workloads to accelerators such as GPUs has long been explored as an alternative, at the cost of moving data over an interconnect: GPUs typically provide massive parallelism and high-bandwidth memory, while CPUs act as near-data processors and coordinators that are often memory-bound. In this work, we provide a first look at an architecture that mixes the best of the CPU and GPU worlds: high-bandwidth memory (HBM), core-local accelerators for matrix multiplication (AMX), and native half-precision data processing inside 4th-generation Intel Xeon Scalable processors, known as Sapphire Rapids. We analyze the system, give an overview of its hierarchical NUMA architecture, examine its individual components, and explore their interplay: how they affect the traditional DRAM bandwidth wall, both for typical data access patterns and for novel AI-DB interactions based on vector data processing.
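To make the AMX component concrete, below is a minimal sketch of a single BF16 tile multiplication using Intel's AMX intrinsics on Linux. It is not code from the paper: the buffer names and test values are illustrative, while the permission syscall, tile configuration, and tile shapes follow Intel's documented AMX programming model (compile with a recent gcc/clang and `-mamx-tile -mamx-bf16`).

```c
// Minimal sketch: one AMX BF16 tile multiply-accumulate on Sapphire Rapids.
// C (16x16 fp32) += A (16x32 bf16) * B (32x16 bf16, pair-packed rows).
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define ARCH_REQ_XCOMP_PERM 0x1023
#define XFEATURE_XTILEDATA  18

// 64-byte tile configuration: palette 1, 16 rows x 64 bytes per tile.
struct tile_config {
    uint8_t  palette_id;
    uint8_t  start_row;
    uint8_t  reserved[14];
    uint16_t colsb[8];
    uint8_t  rows[8];
} __attribute__((packed));

int main(void) {
    // Ask the kernel for permission to use the AMX tile state.
    if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA)) {
        perror("AMX not available");
        return 1;
    }

    struct tile_config cfg;
    memset(&cfg, 0, sizeof(cfg));
    cfg.palette_id = 1;
    for (int t = 0; t < 3; t++) { cfg.colsb[t] = 64; cfg.rows[t] = 16; }
    _tile_loadconfig(&cfg);

    // bf16 values stored as raw 16-bit words; 0x3f80 is bf16 1.0.
    uint16_t a[16][32], b[16][32];
    float    c[16][16] = {{0}};
    for (int i = 0; i < 16; i++)
        for (int j = 0; j < 32; j++) { a[i][j] = 0x3f80; b[i][j] = 0x3f80; }

    _tile_loadd(0, c, 64);      // tmm0: fp32 accumulator
    _tile_loadd(1, a, 64);      // tmm1: A operand
    _tile_loadd(2, b, 64);      // tmm2: B operand (pair-packed)
    _tile_dpbf16ps(0, 1, 2);    // tmm0 += tmm1 * tmm2
    _tile_stored(0, c, 64);
    _tile_release();

    printf("c[0][0] = %.1f (expect 32.0 for K = 32)\n", c[0][0]);
    return 0;
}
```

A real kernel would loop such tile operations over larger matrices and stream operands from HBM; this sketch only shows the per-tile building block and the one-time setup it requires.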
Anastasia Ailamaki, Viktor Sanca, Hamish Mcniece Hill Nicholson, Andreea Nica, Syed Mohammad Aunn Raza