State-of-the-art Artificial Intelligence (AI) algorithms, such as graph neural networks and recommendation systems, require floating-point computation of very large matrix multiplications over sparse data. Executing them in resource-constrained scenarios, such as edge AI systems, requires (a) careful optimization of computing patterns, leveraging sparsity as an opportunity to lower computational requirements, and (b) dedicated hardware. In this paper, we introduce a novel near-memory floating-point computing architecture dedicated to the parallel processing of sparse matrix-vector multiplication (SpMV). This architecture can be integrated at the periphery of memory arrays, exploiting the inherent parallelism of memory structures to speed up computation. In addition, its proximity to memory yields high computational capability and very low latency. The illustrated implementation, operating at 1 GHz, sustains up to 370 MFLOPS (millions of floating-point operations per second) on SpMV workloads, while incurring a modest 17% area overhead when interfaced with a 4 KB SRAM array.
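The SpMV kernel that the architecture accelerates can be illustrated with a minimal software sketch using the common CSR (compressed sparse row) layout, which is what makes the "work scales with nonzeros" advantage of sparsity explicit. The function and variable names below are illustrative only, not taken from the paper's design:

```python
# Sketch of sparse matrix-vector multiplication (SpMV), y = A @ x,
# with A stored in CSR form. Illustrative only; the paper's hardware
# architecture parallelizes this pattern near the memory array.

def spmv_csr(values, col_idx, row_ptr, x):
    """Multiply a CSR-encoded sparse matrix by a dense vector x."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Only the nonzero entries of row i are visited, so the work
        # scales with the number of nonzeros, not with n_rows * n_cols.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# Example with A = [[10,  0,  0],
#                   [ 0, 20, 30],
#                   [ 0,  0, 40]]:
values  = [10.0, 20.0, 30.0, 40.0]
col_idx = [0, 1, 2, 2]
row_ptr = [0, 1, 3, 4]
x = [1.0, 2.0, 3.0]
print(spmv_csr(values, col_idx, row_ptr, x))  # [10.0, 130.0, 120.0]
```

Each row's dot product is independent of the others, which is the parallelism a near-memory design can exploit by processing multiple rows concurrently at the array periphery.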
David Atienza Alonso, Giovanni Ansaloni, Alexandre Sébastien Julien Levisse, Marco Antonio Rios, Flavio Ponzina
Anastasia Ailamaki, Viktor Sanca