Low power (mW) and high performance (GOPS) are key requirements for compute-intensive signal processing in e-health, Internet-of-Things, and wearable applications. This work presents a building block for programmable ultra-low-power accelerators: a tightly coupled computing cluster that supports both parallel and sequential execution at high energy efficiency over a wide range of workload requirements. The cluster, implemented in 28 nm UTBB FD-SOI technology, achieves peak energy efficiency in the near-threshold (NVT) operating region: 193 MOPS/mW at 162 MOPS and 0.46 V for parallel workloads, and 90 MOPS/mW at 68 MOPS and 0.5 V for sequential workloads. The energy-efficient operating range is wide (0.32 V to 1.15 V), and the cluster also meets the design goal of 1 GOPS within a 10 mW power envelope (at 0.66 V).
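The figures quoted above can be related to one another with simple arithmetic. The following is a minimal sketch, assuming that energy efficiency is defined as throughput divided by power, so that 1 MOPS/mW corresponds to 10^9 operations per joule. The function names are illustrative and do not come from the original work.

```python
# Illustrative helpers (hypothetical names), assuming
# efficiency [MOPS/mW] = throughput [MOPS] / power [mW].

def implied_power_mw(throughput_mops: float, efficiency_mops_per_mw: float) -> float:
    """Power in mW implied by a throughput and an energy efficiency."""
    return throughput_mops / efficiency_mops_per_mw


def energy_per_op_pj(efficiency_mops_per_mw: float) -> float:
    """Energy per operation in pJ; 1 MOPS/mW = 1e9 ops/J = 1000 pJ per op."""
    return 1e3 / efficiency_mops_per_mw


def required_efficiency(throughput_mops: float, power_budget_mw: float) -> float:
    """Minimum efficiency (MOPS/mW) needed to fit a throughput in a power budget."""
    return throughput_mops / power_budget_mw


if __name__ == "__main__":
    # Operating points quoted in the abstract.
    points = {
        "parallel (0.46 V)": (162.0, 193.0),
        "sequential (0.5 V)": (68.0, 90.0),
    }
    for name, (mops, eff) in points.items():
        print(f"{name}: {implied_power_mw(mops, eff):.3f} mW, "
              f"{energy_per_op_pj(eff):.2f} pJ/op")
    # Design goal: 1 GOPS (1000 MOPS) within 10 mW.
    print(f"goal needs >= {required_efficiency(1000.0, 10.0):.0f} MOPS/mW")
```

Under this assumption, both peak-efficiency points draw less than 1 mW, and the 1 GOPS design goal corresponds to at least 100 MOPS/mW at the 0.66 V operating point.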