Running Efficiently CNNs on the Edge Thanks to Hybrid SRAM-RRAM In-Memory Computing

The increasing size of Convolutional Neural Networks (CNNs) and the high computational workload required for inference pose major challenges for their deployment on resource-constrained edge devices. In this paper, we address them by proposing a novel In-Memory Computing (IMC) architecture. Our IMC strategy allows us to efficiently perform arithmetic operations based on bitline computing, enabling a high degree of parallelism while reducing energy-costly data transfers. Moreover, it features a hybrid memory structure, where a portion of each subarray, dedicated to storing CNN weights, is implemented as high-density, zero-standby-power Resistive RAM. Finally, it exploits an innovative method for storing quantized weights based on their value, named Weight Data Mapping (WDM), which further increases efficiency. Compared to state-of-the-art IMC alternatives, our solution provides up to 93% improvements in energy efficiency and up to 6x less run-time when performing inference on Mobilenet and AlexNet neural networks.

Running Efficiently CNNs on the Edge Thanks to Hybrid SRAM-RRAM In-Memory Computing

Graph Chatbot

Chat with Graph Search

Exploring High-Performance and Energy-Efficient Architectures for Edge AI-Enabled Applications

A semi-supervised algorithm for improving the consistency of crowdsourced datasets: The COVID-19 case study on respiratory disorder classification

Data-driven approaches for non-invasive cuffless blood pressure monitoring

A semi-supervised algorithm for improving the consistency of crowdsourced datasets: The COVID-19 case study on respiratory disorder classification

Exploring High-Performance and Energy-Efficient Architectures for Edge AI-Enabled Applications

Data-driven approaches for non-invasive cuffless blood pressure monitoring