Micro BTB: A High Performance and Storage Efficient Last-Level Branch Target Buffer for Servers

High-performance branch target buffers (BTBs) and the L1I cache are key to a high-performance front end. Modern branch predictors are highly accurate, but as the code footprint of modern server workloads grows, BTB and L1I misses remain frequent. Recent industry trends show the use of large BTBs (hundreds of KB per core) that deliver performance close to that of an ideal BTB, along with a decoupled front end that enables efficient fetch-directed L1I instruction prefetching. On the other hand, techniques proposed by academia, such as BTB prefetching and using the retire-order stream for learning, fail to provide significant performance gains on modern processor cores, which are deeper and wider.
