Modern datacenters host datasets in DRAM to offer large-scale online services with tight tail-latency requirements. Unfortunately, as DRAM is expensive and increasingly difficult to scale, datacenter operators are forced to consider denser storage technologies. While modern flash-based storage exhibits µs-scale access latency, which is well within the tail-latency constraints of many online services, the traditional demand-paging abstraction used to manage memory and storage incurs high overheads and prohibits flash usage in online services. We introduce AstriFlash, a hardware-software co-design that tightly integrates flash and DRAM with ns-scale overheads. Our evaluation of server workloads with cycle-accurate full-system simulation shows that AstriFlash achieves 95% of a DRAM-only system's throughput while maintaining the required 99th-percentile tail latency and reducing the memory cost by 20x.
Aleksandra Radenovic, Andras Kis, Mukesh Kumar Tripathi, Zhenyu Wang, Asmund Kjellegaard Ottesen, Yanfei Zhao, Guilherme Migliato Marega, Hyungoo Ji