Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
This lecture introduces PAX, a hybrid solution for efficient data storage, decomposing slotted-pages into mini-pages per attribute to improve cache-friendliness and reduce I/O delays. It discusses how PAX can replace NSM in-place, its adoption by major database systems, and its benefits for analytical queries. The lecture also covers Parquet, a columnar storage format for Hadoop, which enables efficient data processing by storing only relevant data and supporting nested data structures.