In large-scale data centers, network infrastructure is becoming a major cost component. As a result, operators are trying to reduce expenses, in particular by lowering the amount of hardware needed to achieve their performance goals (or, equivalently, by improving the performance achieved with a given amount of hardware).
In this thesis, we explore the use of locality in the data center to meet these demands. In particular, we leverage locality in two different projects: (1) VNToR, which moves network virtualization from the server to the top-of-rack (ToR) switch, thereby reducing the server hardware needed to achieve a certain performance, and (2) Criss-Cross, which makes the network topology reconfigurable, thereby reducing the network hardware needed to switch typical data center workloads with a given level of performance.
VNToR exploits the locality of traffic flows as well as their long-tailed behavior in the design of a virtual flow table, which extends the hardware flow table of off-the-shelf top-of-rack switches. VNToR uses this virtual flow table in a hybrid data plane that consists of both a hardware and a software data plane. This way it can (1) store tens or even hundreds of thousands of access rules, (2) adapt to traffic-pattern changes, typically in less than one millisecond, and (3) use only commodity switching hardware with a minimal amount of data path memory, (4) all without compromising latency or throughput.
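To make the idea of a virtual flow table more concrete, the following is a minimal toy sketch, not VNToR's actual design. It assumes a small "hardware" table that caches the hottest rules and a large "software" table that holds all of them. The class name `HybridFlowTable`, its methods, and the hit-count promotion policy are all hypothetical and chosen only for illustration; the abstract does not specify how rules are placed.

```python
from collections import Counter


class HybridFlowTable:
    """Toy model: a small 'hardware' table backed by a large 'software' table.

    Hypothetical illustration only; the promotion policy (hit counts) is an
    assumption and is not taken from the thesis.
    """

    def __init__(self, hw_capacity):
        self.hw_capacity = hw_capacity
        self.software = {}     # every rule: flow key -> action
        self.hardware = {}     # hot subset of the rules
        self.hits = Counter()  # per-flow lookup counts

    def add_rule(self, flow, action):
        # New rules always land in the (large) software table first.
        self.software[flow] = action

    def lookup(self, flow):
        """Return (action, path), where path is 'hw', 'sw', or 'miss'."""
        if flow not in self.software:
            return None, "miss"
        self.hits[flow] += 1
        if flow in self.hardware:
            return self.hardware[flow], "hw"
        action = self.software[flow]
        self._maybe_promote(flow, action)
        return action, "sw"

    def _maybe_promote(self, flow, action):
        # Promote a flow into the hardware table, evicting the coldest
        # entry only if the new flow has strictly more hits.
        if len(self.hardware) >= self.hw_capacity:
            coldest = min(self.hardware, key=lambda f: self.hits[f])
            if self.hits[coldest] >= self.hits[flow]:
                return
            del self.hardware[coldest]
        self.hardware[flow] = action


table = HybridFlowTable(hw_capacity=1)
table.add_rule("a", "allow")
table.add_rule("b", "deny")
first = table.lookup("a")   # served by the software path, then promoted
second = table.lookup("a")  # now served by the hardware path
```

With long-tailed traffic, a few flows account for most lookups, so even a very small hardware table would serve most packets in this model while the software table keeps the full rule set.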
Criss-Cross is a hierarchical, reconfigurable topology for large-scale data centers. The locality in rack-level flows allows Criss-Cross to adjust its topology to the current traffic patterns. We show that Criss-Cross preserves many of the advantages of Clos topologies: (1) their hierarchy, (2) their simple routing algorithms, (3) their regular layout of connections, which simplifies physical deployment, and (4) their compatibility with existing management approaches. We demonstrate that for a group-based communication pattern, Criss-Cross improves the average flow completion time by 5.5x and the 99th percentile by 6.3x. For a purely random point-to-point traffic pattern, it improves the flow completion time by 2.2x on average and 3x at the 99th percentile.