Aggressive memory-level-parallelism techniques have provided significant performance gains in distributed shared memory designs. In this paper, we reevaluate speculative memory ordering in the context of chip multiprocessors (CMPs) and power-limited computation. We evaluate the relative performance of Sequential Consistency, Total Store Order, and Relaxed Memory Order on a selection of modern workloads to predict the performance of the ARM weakly consistent memory model.
Laurent Villard, Stephan Brunner, Emmanuel Lanti, Noé Thomas Elie Ohana, Claudio Gheller
David Atienza Alonso, Marina Zapater Sancho, Alexandre Sébastien Julien Levisse, Mohamed Mostafa Sabry Aly, Halima Najibi