This article examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or whether it is also more cost-effective. To answer this question, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM-5. The performance model uses Kruskal and Weiss's fork-join model to account for the effect of event processing time variability on WWT's conservative fixed-window simulation algorithm. A generalization of Thiebaut and Stone's footprint model accurately predicts the effect of cache interference on the CM-5. The model is calibrated using parameters extracted from a fully parallel simulation (p = N), and validated by measuring the speedup as the number of host processors (p) ranges from 1 to the number of target nodes (N). Together with simple cost models, the performance model indicates that for target systems of 32 or more nodes, parallel simulation is more cost-effective than sequential simulation. The key intuition behind this result is that large simulations require large memories, which dominate the cost of a uniprocessor; parallel computers allow multiple processors to simultaneously access this large memory.
James Gonzalo King, Pramod Shivaji Kumbhar, Iain Hepburn, Weiliang Chen, Tristan Mathieu Carel, Alessandro Cattabiani, Nicola Cantarutti, Omar Awile, Christos Kotsalos, Samuel Marie A Melchior, Baudouin Paul Michel Maria Joseph Del Marmol, Giacomo Castiglioni
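The cost argument in the abstract can be illustrated with a minimal sketch. It assumes one common criterion for cost-effectiveness: a parallel host is cost-effective when its speedup over the uniprocessor exceeds its relative cost increase (called costup here). The function names, the linear cost model, and all prices are hypothetical and chosen only for illustration; they are not the article's cost models or measurements.

```python
# Illustrative cost model. All prices are made-up assumptions,
# not figures from the article.

def host_cost(p, total_memory_mb, cost_per_processor=1.0, cost_per_mb=0.05):
    """Cost of a host with p processors that together hold the simulation's memory."""
    return p * cost_per_processor + total_memory_mb * cost_per_mb


def costup(p, total_memory_mb, **prices):
    """Cost of a p-processor host relative to a uniprocessor with the same memory."""
    return host_cost(p, total_memory_mb, **prices) / host_cost(1, total_memory_mb, **prices)


def is_cost_effective(speedup, p, total_memory_mb, **prices):
    """Assumed criterion: parallel simulation pays off when speedup exceeds costup."""
    return speedup > costup(p, total_memory_mb, **prices)


if __name__ == "__main__":
    # Large simulation: memory dominates the cost, so adding processors is cheap
    # relative to the whole machine and even a modest speedup is cost-effective.
    print(costup(32, 1000), is_cost_effective(8, 32, 1000))
    # Small simulation: processors dominate the cost, so the same speedup is not enough.
    print(costup(32, 10), is_cost_effective(8, 32, 10))
```

Under these assumed prices, the same speedup of 8 on 32 processors is cost-effective when the simulation needs a large memory and not when it needs a small one, which mirrors the intuition stated in the abstract.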