This paper presents a novel simulation approach for generating synthetic households, addressing several methodological gaps in the literature. Generating hierarchical datasets such as complete households is challenging, since the process must replicate the marginal distribution of each attribute while keeping the layer of individuals consistent with the layer of households. Usually, these layers are generated in two sequential processes. This paper focuses on designing a one-step simulator that integrates the relationships within both layers simultaneously. One of its major advantages is that it reduces the risk of generating illogical households. To deal with the curse of dimensionality in the simulation method, we propose a so-called divide-and-conquer modeling approach that simplifies the problem by reducing the number of variables, so that we maintain the best trade-off between the accuracy and the efficiency of the generation process. We test our method in a case study based on the 2015 Swiss census data, where we compare it with state-of-the-art approaches. The results suggest that our method generates households twice as fast as other simulation methods while preserving the same accuracy.
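The two requirements named above, replicating marginal distributions and keeping the individual and household layers consistent, can be illustrated with a small check on a synthetic population. The following is only a minimal sketch and not the simulator described in the paper: the record layout (`hh_id`, `size`, `age_group`), the toy data, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy population: each household declares a size,
# and each individual points to the household it belongs to.
households = [
    {"hh_id": 1, "size": 2},
    {"hh_id": 2, "size": 1},
]
individuals = [
    {"hh_id": 1, "age_group": "adult"},
    {"hh_id": 1, "age_group": "child"},
    {"hh_id": 2, "age_group": "adult"},
]


def inconsistent_households(households, individuals):
    """Return ids of households whose declared size differs from their member count."""
    members = Counter(person["hh_id"] for person in individuals)
    return [h["hh_id"] for h in households if members[h["hh_id"]] != h["size"]]


def marginal(records, attribute):
    """Empirical marginal distribution of one attribute as {value: share}."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
```

In this sketch, a generated population would be compared against reference data by calling `total_variation(marginal(synthetic, attr), marginal(reference, attr))` for each attribute, while `inconsistent_households` flags households where the two layers disagree.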