The creation of high-fidelity synthetic data has long been an important goal in machine learning, particularly in fields like finance, where the lack of available training and test data makes it impossible to use many of the deep learning techniques that have proven so powerful in other domains. Despite ample research into synthetic generation techniques, which in recent years has largely focused on generative adversarial networks, key gaps remain in many of the architectures and techniques in use. In particular, no technique is currently available that can generate multiple series concurrently, capture the specific stylized facts of financial time series, and incorporate additional information that affects the series, such as macroeconomic factors. In this thesis, we propose the Conditional Market Transformer-Encoder Generative Adversarial Network (C-MTE-GAN), a novel generative adversarial neural network architecture that addresses the aforementioned challenges. C-MTE-GAN is able to capture the relevant univariate stylized facts, such as lack of autocorrelation of returns, volatility clustering, fat tails, and the leverage effect. It is also able to capture the multivariate interactions between multiple concurrently generated series, such as correlation and tail dependence. Lastly, we are able to condition the generated series both on a prior series of returns and on different types of relevant information that typically affect the characteristics of the market and factor into asset allocation decisions. Furthermore, we demonstrate the effectiveness of data generated by C-MTE-GAN in augmenting the training of a statistical arbitrage model and improving its performance in realistic portfolio allocation scenarios.
The abilities of this architecture represent a substantial step forward in financial time series generation which will hopefully unlock many new applications of synthetic data within the realm of finance.
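The univariate stylized facts named in the abstract can be illustrated with simple sample statistics. The following is a minimal sketch in plain Python, not the evaluation procedure used in the thesis: the function names (`autocorr`, `excess_kurtosis`, `pearson`, `stylized_facts`), the choice of lag 1, and the toy series are all assumptions made for illustration. Under these conventions, near-zero autocorrelation of returns, positive autocorrelation of absolute returns (volatility clustering), positive excess kurtosis (fat tails), and a negative correlation between a return and the next squared return (leverage effect) would be the patterns expected of realistic data.

```python
from math import sqrt


def _mean(xs):
    return sum(xs) / len(xs)


def autocorr(xs, lag=1):
    """Sample autocorrelation of xs at a positive lag."""
    m = _mean(xs)
    d = [x - m for x in xs]
    num = sum(a * b for a, b in zip(d[:-lag], d[lag:]))
    den = sum(a * a for a in d)
    return num / den


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    m = _mean(xs)
    d = [x - m for x in xs]
    var = _mean([a * a for a in d])
    return _mean([a ** 4 for a in d]) / var ** 2 - 3.0


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = _mean(xs), _mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def stylized_facts(returns, lag=1):
    """Summary statistics for a single return series (illustrative only)."""
    r = list(returns)
    return {
        # Linear autocorrelation of returns: expected to be close to zero.
        "return_autocorr": autocorr(r, lag),
        # Autocorrelation of |r|: positive under volatility clustering.
        "abs_return_autocorr": autocorr([abs(x) for x in r], lag),
        # Positive values indicate tails fatter than a Gaussian.
        "excess_kurtosis": excess_kurtosis(r),
        # corr(r_t, r_{t+lag}^2): negative under the leverage effect.
        "leverage_corr": pearson(r[:-lag], [x * x for x in r[lag:]]),
    }


# Hypothetical toy series: a calm regime followed by a volatile one.
toy_returns = [1, -1, 1, -1, 5, -5, 5, -5]
facts = stylized_facts(toy_returns)
```

Running `stylized_facts` on both real and generated returns and comparing the resulting dictionaries is one simple way to check whether a generator reproduces these properties. The toy series above only exercises the functions and is not representative of market data.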
Francesco Mondada, Alexandre Massoud Alahi, Vaios Papaspyros