In this thesis we investigate different ways of approximating the solution of the chemical master equation (CME). The CME is a system of differential equations that models the stochastic transient behaviour of biochemical reaction networks. It does so by describing the time evolution of the probability distribution over the states of a Markov chain that represents a biological network, so its stochasticity is only implicit. The transient solution of a CME is the vector of probabilities over the states of the corresponding Markov chain at a certain time point t. It has traditionally been obtained by applying methods that are general to continuous-time Markov chains: uniformization, Krylov subspace methods, and general ordinary differential equation (ODE) solvers such as the fourth-order Runge-Kutta method. Even though biochemical reaction networks are the main application of our work, some of our results are presented in the more general framework of propagation models (PM), a computational formalism that we introduce in the first part of this thesis. Each propagation model N has two associated propagation processes, one in discrete time and one in continuous time. These propagation processes propagate a generic mass through a discrete state space. For example, in order to model a CME, N propagates probability mass. In the discrete-time case the propagation is done step-wise, while in the continuous-time case it is done in a continuous flow defined by a differential equation. In the case of the chemical master equation, this differential equation is equivalent to the chemical master equation itself, with probability mass propagated through a discrete state space. Discrete-time propagation processes can encode methods such as the uniformization method and the fourth-order Runge-Kutta integration method mentioned above, and thus by optimizing propagation algorithms we optimize both of these methods simultaneously.
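To make the idea of step-wise propagation of probability mass concrete, the following is a minimal sketch of standard uniformization on a small example. It is not the propagation-model formalism of the thesis. The reaction network (a single species produced at constant rate `k_prod` and degraded at rate `k_deg * n`), the truncation of the state space at `n_max`, and all function names are assumptions made for illustration only.

```python
import math

def birth_death_rates(n_max, k_prod, k_deg):
    """Rates of a birth-death chain truncated to the states 0..n_max."""
    # Production is switched off in the last state so that no mass leaves the truncation.
    birth = [k_prod if n < n_max else 0.0 for n in range(n_max + 1)]
    death = [k_deg * n for n in range(n_max + 1)]
    return birth, death

def propagate_step(p, birth, death, lam):
    """One discrete-time step p <- p (I + Q / lam): move mass to neighbouring states."""
    q = [0.0] * len(p)
    for n, mass in enumerate(p):
        if mass == 0.0:
            continue
        up = birth[n] / lam
        down = death[n] / lam
        q[n] += mass * (1.0 - up - down)   # mass that stays in state n
        if up > 0.0:
            q[n + 1] += mass * up          # production: n -> n + 1
        if down > 0.0:
            q[n - 1] += mass * down        # degradation: n -> n - 1
    return q

def uniformization(p0, birth, death, t, tol=1e-10, max_steps=1_000_000):
    """Approximate the transient distribution p(t) as a Poisson-weighted sum of steps."""
    lam = max(b + d for b, d in zip(birth, death))  # uniformization rate
    if lam == 0.0 or t == 0.0:
        return list(p0)
    # Poisson weight for k = 0; this plain form underflows for very large lam * t.
    weight = math.exp(-lam * t)
    acc = weight
    v = list(p0)
    result = [weight * x for x in v]
    k = 0
    while acc < 1.0 - tol and k < max_steps:
        k += 1
        v = propagate_step(v, birth, death, lam)
        weight *= lam * t / k
        acc += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result
```

Starting from an empty system, for instance `uniformization([1.0] + [0.0] * 50, *birth_death_rates(50, 5.0, 1.0), 1.0)`, the result can be checked against the known transient solution of this particular chain, a Poisson distribution with mean `(k_prod / k_deg) * (1 - exp(-k_deg * t))`, as long as the truncation bound is far above that mean.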
In the second part of our thesis, we define stochastic hybrid models that approximate the stochastic behaviour of biochemical reaction networks by treating some variables of the system deterministically. This deterministic approximation is done for species with large populations, for which stochasticity does not play an important role. We propose three such hybrid models, which we introduce from the coarsest to the most refined: (i) the first one replaces some variables of the system with their overall expectations, (ii) the second one replaces some variables of the system with their expectations conditioned on the values of the stochastic variables, and (iii) the third one splits each variable into a stochastic part (for low valuations) and a deterministic part (for high valuations), while tracking the conditional expectation of the deterministic part. For each of these hybrid models we give the corresponding propagation models, which propagate not only probabilities but also the respective continuous approximations for the deterministic variables.
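As a rough illustration of the coarsest variant (i), the sketch below propagates a probability distribution for a low-copy species X together with a single number `y` that stands in for the overall expectation of a high-copy species Y. Everything specific here is an assumption made for the example rather than the construction used in the thesis: the two-species network, the repression term `k1 / (1 + y / K)`, the parameter names, the truncation of X, and the use of explicit Euler steps.

```python
import math

def hybrid_euler_step(p, y, dt, k1, g1, k2, g2, K):
    """Advance the pair (p, y) by one explicit Euler step of size dt.

    p[n] is the probability that X = n; y approximates the expectation of Y.
    """
    n_max = len(p) - 1
    birth = k1 / (1.0 + y / K)      # X production, repressed by the mean of Y
    dp = [0.0] * len(p)
    for n, mass in enumerate(p):
        if n < n_max:               # X -> X + 1 (blocked at the truncation bound)
            dp[n] -= birth * mass
            dp[n + 1] += birth * mass
        if n > 0:                   # X -> X - 1
            dp[n] -= g1 * n * mass
            dp[n - 1] += g1 * n * mass
    mean_x = sum(n * mass for n, mass in enumerate(p))
    dy = k2 * mean_x - g2 * y       # Y is produced by X and degrades linearly
    p_new = [mass + dt * d for mass, d in zip(p, dp)]
    return p_new, y + dt * dy

def simulate_hybrid(p0, y0, t_end, dt, **params):
    """Repeat Euler steps until t_end; dt must be small relative to the largest rate."""
    p, y = list(p0), y0
    for _ in range(int(round(t_end / dt))):
        p, y = hybrid_euler_step(p, y, dt, **params)
    return p, y
```

With `K=math.inf` the feedback from Y to X disappears, and the pair `(mean of X, y)` should settle near `(k1 / g1, k1 * k2 / (g1 * g2))`, which gives a simple sanity check. A finite `K` couples the two parts, and that is the case in which replacing Y by its expectation becomes a genuine approximation.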