Publication

Statistical inference in ensemble modeling of cellular metabolism

Abstract

Kinetic models of metabolism can be constructed to predict cellular regulation and devise metabolic engineering strategies, and various promising computational workflows have been developed for this purpose in recent years. Because the kinetic parameter values required to build such models are uncertain, these workflows rely on ensemble modeling (EM) principles to sample and build populations of models that describe observed physiologies. Sensitivity coefficients from metabolic control analysis (MCA) of kinetic models can provide important insight into cellular control around a given physiological steady state. However, although current approaches consider populations of kinetic models and their outputs, they do not provide adequate tools for statistical inference. To draw conclusions from model outputs such as MCA sensitivity coefficients, it is necessary to rank and compare populations of variables with each other. Existing workflows use confidence intervals (CIs) that are derived independently for each variable being compared. Such univariate CIs do not account for the fact that several variables are compared at once, so simultaneous CIs should be derived for the variables we wish to rank or compare. Herein, we use an existing large-scale kinetic model of Escherichia coli metabolism to show how univariate CIs can lead to incorrect conclusions, and we present a new workflow that applies three different multivariate statistical approaches. We use the Bonferroni and the exact normal methods to build symmetric CIs under the normality assumption. We then show how bootstrapping can compute asymmetric CIs while relaxing this assumption. We conclude that the Bonferroni and the exact normal methods provide simple and efficient ways of constructing reliable CIs, with the exact normal method favored over the Bonferroni method when the compared variables are dependent. Bootstrapping, despite its significantly higher computational cost, is recommended when comparing non-normally distributed variables. Additionally, we show how the Bonferroni method can readily be used to estimate the number of samples required to attain a given CI size.
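As a rough illustration of the Bonferroni idea mentioned in the abstract, the sketch below builds univariate and Bonferroni-adjusted CIs for the means of several variables and estimates a sample size for a target CI half-width. It is not the authors' workflow: the function names, the made-up data, the use of a normal quantile in place of a t quantile, and the assumption of a known standard deviation in the sample-size formula are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def normal_ci(samples, alpha):
    """Symmetric normal-approximation CI for the mean of one variable."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    centre = mean(samples)
    return centre - half_width, centre + half_width


def bonferroni_cis(columns, alpha=0.05):
    """Simultaneous CIs for p variables: each interval uses level alpha / p."""
    p = len(columns)
    return [normal_ci(col, alpha / p) for col in columns]


def bonferroni_sample_size(sigma, half_width, p, alpha=0.05):
    """Smallest n with z_{1 - alpha/(2p)} * sigma / sqrt(n) <= half_width.

    Assumes sigma is known (or well estimated from a pilot ensemble).
    """
    z = NormalDist().inv_cdf(1 - alpha / (2 * p))
    return math.ceil((z * sigma / half_width) ** 2)


# Hypothetical ensemble: 500 sampled models, 4 made-up coefficients.
random.seed(0)
true_means = [0.10, 0.12, 0.30, -0.05]
columns = [[random.gauss(mu, 0.5) for _ in range(500)] for mu in true_means]

univariate = [normal_ci(col, 0.05) for col in columns]
simultaneous = bonferroni_cis(columns, 0.05)
for (u_lo, u_hi), (s_lo, s_hi) in zip(univariate, simultaneous):
    print(f"univariate [{u_lo:.3f}, {u_hi:.3f}]  Bonferroni [{s_lo:.3f}, {s_hi:.3f}]")

# Samples needed so that all 4 simultaneous CIs have half-width <= 0.05.
print(bonferroni_sample_size(sigma=0.5, half_width=0.05, p=4))
```

The Bonferroni intervals are always wider than the univariate ones at the same nominal level, which is the price of a joint coverage guarantee that holds regardless of dependence between the variables.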

Related concepts (37)
Statistical assumption
Statistics, like all mathematical disciplines, does not infer valid conclusions from nothing. Inferring interesting conclusions about real statistical populations almost always requires some background assumptions. Those assumptions must be made carefully, because incorrect assumptions can generate wildly inaccurate conclusions. Here are some examples of statistical assumptions: Independence of observations from each other (this assumption is an especially common error). Independence of observational error from potential confounding effects.
Normal distribution
In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$. The parameter $\mu$ is the mean or expectation of the distribution (and also its median and mode), while the parameter $\sigma$ is its standard deviation. The variance of the distribution is $\sigma^{2}$. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.
Confidence interval
In frequentist statistics, a confidence interval (CI) is a range of estimates for an unknown parameter. A confidence interval is computed at a designated confidence level; the 95% confidence level is most common, but other levels, such as 90% or 99%, are sometimes used. The confidence level, degree of confidence or confidence coefficient represents the long-run proportion of CIs (at the given confidence level) that theoretically contain the true value of the parameter; this is tantamount to the nominal coverage probability.
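The long-run coverage interpretation can be checked by simulation. The sketch below is a toy setup, not taken from any of the listed publications: it assumes five independent variables with true mean 0 and a known standard deviation of 1, builds a 95% CI for each in every trial, and counts how often each interval, and all five at once, contain the true mean. The function name and all parameter values are illustrative.

```python
import math
import random
from statistics import NormalDist


def coverage(n_vars=5, n_obs=30, n_trials=2000, alpha=0.05, seed=1):
    """Estimate individual and joint coverage of independent known-sigma CIs."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z / math.sqrt(n_obs)  # sigma = 1 is assumed known
    hits = [0] * n_vars
    joint_hits = 0
    for _ in range(n_trials):
        all_covered = True
        for j in range(n_vars):
            xbar = sum(rng.gauss(0.0, 1.0) for _ in range(n_obs)) / n_obs
            covered = abs(xbar) <= half_width  # the true mean is 0
            hits[j] += covered
            all_covered = all_covered and covered
        joint_hits += all_covered
    return [h / n_trials for h in hits], joint_hits / n_trials


individual, joint = coverage()
print("individual coverage:", individual)
print("joint coverage:", joint)
```

Each interval should cover its own true mean in roughly 95% of trials, while for independent variables the probability that all five cover simultaneously is about 0.95^5 ≈ 0.77. This gap is the motivation for simultaneous CIs when several variables are compared together.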
Related publications (52)

Higher Order Asymptotics: Applications to Satellite Conjunction and Boundary Problems

Soumaya Elkantassi

Higher-order asymptotics provide accurate approximations for use in parametric statistical modelling. In this thesis, we investigate using higher-order approximations in two-specific settings, with a particular emphasis on the tangent exponential model. Th ...
EPFL2023

A note on universal inference

Anthony Christopher Davison, Timmy Rong Tian Tse

Universal inference enables the construction of confidence intervals and tests without regularity conditions by splitting the data into two parts and appealing to Markov's inequality. Previous investigations have shown that the cost of this generality is a ...
WILEY2022

Is There a Cap on Longevity? A Statistical Review

Anthony Christopher Davison

There is sustained and widespread interest in understanding the limit, if there is any, to the human life span. Apart from its intrinsic and biological interest, changes in survival in old age have implications for the sustainability of social security sys ...
ANNUAL REVIEWS2022
Related MOOCs (11)
Advanced statistical physics
We explore statistical physics in both classical and open quantum systems. Additionally, we will cover probabilistic data analysis that is extremely useful in many applications.
Water quality and the biogeochemical engine
Learn about how the quality of water is a direct result of complex bio-geo-chemical interactions, and about how to use these processes to mitigate water quality issues.