Publication: A DNA Coarse-Grain Rigid Base Model and Parameter Estimation from Molecular Dynamics Simulations

Abstract

Sequence-dependent mechanics of DNA is believed to play a central role in the functioning of the cell through the expression of genetic information. Nucleosome positioning, gene regulation, DNA looping and packaging within the cell are only some of the processes that are believed to be at least partially governed by mechanical laws. It is therefore important to understand how the sequence of DNA affects its mechanical properties. For exploring the mechanical properties of DNA, various discrete and continuum models have been, and continue to be, developed. A large family of these models, including the model considered in this work, assumes that bases or base pairs of DNA are rigid bodies. The most standard are rigid base pair models, with parameters obtained either directly from experimental data or from Molecular Dynamics (MD) simulations. The drawback of current experimental data, such as crystal structures, is that only small ensembles of configurations are available for a small number of sequences. In contrast, MD simulations allow a much more detailed view of a larger number of DNA sequences. However, the drawback is that the results of these simulations depend on the choice of the simulation protocol and force field parameters. MD simulations also have sequence length limitations and are currently too computationally intensive for (linear) molecules longer than a few tens of base pairs. The only way to simulate longer sequences is to construct a coarse-grain model.

The goal of this work is to construct a small parameter set that can model a sequence-dependent equilibrium probability distribution for rigid base configurations of a DNA oligomer with any given sequence of any length. The model parameter sets previously available were for rigid base pair models that ignore all couplings beyond nearest neighbour interactions. However, previous work showed that this standard model of rigid base pair nearest neighbour interactions is inconsistent with a (then) large-scale MD simulation of a single oligomer [36]. In contrast, we show here that a rigid base, nearest neighbour, dimer sequence-dependent model is quite a good fit to many MD simulations of different duration and sequence. In fact, a hierarchy of rigid base models with different interaction range and length of sequence dependence is discussed, and it is concluded that the nearest neighbour, dimer-based model is a good compromise between accuracy and complexity of the model. A full parameter set for this model is estimated. An interesting feature is that, despite the dimer dependence of the parameter set, the phenomenon of frustration means that our model predicts non-local changes in the oligomer shape as a function of local changes in the sequence, down to the level of a point mutation.

Related concepts (11)

Related publications (27)

Molecular dynamics

Molecular dynamics is a numerical simulation technique for modelling the evolution of a system of particles over time. It is used in particular in the sciences of...

Sequence (mathematics)

In mathematics, a sequence is a family of elements, called its "terms", indexed by the natural numbers. A finite sequence...

Simulation of phenomena

The simulation of phenomena is a tool used in research and development. It makes it possible to study how a system responds to different constraints in order to deduce the...

John Maddocks, Marco Pasi, Daiva Petkeviciute

cgDNA is a package for the prediction of sequence-dependent configuration-space free energies for B-form DNA at the coarse-grain level of rigid bases. For a fragment of any given length and sequence, cgDNA calculates the configuration of the associated free energy minimizer, i.e. the relative positions and orientations of each base, along with a stiffness matrix, which together govern differences in free energies. The model predicts non-local (i.e. beyond base-pair step) sequence dependence of the free energy minimizer. Configurations can be input or output either in the Curves+ definition of the usual helical DNA structural variables, or as a PDB file of coordinates of base atoms. We illustrate the cgDNA package by comparing predictions of free energy minimizers from (a) the cgDNA model, (b) time-averaged atomistic molecular dynamics (or MD) simulations, and (c) NMR or X-ray experimental observation, for (i) the Dickerson-Drew dodecamer and (ii) three oligomers containing A-tracts. The cgDNA predictions are rather close to those of the MD simulations, but many orders of magnitude faster to compute. Both the cgDNA and MD predictions are in reasonable agreement with the available experimental data. Our conclusion is that cgDNA can serve as a highly efficient tool for studying structural variations in B-form DNA over a wide range of sequences.
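How a free energy minimizer and a stiffness matrix together govern free energy differences can be illustrated with a minimal sketch. It assumes a Gaussian (quadratic) model in which the free energy of a configuration w relative to the minimizer w_hat is 0.5 (w - w_hat)^T K (w - w_hat). The function name and the toy numbers are hypothetical and are not taken from the cgDNA package.

```python
import numpy as np

def free_energy_difference(w, w_hat, K):
    """Quadratic free energy of configuration w relative to the minimizer w_hat.

    Assumption: U(w) = 0.5 * (w - w_hat)^T K (w - w_hat), with K a symmetric
    positive definite stiffness matrix (energy in units of kT).
    """
    d = np.asarray(w, dtype=float) - np.asarray(w_hat, dtype=float)
    return 0.5 * d @ K @ d

# Toy 3-coordinate example (illustrative numbers, not cgDNA parameters).
w_hat = np.array([0.0, 1.0, -0.5])
K = np.array([[2.0, 0.5, 0.0],
              [0.5, 3.0, 0.5],
              [0.0, 0.5, 2.0]])

# A configuration displaced slightly from the minimizer.
w = w_hat + np.array([0.1, 0.0, -0.1])
energy = free_energy_difference(w, w_hat, K)
print(f"free energy above the minimum: {energy:.4f} kT")
```

In this picture the minimizer fixes the most probable shape, while the stiffness matrix sets the energetic cost of deforming away from it.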

John Maddocks, Daiva Petkeviciute

A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
For example, due to the presence of frustration, the model can successfully predict the nonlocal changes in the minimum energy configuration of an oligomer that are consequent upon a local change of sequence at the level of a single point mutation. (C) 2013 American Institute of Physics. [http://dx.doi.org/10.1063/1.4789411]
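The link between frustration and non-local sequence effects can be sketched with a toy analogue. The sketch assumes one scalar coordinate per base (instead of rigid-body coordinates), a hypothetical 2x2 stiffness block and a preferred shape for each dimer, and a total energy that is the sum of the overlapping dimer energies. None of the numbers are parameters from the paper.

```python
import numpy as np

def assemble(blocks, shapes):
    """Sum overlapping 2x2 dimer blocks into a banded stiffness matrix K
    and a weighted shape vector sigma (toy model, one scalar per base)."""
    n = len(blocks) + 1
    K = np.zeros((n, n))
    sigma = np.zeros(n)
    for i, (Ki, wi) in enumerate(zip(blocks, shapes)):
        K[i:i + 2, i:i + 2] += Ki      # neighbouring blocks overlap on one base
        sigma[i:i + 2] += Ki @ wi
    return K, sigma

def minimizer(blocks, shapes):
    """Minimum of the summed quadratic energies: solve K w = sigma."""
    K, sigma = assemble(blocks, shapes)
    return np.linalg.solve(K, sigma)

# Hypothetical oligomer with 8 bases, hence 7 dimer steps.
block = np.array([[2.0, -1.0], [-1.0, 2.0]])
n_dimers = 7
blocks = [block.copy() for _ in range(n_dimers)]
shapes = [np.zeros(2) for _ in range(n_dimers)]
reference = minimizer(blocks, shapes)

# Local change: only the middle dimer gets a different preferred shape.
mutated = [s.copy() for s in shapes]
mutated[3] = np.array([1.0, -1.0])
perturbed = minimizer(blocks, mutated)

# The shift is non-zero at every base, including the ends, and the middle
# dimer does not reach its own preferred shape (1, -1): it is frustrated.
shift = perturbed - reference
print(np.round(shift, 4))
```

Because each base takes part in two overlapping dimer energies, no dimer can sit at its own preferred shape, and the compromise propagates along the chain with decaying amplitude.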

We introduce a sequence-dependent coarse-grain model of double-stranded DNA with an explicit description of both the bases and the phosphate groups as interacting rigid bodies. The model parameters are trained on extensive, state-of-the-art, large-scale molecular dynamics (MD) simulations. The model paradigm relies on three main approximations: 1) nucleic acid bases and phosphate groups are rigid, 2) interactions are nearest-neighbour and can be modelled with a quadratic energy, 3) model parameters have dimer sequence dependence.
For an arbitrary sequence, the model predicts a sequence-dependent Gaussian equilibrium probability distribution. The parameter set comprises dimer-based elements, which are used to reconstruct, for any sequence of any length, both the mean configurations, called ground-states, which can have strong non-local sequence dependence, and the precision matrices, or stiffness matrices. This prediction step is sufficiently efficient that it is straightforward to construct probability density functions for millions of fragments, each a few hundred base-pairs long. The estimation of a parameter set consists in minimising the sum of Kullback-Leibler divergences between Gaussians predicted by the model and analogous Gaussians estimated directly from MD simulations of a training library of sequences. The training library comprises a short list of short palindromic DNA sequences. We designed the palindromic library using an ad hoc algorithm to include multiple instances of all independent tetramer sub-sequences. We exploit palindromic symmetry properties to study the convergence of the statistics extracted from MD simulations of palindromes and to define palindromically symmetrised estimators of first and second centred moments. The computation of the parameter set is delicate and requires sophisticated numerics. We present an efficient and reliable procedure for estimating a complete parameter set, which involves a generalisation of the classic Fisher information matrix and its relationship to the relative entropy, or Kullback-Leibler divergence. The model is a computationally efficient tool that allows the study of the mechanical properties of double-stranded DNA of arbitrary length and sequence. We use the model to study the sequence-dependent rigidity of DNA, and we compute sequence-dependent apparent and dynamic persistence lengths. The explicit treatment of the phosphate group also allows computation of sequence-dependent groove widths. Moreover, with a fine-grained representation of predicted ground-states, we can also study the sequence dependence of sugar puckering modes and BI-BII backbone conformations.
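The fitting objective described above, a sum of Kullback-Leibler divergences between Gaussians, can be sketched with the standard closed-form expression for two multivariate Gaussians written in terms of precision (stiffness) matrices. The function names, the direction of the divergence (model relative to MD), and the toy numbers are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def gaussian_kl(mu0, K0, mu1, K1):
    """KL( N(mu0, K0^-1) || N(mu1, K1^-1) ) for precision matrices K0, K1."""
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    k = mu0.size
    d = mu1 - mu0
    # tr(Sigma1^-1 Sigma0) = tr(K1 K0^-1) = tr(K0^-1 K1)
    trace_term = np.trace(np.linalg.solve(K0, K1))
    mahalanobis = d @ K1 @ d
    _, logdet0 = np.linalg.slogdet(K0)
    _, logdet1 = np.linalg.slogdet(K1)
    # ln(det Sigma1 / det Sigma0) = ln det K0 - ln det K1
    return 0.5 * (trace_term + mahalanobis - k + logdet0 - logdet1)

def training_objective(pairs):
    """Sum of divergences over a training library.

    `pairs` is a list of ((mu_model, K_model), (mu_md, K_md)) tuples,
    one per training sequence (hypothetical data layout).
    """
    return sum(gaussian_kl(mu_m, K_m, mu_s, K_s)
               for (mu_m, K_m), (mu_s, K_s) in pairs)

# Toy check with two 2-dimensional Gaussians (illustrative numbers only).
mu_model, K_model = np.array([1.0, 0.0]), np.eye(2)
mu_md, K_md = np.array([0.0, 0.0]), np.eye(2)
kl = gaussian_kl(mu_model, K_model, mu_md, K_md)
total = training_objective([((mu_model, K_model), (mu_md, K_md))] * 3)
print(kl, total)
```

The divergence is zero only when the two Gaussians coincide and is not symmetric in its arguments, so the choice of direction matters for a real fit.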