
Publication: From trees to barcodes and back again II: Combinatorial and probabilistic aspects of a topological inverse problem

Abstract

In this paper we consider two aspects of the inverse problem of how to construct merge trees realizing a given barcode. Much of our investigation exploits a recently discovered connection between the symmetric group and barcodes in general position, based on the simple observation that death order is a permutation of birth order. We show how to lift this combinatorial characterization of barcodes to an analogous combinatorialization of merge trees. As a result of this study, we provide the first clear combinatorial distinction between the space of phylogenetic trees (as defined by Billera, Holmes and Vogtmann) and the space of merge trees: generic phylogenetic trees on n+1 leaf nodes fall into (2n-1)!! distinct equivalence classes, but the analogous number for merge trees is equal to the number of maximal chains in the lattice of partitions, i.e., (n+1)! n! / 2^n. The second aspect of our study is the derivation of precise formulas for the distribution of tree realization numbers (the number of merge trees realizing a given barcode) when we assume that barcodes are sampled using a uniform distribution on the symmetric group. We are able to characterize some of the higher moments of this distribution, thanks in part to a reformulation of our distribution in terms of Dirichlet convolution. This characterization provides a type of null hypothesis, apparently different from the distributions observed in real neuron data, which opens the door to doing more precise statistics and science.
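The gap between the two counts in the abstract can be tabulated directly. The sketch below assumes the standard closed forms: (2n-1)!! for rooted binary trees on n+1 labeled leaves, and (n+1)! n! / 2^n for maximal chains in the partition lattice of an (n+1)-element set. The function names are illustrative and do not come from the paper.

```python
from math import factorial


def double_factorial_odd(n: int) -> int:
    """Return (2n-1)!! = 1 * 3 * 5 * ... * (2n-1)."""
    result = 1
    for k in range(1, 2 * n, 2):
        result *= k
    return result


def maximal_chains_partition_lattice(n: int) -> int:
    """Return (n+1)! * n! / 2**n, which is always an integer."""
    return factorial(n + 1) * factorial(n) // 2 ** n


# Compare the two counts for trees with n+1 leaves.
for n in range(1, 6):
    print(n + 1, double_factorial_odd(n), maximal_chains_partition_lattice(n))
```

The two counts agree for two and three leaves (1 and 3) and first differ at four leaves, where the values are 15 and 18.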


Related publications (47)

Related concepts (46)

Related MOOCs (2)

Phylogenetic tree

A phylogenetic tree (also phylogeny or evolutionary tree) is a branching diagram or a tree showing the evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical or genetic characteristics. All life on Earth is part of a single phylogenetic tree, indicating common ancestry. In a rooted phylogenetic tree, each node with descendants represents the inferred most recent common ancestor of those descendants, and the edge lengths in some trees may be interpreted as time estimates.

Continuous uniform distribution

In probability theory and statistics, the continuous uniform distributions or rectangular distributions are a family of symmetric probability distributions. Such a distribution describes an experiment where there is an arbitrary outcome that lies between certain bounds. The bounds are defined by the parameters a and b, which are the minimum and maximum values. The interval can either be closed (i.e. [a, b]) or open (i.e. (a, b)). Therefore, the distribution is often abbreviated U(a, b), where U stands for uniform distribution.

Moment (mathematics)

In mathematics, the moments of a function are certain quantitative measures related to the shape of the function's graph. If the function represents mass density, then the zeroth moment is the total mass, the first moment (normalized by total mass) is the center of mass, and the second moment is the moment of inertia. If the function is a probability distribution, then the first moment is the expected value, the second central moment is the variance, the third standardized moment is the skewness, and the fourth standardized moment is the kurtosis.
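As a small illustration of the four moments named above, the following sketch computes them for a finite sample, using population (divide by N) conventions. The function name and the sample data are assumptions made for this example only.

```python
def sample_moments(samples):
    """Return (mean, variance, skewness, kurtosis) using population formulas."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5   # third standardized moment
    kurtosis = m4 / m2 ** 2     # fourth standardized moment (not excess)
    return mean, m2, skewness, kurtosis


mean, variance, skewness, kurtosis = sample_moments([1, 2, 3, 4, 5])
print(mean, variance, skewness, kurtosis)
```

For a symmetric sample such as this one the skewness is zero. The kurtosis returned here is the raw fourth standardized moment, so a normal distribution would give a value near 3.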

The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.


Frédéric Courbin, Gianluca Castignani, Jean-Luc Starck, Austin Chandler Peel, Maurizio Martinelli, Yi Wang, Richard Massey, Fabio Finelli, Marcello Farina

Recent cosmic shear studies have shown that higher-order statistics (HOS) developed by independent teams now outperform standard two-point estimators in terms of statistical precision thanks to their sensitivity to the non-Gaussian features of large-scale ...

As large, data-driven artificial intelligence models become ubiquitous, guaranteeing high data quality is imperative for constructing models. Crowdsourcing, community sensing, and data filtering have long been the standard approaches to guaranteeing or imp ...

Anne-Florence Raphaëlle Bitbol, Nicola Dietler, Umberto Lupo

Local and global inference methods have been developed to infer structural contacts from multiple sequence alignments of homologous proteins. They rely on correlations in amino acid usage at contacting sites. Because homologous proteins share a common ance ...