In this thesis, we investigate the inverse problem of trees and barcodes from a combinatorial, geometric, probabilistic and statistical point of view.

Computing the persistent homology of a merge tree yields a barcode B. Reconstructing a tree from B involves gluing the branches back together. We are able to define combinatorial equivalence classes of merge trees and barcodes that allow us to completely solve this inverse problem. A barcode can be associated with an element in the symmetric group, and the number of trees with the same barcode, the tree realization number, depends only on the permutation type. We compare these combinatorial definitions of barcodes and trees to those of phylogenetic trees, thus describing the subtle differences between these spaces. The result is a clear combinatorial distinction between the phylogenetic tree space and the merge tree space.

The representation of a barcode by a permutation not only gives a formula for the tree realization number, but also opens the door to deeper connections between inverse problems in topological data analysis, group theory, and combinatorics.

Based on the combinatorial classes of barcodes, we construct a stratification of the barcode space. We define coordinates that partition the space of barcodes into regions indexed by the averages and the standard deviations of birth and death times and by the permutation type of a barcode. By associating to a barcode the coordinates of its region, we define a new invariant of barcodes.

These equivalence classes define a stratification of the space of barcodes with n bars where the strata are indexed by the symmetric group on n letters and its parabolic subgroups.

We study the realization numbers computed from barcodes with uniform permutation type (i.e., drawn from the uniform distribution on the symmetric group) and establish a fundamental null hypothesis for this invariant.
We show that the tree realization number can be used as a statistic to distinguish distributions of trees by comparing neuronal trees to random barcode distributions.
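
As a rough illustration of the counting idea behind the tree realization number, the following Python sketch counts the ways to glue bars back together. It rests on assumptions that the abstract does not state: bars are (birth, death) pairs with distinct endpoints, one bar contains all the others, and a bar may be glued onto any bar that is born earlier and dies later. The function names and the permutation convention (rank of each death once bars are sorted by birth) are hypothetical choices for this example, not necessarily those of the thesis.

```python
def permutation_type(bars):
    """Rank of each death time once bars are sorted by birth time.

    Assumed convention for illustration only; the thesis may use another.
    """
    by_birth = sorted(bars, key=lambda bar: bar[0])
    deaths_sorted = sorted(death for _, death in by_birth)
    return [deaths_sorted.index(death) for _, death in by_birth]


def tree_realization_number(bars):
    """Count the trees obtained by gluing every non-root bar onto a host bar.

    Assumed gluing rule: bar (b, d) can attach to bar (b2, d2)
    when b2 < b and d2 > d, i.e. the host is older and still alive.
    """
    root = min(bars, key=lambda bar: bar[0])  # earliest-born bar, assumed to contain all others
    count = 1
    for birth, death in bars:
        if (birth, death) == root:
            continue  # the root bar is not glued onto anything
        hosts = sum(1 for b2, d2 in bars if b2 < birth and d2 > death)
        count *= hosts  # choices for different bars are independent
    return count


nested = [(0, 10), (1, 5), (2, 4)]     # each bar contains the next
staggered = [(0, 10), (1, 3), (2, 5)]  # the two short bars are not nested

print(permutation_type(nested), tree_realization_number(nested))
print(permutation_type(staggered), tree_realization_number(staggered))
```

Under these assumptions the nested barcode admits two trees and the staggered one admits a single tree. The count uses only the relative order of births and deaths, not their values, which is consistent with the statement that the realization number depends only on the permutation type.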
Kathryn Hess Bellwald, Lida Kanari, Adélie Eliane Garin