A simple predictive biomarker for fatty liver disease is required for individuals with insulin resistance. Here, we developed a supervised machine learning-based classifier for fatty liver disease using fecal 16S rDNA sequencing data. Based on the Kangbuk Samsung Hospital cohort (n = 777), we generated a random forest classifier to predict fatty liver diseases in individuals with or without insulin resistance (n = 166 and n = 611, respectively). Model performance was evaluated using metrics including accuracy, area under the receiver operating characteristic curve (AUROC), kappa, and F1-score. The classifier for fatty liver diseases performed better in individuals with insulin resistance (AUROC = 0.77) than in those without. We further optimized the classifiers using a genetic algorithm. The optimized classifier for individuals with insulin resistance, consisting of ten microbial genera, achieved markedly better classification (AUROC = 0.93), whereas the optimized classifier for insulin-sensitive individuals failed to distinguish participants with fatty liver diseases from healthy participants. The classifier for individuals with insulin resistance was comparable or superior (accuracy = 0.83, kappa = 0.50, F1-score = 0.89) to previous methods for predicting fatty liver diseases, such as the fatty liver index. We identified the ten genera as a core set from the human gut microbiome, which could serve as a diagnostic biomarker of fatty liver diseases for insulin-resistant individuals. Collectively, these findings indicate that the machine learning classifier for fatty liver diseases in the presence of insulin resistance is comparable or superior to commonly used methods.
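To illustrate the evaluation described in the abstract, the following is a minimal sketch of training a random forest classifier and computing accuracy, AUROC, kappa, and F1-score with scikit-learn. It is not the authors' pipeline: the data are synthetic stand-ins for genus-level abundances, and the function name `evaluate_classifier`, the 70/30 split, the number of trees, and the feature count are all assumptions made for illustration. The genetic algorithm step is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    cohen_kappa_score,
    f1_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def evaluate_classifier(X, y, seed=0):
    """Fit a random forest on a stratified split and return four metrics.

    Hypothetical helper: split ratio and hyperparameters are assumptions,
    not values reported in the abstract.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)              # hard labels for accuracy, kappa, F1
    y_prob = model.predict_proba(X_test)[:, 1]  # positive-class scores for AUROC

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "auroc": roc_auc_score(y_test, y_prob),
        "kappa": cohen_kappa_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }


# Synthetic stand-in data (assumption): 166 samples, 30 "genera" features.
rng = np.random.default_rng(0)
n_samples, n_genera = 166, 30
X = rng.random((n_samples, n_genera))
# The label depends on the first two features plus noise, so there is signal to learn.
y = (X[:, 0] + X[:, 1] + 0.3 * rng.standard_normal(n_samples) > 1.0).astype(int)

metrics = evaluate_classifier(X, y)
print(metrics)
```

Kappa is reported alongside accuracy because it corrects for agreement expected by chance, which matters when the two classes are imbalanced.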
Kristina Schoonjans, Alessia Perino, Hadrien Charles Edouard Demagny