One of the objectives of pharmacometric (PMX) population modeling is to identify significant and clinically relevant relationships between parameters and covariates. Here, we demonstrate how this complex selection task could benefit from supervised learning algorithms that use importance scores. We compare various classical methods with three machine learning (ML) methods applied to NONMEM empirical Bayes estimates: random forest, neural networks (NNs), and support vector regression (SVR). The performance of the ML models is assessed using receiver operating characteristic (ROC) curves. The F1 score, the harmonic mean of precision and recall, is used to compare the ML and PMX approaches. The methods are applied to different scenarios of covariate influence based on simulated pharmacokinetics data. ML achieved similar or better F1 scores than stepwise covariate modeling (SCM) and conditional sampling for stepwise approach based on correlation tests (COSSAC). Neither correlations between covariates nor the number of false covariates affects the performance of any method, but effect size does. The methods are not equivalent in computational speed: SCM is 30 times slower than NN and 100 times slower than SVR. The results are validated in an additional scenario involving 100 covariates. Taken together, the results indicate that ML methods can greatly increase the efficiency of population covariate model building for large datasets or complex models that require long run times. ML can provide a fast initial covariate screening, which can then be followed by more conventional PMX approaches to assess the clinical relevance of the selected covariates and build the final model.
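As an illustration of how an F1 score can compare covariate selection methods on simulated data, where the truly influential covariates are known, the following is a minimal sketch. The function name `f1_score` and the covariate names (`WT`, `AGE`, `CRCL`, `SEX`) are hypothetical and are not taken from the study. The abstract does not describe how each method turns its output into a list of selected covariates, so the sketch takes that list as given.

```python
def f1_score(true_covariates, selected_covariates):
    """F1 score of a covariate selection against the known simulated truth."""
    true_set = set(true_covariates)
    selected_set = set(selected_covariates)

    tp = len(true_set & selected_set)   # true covariates correctly selected
    fp = len(selected_set - true_set)   # false covariates wrongly selected
    fn = len(true_set - selected_set)   # true covariates that were missed

    if tp == 0:
        # No true covariate was selected, so precision and recall are both 0.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical scenario: three covariates truly influence the parameter,
# and a selection method returns two of them plus one false covariate.
truth = ["WT", "AGE", "CRCL"]
selected = ["WT", "CRCL", "SEX"]
score = f1_score(truth, selected)
print(f"F1 = {score:.3f}")
```

Because the score depends only on which covariates end up selected, it can be computed the same way for an ML importance ranking and for a stepwise PMX procedure, which is what makes it usable as a common yardstick.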
Florent Gérard Krzakala, Lenka Zdeborová, Hugo Chao Cui