Machine learning for the study of Parkinson’s Disease diagnosis and associated mechanisms.
The biological mechanisms underlying the origin and progression of Parkinson’s Disease (PD)
are still unclear, and its diagnosis remains a challenge even after the onset of
motor symptoms. In this context, we aimed to identify molecular and genetic fingerprints associated
with motor-status PD from the analysis and modelling of blood-based omics data.
We applied several machine learning classification methods to investigate the predictive potential of
higher-order functional representations of omics data from PD case/control studies. These
higher-order functional representations were generated by summarizing omics abundance information
into global cellular pathways, cell compartments and protein complexes via aggregation statistics and
dimension-reduction deregulation scores. The performance and most relevant predictive features of
these models were compared with those of models built on individual features. The diagnostic models
built from individual metabolomics features and from pathway deregulation scores achieved
significant area under the receiver operating characteristic curve (AUC) scores in both
cross-validation and external testing. Furthermore, we identified plausible biological pathways
associated with PD diagnosis, such as xanthine metabolism.
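As an illustration only, the pathway-level summarization and AUC evaluation described above can be sketched as follows. This is a minimal example under stated assumptions, not the pipeline used in the study: the function names `aggregate_by_pathway` and `roc_auc`, the feature and pathway identifiers, and the toy abundance matrix are all hypothetical, the mean stands in for whichever aggregation statistic is chosen, and the dimension-reduction deregulation scores are not reproduced here.

```python
import numpy as np


def aggregate_by_pathway(X, feature_names, pathways, stat=np.mean):
    """Collapse a samples x features matrix into samples x pathways scores.

    `pathways` maps a pathway name to the list of its member features.
    Members absent from `feature_names` are ignored, and pathways with
    no measured member are skipped.
    """
    index = {name: i for i, name in enumerate(feature_names)}
    names, columns = [], []
    for pathway, members in pathways.items():
        cols = [index[m] for m in members if m in index]
        if not cols:
            continue
        names.append(pathway)
        columns.append(stat(X[:, cols], axis=1))
    return names, np.column_stack(columns)


def roc_auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    # Tied scores share the average of their ranks.
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy data: 4 samples x 3 features, two controls then two cases.
X = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 2.0, 4.0],
    [4.0, 6.0, 1.0],
    [5.0, 7.0, 2.0],
])
feature_names = ["m1", "m2", "m3"]
pathways = {"pathway_A": ["m1", "m2"], "pathway_B": ["m3"]}
y = [0, 0, 1, 1]

names, pathway_scores = aggregate_by_pathway(X, feature_names, pathways)
for name, column in zip(names, pathway_scores.T):
    print(name, roc_auc(y, column))
```

In a real analysis the pathway scores would be passed to a classifier and the AUC estimated under cross-validation and on an external cohort; here each pathway score is ranked directly against the labels only to keep the sketch short.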
We have successfully built machine learning models at the global cellular level and the
single-feature level to study blood-based omics data for PD diagnosis. Our results reveal plausible
biological pathway associations and show that metabolomics pathway deregulation scores can serve as
robust and biologically interpretable predictors for PD.
Topic: Neurodegenerative disease, computational biology.
Keywords: Machine learning, Parkinson’s Disease, pathways, omics, metabolomics, xanthine