TY - CHAP
T1 - Analysis and predictive modeling of asthma phenotypes
AU - Brasier, Allan R.
AU - Ju, Hyunsu
PY - 2014
Y1 - 2014
AB - Molecular classification using robust biochemical measurements provides a level of diagnostic precision that is unattainable using indirect phenotypic measurements. Multidimensional measurements of proteins, genes, or metabolites (analytes) can identify subtle differences in the pathophysiology of patients with asthma that physiological or clinical assessments cannot detect. We give an overview of a method that uses biochemical analyte measurements to generate predictive models of discrete (categorical) clinical outcomes, a process referred to as "supervised classification." We consider problems inherent in wide (small n, large p) high-dimensional data, including the curse of dimensionality, collinearity, and lack of information content. We suggest methods for reducing the data to the most informative features. We describe different approaches to phenotypic modeling: logistic regression, classification and regression trees, random forest, and nonparametric regression spline modeling. We provide guidance on post hoc model evaluation and on methods for evaluating model performance using ROC curves and generalized additive models. The application of validated predictive models for outcome prediction will significantly impact the clinical management of asthma.
KW - False discovery rate
KW - Feature reduction
KW - Generalized additive models (GAMs)
KW - Logistic regression
KW - Multivariate adaptive regression splines (MARS)
KW - Multivariate analysis
KW - Random forest
KW - Receiver operating characteristic (ROC) curve
KW - Significance analysis of microarrays (SAM)
KW - Supervised learning
UR - http://www.scopus.com/inward/record.url?scp=84892978633&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84892978633&partnerID=8YFLogxK
U2 - 10.1007/978-1-4614-8603-9_17
DO - 10.1007/978-1-4614-8603-9_17
M3 - Chapter
C2 - 24162915
SN - 9781461486022
VL - 795
T3 - Advances in Experimental Medicine and Biology
SP - 273
EP - 288
BT - Advances in Experimental Medicine and Biology
ER -