Deep Dive Into Gaussian Processes and Variable Importance

Presenter
Gabrielle Marie Walczak
Campus
UMass Amherst
Sponsor
Maryclare Griffin, Department of Mathematics and Statistics, UMass Amherst
Schedule
Session 2, 11:30 AM - 12:15 PM
Location
Poster Board A43, Campus Center Auditorium, Row 3 (A41-A60)
Abstract

When telling stories with data, regression models help explain trends and predict results for new observations. Linear regression models are often used to describe relationships between outcome and predictor variables because they offer well-defined methods for measuring a feature variable’s importance and the accuracy of the model as a whole. However, the story a linear model tells may be inaccurate when the outcome has nonlinear relationships with its predictor variables. A nonlinear regression model may be more appropriate and precise in such cases, yet these models lack easily defined metrics of feature variable importance. In this paper, I will apply a recently developed variable importance operator that measures a variable’s importance on a local level, for each observation, and on a global level, for a population. I will compare the accuracy of simple linear regression, traditional one-layer Gaussian processes, and two-layer Gaussian processes. Using each of the three models, I will measure variable importance in the context of relating the expression of different genes in mice (predictors) to the mice’s observed traits (outcomes). The results of this study will provide a new perspective on how to eliminate the trade-off between model accuracy and interpretability when measuring variable importance. This has the potential to improve data science and its applications in biomedicine, whether the question is a gene expression’s effect on disease progression across a whole population or for individuals within specific subpopulations.
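
The contrast the abstract draws between linear and nonlinear models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the mouse gene expression data), the Gaussian process uses a squared-exponential kernel with fixed, hand-picked hyperparameters, and the importance score is ordinary permutation importance, a generic stand-in that is not the variable importance operator applied in this work. The function names (`fit_linear`, `fit_gp`, `permutation_importance`) are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns a predict function."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda X_new: np.column_stack([np.ones(len(X_new)), X_new]) @ coef

def fit_gp(X, y, length_scale=1.0, noise=0.1):
    """One-layer GP posterior mean with fixed (assumed) hyperparameters."""
    K = rbf_kernel(X, X, length_scale) + noise**2 * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return lambda X_new: rbf_kernel(X_new, X, length_scale) @ alpha

def permutation_importance(predict, X, y, rng):
    """Increase in mean squared error when one predictor column is shuffled."""
    base = np.mean((predict(X) - y) ** 2)
    scores = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        scores.append(np.mean((predict(X_perm) - y) ** 2) - base)
    return np.array(scores)

# Synthetic data: the outcome depends nonlinearly on predictor 0 only;
# predictor 1 is irrelevant by construction.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 2))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.normal(size=200)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

linear = fit_linear(X_train, y_train)
gp = fit_gp(X_train, y_train)

mse_linear = np.mean((linear(X_test) - y_test) ** 2)
mse_gp = np.mean((gp(X_test) - y_test) ** 2)
gp_importance = permutation_importance(gp, X_test, y_test, rng)

print(f"test MSE, linear regression: {mse_linear:.3f}")
print(f"test MSE, Gaussian process:  {mse_gp:.3f}")
print("GP permutation importance per predictor:", gp_importance)
```

Permutation importance yields only a single global score per predictor. The operator described in the abstract also measures importance locally, for each observation, which this sketch does not attempt to reproduce.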


Keywords
Variable Importance, Regression, Nonlinear models, Gaussian Process
Research Area
Mathematics and Statistics
