Analyzing Survey Data: Using Classification Results for Better Data Understanding
This research, supported by the Worcester State University STEM Summer research program, aimed to dissect and understand youth risk behaviors using survey data collected across multiple states and districts in the USA from 1991 to 2019. Because the complete dataset is very large, we narrowed our attention to high school students in the New England region, excluding Massachusetts because its data were unavailable.
Our first task was a comprehensive data cleaning and preprocessing phase, which was essential to ensure the accuracy and reliability of our findings. We then analyzed the dataset using visualization techniques, statistical analysis methods, and machine learning algorithms to uncover and interpret patterns of youth risk behavior.
Our analysis focused on potential inequalities in risk behaviors across racial and age groups. We were particularly interested in uncovering correlations between different patterns of risk behavior. To this end, we applied several classification methods, each with its own testing approach.
Although some models achieved lower accuracy, they were still effective in revealing significant patterns within the data. Our project illustrates the potential of survey data analysis in undergraduate research, especially within interdisciplinary fields such as Health, Sociology, and Criminal Justice. Such analyses can be scaled and integrated into data analysis curricula, encouraging students to learn and apply different data visualization techniques, statistical analysis methods, and machine learning approaches.
Research Area | Presenter | Title | Keywords |
---|---|---|---|
Probability, Statistics, and Machine Learning | Waghe, Shreyas | | Fair Machine Learning (0.896552), Machine Learning (1.0) |
Computer Science | Jung, Hayun | | Machine Learning |
Computer Science | Baron, Cameron Shea | | Machine Learning |
Artificial Intelligence | Landaverde, Yeilin M. | | Machine Learning |
Cancer Studies | Rasku-Casas, Isabella | | Machine Learning |