I am interested in research on statistical and machine learning methods for practical applications. I am particularly interested in developing pipelines and frameworks for the end-to-end use of these methods, from data collection to method evaluation. My research has covered many applications, including LLMs for government applications, infectious diseases, viruses, 5G+ networks, deepfakes, and US census records.
As a Machine Learning Research Scientist and Team Lead in the AI Division at the Software Engineering Institute (SEI), I work on applied statistical and machine learning problems, with a focus on the development and assessment of Large Language Models (LLMs) and on AI assurance. I started at SEI in 2021 in CERT as a data scientist, where I worked on modeling deepfake detection problems and on using AI in 5G+ networks.
Previously, I was a post-doc at the National Institute of Allergy and Infectious Diseases with Dean Follmann in the Biostatistics Research Branch. I received my PhD in Statistics from the Department of Statistics & Data Science at Carnegie Mellon University in July 2019. My advisor was Bill Eddy, and my dissertation explored the statistical relationship between two classes of infectious disease models: compartment models (CMs) and agent-based models (AMs). We then applied that statistical relationship to analyze hypothetical scenarios in a hybrid model for measles and Ebola. You can find my dissertation here.
Dean Follmann and I have analyzed the transmission of tuberculosis among small clusters of people in the US using our R package InfectionTrees.
Ben LeRoy and I worked on EpiCompare.
“Exploring the nuances of R0: Eight estimates and application to 2009 pandemic influenza.” Gallagher, S., Chang, A., and Eddy, W.F. Submitted, March 2020. Pre-print available: https://arxiv.org/abs/2003.10442
“Evaluating the efficacy of AMA1-RON2, RH5, RIPR and CyRPA antibody combinations in inhibiting growth of P. falciparum.” Azasi, Y.†, Gallagher, S.†, [and 11 others including Fay, Michael P., Miura, K., and Miller, Louis H.] († co-first author). Submitted, Scientific Reports 2020.
"Time free analysis of epidemic models via improved confidence bands, ternary plot visualization, and individual structure." Gallagher, S. and LeRoy, B. P. In prep., 2021.
"Examining the spread of Tuberculosis via covariate-based branching processes." Gallagher, S. and Follmann, D. Submitted, 2020.
SPEW: Synthetic Populations and Ecosystems of the World. Gallagher, S., Richardson, L. F., Ventura, S.L., and Eddy, W.F. Journal of Computational and Graphical Statistics, 2018.
"Opening up the court (surface) in tennis grand slams". Gallagher, S., Frisoli, K., and Luby, A. In revision, 2020.
Comparing compartment and agent-based models (Proposal document; 10/2017). Advisor: Bill Eddy. Committee: Joel Greenhouse, Howard Seltman, and Sam Ventura.
Prediction Fever: Modeling Influenza with Regional Effects (ADA Final Report; 2/2016). Joint work with Ryan Tibshirani, Roni Rosenfeld, and Bill Eddy.
A brief survey of statistical models to analyze the transmission of infectious diseases. (Guest Lecture. George Washington University. 2/2020 Washington DC.).
Catalyst: agents of change. (Dissertation Defense. 7/2019 Pittsburgh, PA.).
Opening up the (court) surface in tennis grand slams (Carnegie Mellon University Sports Analytics Conference. 10/2018, Pittsburgh, PA). Honorable Mention Presentation. Joint work with Kayla Frisoli and Amanda Luby.
Catalyst: agents of change (Joint Statistical Meetings. 8/2018, Vancouver, BC). Advisor: Bill Eddy. Committee: Joel Greenhouse, Howard Seltman, and Sam Ventura.
Comparing compartment and agent-based models (Joint Statistical Meetings. 8/2017, Baltimore, MD). Advisor: Bill Eddy. Committee: Joel Greenhouse, Howard Seltman, and Sam Ventura.
SPEW: Synthetic Populations and Ecosystems of the World (International Conference on Synthetic Populations. 2/2017, Lucca, Italy). Invited presentation. Joint work with Lee Richardson, Samuel Ventura, and Bill Eddy.
Generating Synthetic Ecosystems: A Tutorial (International Conference on Synthetic Populations. 2/2017, Lucca, Italy). Invited presentation. Joint work with Lee Richardson, Samuel Ventura, and Bill Eddy.
Women in Statistics at Carnegie Mellon University (Women in Statistics and Data Science. 10/2016, Charlotte, NC). Joint work with Purvasha Chakravarti.
Statistical Modelling of Infectious Diseases: Influenza and the “Next Disease” (Women in Statistics and Data Science. 10/2016, Charlotte, NC). Joint work with Ryan Tibshirani, Roni Rosenfeld, Bill Eddy, Sam Ventura, and Lee Richardson.
Services for the MIDAS Network: Visualization and Synthetic Ecosystems (MIDAS, 5/2016, Washington D.C.). Joint work with Sam Ventura and Lee Richardson.
From Forecasting the Flu to Predicting the “Next” Disease (UP-Stat; 4/2016; Buffalo, NY). 2nd place student presentation. Joint work with Ryan Tibshirani, Roni Rosenfeld, Bill Eddy, Sam Ventura, and Lee Richardson.
Prediction Fever: Modeling Influenza with Regional Effects (ADA; 12/2016; Pittsburgh, PA). Joint work with Ryan Tibshirani, Roni Rosenfeld, and Bill Eddy.
As a part of the MIDAS Informatics Services Group, I am producing tools and analyzing data for infectious disease modeling. Specifically, we have made SPEW, an R package to create agents for use in agent-based models.
I am a member of the CMU Sports Analytics Graduate Student group. Most recently, my collaborators and I won an Honorable Mention for our submission to the CMSAC Reproducible Research Competition.
For my first- and second-year Advanced Data Analysis project, I predicted the incidence of the flu using an empirical Bayes model with regional dependencies as a part of CMU DELPHI.
As an undergraduate at CMU, I worked with the NSF Census Research Network on a project about record linkage of multiple databases.
EpiCompare. R package to aid in the comparison and analysis of infectious disease models via improved confidence bands, ternary plots, and individual-to-aggregate functions.
Loewe's Additivity (Shiny). Shiny app to help model, assess, and determine the additivity of two antibody pairs.
Loewe's Additivity (R). R package to help model, assess, and determine the additivity of two antibody pairs.
Catalyst: Compartment and agent-based models temporal analysis and testing (Catalyst). A software suite for easy analysis of compartment and agent-based models under varying assumptions of population heterogeneity, disease conditions, environmental conditions, and agent features.
spew: An R package for synthetic ecosystem generation. Lee Richardson, Shannon Gallagher, Sam Ventura, and Bill Eddy.
Check out our visualization app, SPEW View, which won first prize at the MIDAS Mission Public Health Hackathon in 2016.