Research

Introduction

I’m a statistics student at California Polytechnic State University, San Luis Obispo. I’m graduating a year early, in 2026, with a Cross Disciplinary Studies Minor in Data Science & a Minor in Mathematics. The Data Science CDSM is ~80 units so it’s a major “minor”. I’ve received a major research scholarship from the college and am basically fully funded till graduation. It’s nice since it allows me to extensively conduct research. I’m one of three freshman who have been accepted in the history of the scholarship. As a result, I work extensively with Dr. Chance on various projects. I’ve also received a some other small funding/awards/honors (e.g. small scholarships, funding for travel, President’s list, etc.)

Some of my favorite (current or completed) classes: Applied Stochastic Processes, Independent Study: Measure Theoretic Probability, Linear Algebra II/III, Independent Study: Advanced R, Object Oriented Programming, Advanced Statistical Computing in R.

I’m more interested in research though.

Work

Brief descriptions of some of the work I’m doing right now, and some old work. Not exhaustive. Not ordered.

I’m looking at how skewness affects the sample size requirement of the Central Limit Theorem. We’ve got a paper in review–you can find a preprint here. This is the first paper I’ve written, it was a blast to write. Used large scale simulation also here; I learnt more about how to use Julia. If I was rewriting the code, I might’ve converted to pure Julia and tried TidierPlots.jl. See this work at JSM 2025.

A group of educator across the country and I are looking at students’ curiosity in introductory stats classes. Dr Chance and I are also specifically looking at the effects of teaching modelling in an introductory stats class. Lots of cool things we’re doing, but all under wraps for now! See our work at USCOTS 2025 and JSM 2025.

I’m working on a Quarto extension to allow educators to highlight parts of code blocks. The idea is a continuation of Dr. Bodwin’s R package flair, but the implementation requires a full rewrite. We’re doing cool stuff in Lua and having to deal with Pandoc/Quarto and specifically Quarto’s rendering process. I’m learning more about how Quarto docs get compiled and its very exciting. Hope to publish more information about this soon! See this work at useR 2025.

I’m also looking at how Affinity Propagation (AP), a clustering algorithm, behaves under various circumstances. We’re evaluating AP by fitting it to various synthetic datasets and varying the data generation as well as AP’s parameters. We’re using Monte Carlo subsampling to evaluate clustering performance. Data generation is done at scale using Stanford’s Sherlock cluster through an automated pipeline. Paper in the works.

I’m in a lab where we’re analyzing mutations in non-coding areas of the genome to understand how they affect gene expression and are related to disease. This is another time I leveraged Sherlock to do large scale analysis. HPC is very cool, and I’ve learnt a lot using Sherlock and Slurm.

I was previously in a group broadly looking at uses of AI to bolster Search and Rescue operations. I cleaned some gnarly data, then moved on to setup a framework for generated data validation (there’s a lot more to say here.) We presented our work at two search and rescue conferences. I swapped groups to focus on interpretable machine learning where I worked with a student on their Master’s thesis, developing methods to understand black box models.

Reach out for more details.

Reply to this post by email