Research
Investigating the Foundations of Learning Algorithms
My research is centered on understanding the fundamental limits and capabilities of machine learning algorithms.
Core Research Topics
Central Limit Theorem for Stochastic Optimization
Stochastic optimization is a cornerstone of modern machine learning. Understanding its convergence behavior is key to both theory and practice. My research proves non-asymptotic Central Limit Theorems (CLTs) that describe how the output of stochastic algorithms fluctuates around the optimum. These results enable principled approaches to hyperparameter tuning and uncertainty quantification in learning systems.
Relevant Publications
This paper establishes the first non-asymptotic CLT for Polyak-Ruppert averaging in two-time-scale stochastic approximation, providing precise error characterizations.
Exact Error Exponents for Nonlinear Stochastic Approximation
In Progress
Applications
- Stochastic Gradient Descent (SGD)
Stochastic Gradient Descent (SGD) is a widely used algorithm for training machine learning models. Its iterative, noise-driven updates lead to sample trajectories that fluctuate as they descend the loss landscape. These trajectories may converge to the optimal solution at varying rates, depending on initialization, noise, and hyperparameters. Non-asymptotic Central Limit Theorems provide a probabilistic description of these fluctuations, quantifying the likelihood of deviations from the optimum within finite time.
Figure 1: Sample trajectory of SGD on a loss landscape. Randomness in the initialization and gradient estimate influences convergence behavior. The spread of such trajectories is characterized by a non-asymptotic Central Limit Theorem, which quantifies their probabilistic fluctuations around the optimal solution.
- Reinforcement Learning
In reinforcement learning, algorithms like Temporal Difference (TD) learning estimate value functions from noisy, sequential data. When combined with function approximation, these updates form a stochastic optimization process whose convergence behavior is often hard to analyze. The non-asymptotic Central Limit Theorem offers a way to understand the distribution of TD iterates around the true value function, enabling finite-sample guarantees and uncertainty-aware learning in complex environments.
Figure 2: (Left) A simple grid world environment. An agent learns the value function by traversing paths under stochastic transitions. (Right) Histogram of the error in the estimated value function. The non-asymptotic CLT captures the shape and spread of this distribution, enabling quantitative analysis of learning variability and confidence in the estimates.
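The fluctuation behavior described above can be observed in a toy experiment. The sketch below (my own illustration, not code from the publications) runs Polyak-Ruppert averaged SGD many times on a one-dimensional quadratic loss with Gaussian gradient noise; the problem, step-size schedule, and noise level are all assumptions chosen for simplicity. The spread of the averaged-iterate errors across runs is exactly the kind of distribution a non-asymptotic CLT characterizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_error(n_steps=5000, theta_star=2.0, noise_std=1.0):
    """Run averaged SGD on f(theta) = (theta - theta_star)^2 / 2
    with noisy gradients; return the averaged-iterate error."""
    theta = 0.0
    avg = 0.0
    for t in range(1, n_steps + 1):
        grad = (theta - theta_star) + noise_std * rng.standard_normal()
        theta -= grad / t**0.75           # polynomially decaying step size
        avg += (theta - avg) / t          # Polyak-Ruppert running average
    return avg - theta_star

# Repeat the experiment: the rescaled errors form an approximately
# Gaussian cloud around zero, whose spread a non-asymptotic CLT bounds.
errors = np.array([sgd_error() for _ in range(300)])
print("mean error:", errors.mean(), "std of error:", errors.std())
```

Plotting a histogram of `errors` reproduces, in miniature, the bell-shaped error distribution shown in Figure 2.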
Sampling Error Bounds for Diffusion Models
Diffusion models are used to sample from complex distributions, such as those in generative modeling. My research focuses on deriving error bounds for the sampling process of diffusion models. These bounds quantify the deviation of sampled outputs from the true distribution, enabling better understanding and control over the sampling quality.
Relevant Publications
Sampling Error and Score Matching for Diffusion Models
In Progress
This paper derives sharp error bounds for diffusion model sampling, quantifying the deviation of sampled outputs from the true distribution.
Applications
- Generative Modeling
Diffusion models are a class of generative models that learn to sample from complex data distributions. They iteratively refine random noise into structured outputs, such as images or text. Understanding the sampling error is crucial for ensuring high-quality outputs and controlling model behavior.
Figure 3: Diffusion model sampling process. A deep learning model is used to learn the parameters of a stochastic differential equation, and gradually transforms noise into a realistic image.
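The sampling process in Figure 3 can be sketched end to end when the data distribution is a simple Gaussian, because the score of the forward (Ornstein-Uhlenbeck) marginals is then available in closed form and no learned network is needed. The target mean, variance, horizon, and step count below are all illustrative assumptions; the point is only to show noise being transformed into samples by integrating the reverse SDE, the step where sampling error accumulates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target distribution: N(mu, sigma^2). For a Gaussian target, the forward
# OU process dx = -x dt + sqrt(2) dW has marginals with an exact score.
mu, sigma = 3.0, 0.5
T, n_steps, n_samples = 5.0, 500, 2000
dt = T / n_steps

def score(x, t):
    """Exact score of the OU marginal p_t for a Gaussian initial law."""
    mean_t = mu * np.exp(-t)
    var_t = sigma**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return -(x - mean_t) / var_t

# Start from the (near-)stationary prior N(0, 1) and integrate the reverse
# SDE dx = [-x - 2*score(x, t)] dt + sqrt(2) dW from t = T down to 0
# with Euler-Maruyama steps.
x = rng.standard_normal(n_samples)
for k in range(n_steps, 0, -1):
    t = k * dt
    drift = -x - 2.0 * score(x, t)
    x = x - drift * dt + np.sqrt(2 * dt) * rng.standard_normal(n_samples)

print("sample mean:", x.mean(), "sample std:", x.std())
```

The gap between the empirical statistics of `x` and the target (mu, sigma) is precisely the sampling error that the bounds above quantify; with a learned, imperfect score it also includes the score-matching error.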
Crowdsourcing and Label Aggregation
Crowdsourcing algorithms are used to aggregate labels from multiple workers to infer the true underlying labels. My research focuses on a method to cluster tasks by difficulty, which addresses the common problem where worker reliability changes depending on the task's complexity. This approach enables aggregation models like Dawid-Skene to better estimate worker reliability on a per-difficulty basis, improving the overall accuracy of the final labels.
Relevant Publications
Spectral Clustering for Crowdsourcing with Inherently Distinct Task Types
Under Review
*Denotes equal contribution.
Applications
- Noisy Labels in Machine Learning
This research was motivated by my experience working with medical images, where the inherent difficulty of labeling radiology data often leads to noisy annotations from experts. In such applications, the proposed difficulty-clustering method can be used to generate a more reliable ground truth from multiple conflicting labels. This ultimately improves the quality of the data used for training and evaluating diagnostic AI models, enhancing their performance and robustness.
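To make the difficulty-clustering idea concrete, here is a small synthetic sketch, not the paper's algorithm: binary tasks fall into an easy and a hard cluster, worker accuracy differs by cluster, and aggregation estimates each worker's reliability per cluster (a single Dawid-Skene-style step, with majority vote as pseudo-truth). The cluster assignments are assumed given here; discovering them via spectral clustering is the paper's contribution. All sizes and accuracy ranges are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic crowdsourcing data: binary labels, two difficulty clusters,
# and worker accuracy that depends on the cluster.
n_tasks, n_workers = 400, 8
truth = rng.integers(0, 2, n_tasks)
cluster = rng.integers(0, 2, n_tasks)              # 0 = easy, 1 = hard
acc = rng.uniform(0.55, 0.95, size=(n_workers, 2)) # per-(worker, cluster) accuracy
correct = rng.random((n_tasks, n_workers)) < acc[:, cluster].T
labels = np.where(correct, truth[:, None], 1 - truth[:, None])

# Step 1: initialize with plain majority vote.
mv = (labels.mean(axis=1) > 0.5).astype(int)

# Step 2: estimate each worker's accuracy separately within each cluster,
# using the majority vote as pseudo-ground-truth.
acc_hat = np.empty((n_workers, 2))
for c in (0, 1):
    mask = cluster == c
    acc_hat[:, c] = (labels[mask] == mv[mask, None]).mean(axis=0)
acc_hat = acc_hat.clip(0.05, 0.95)

# Step 3: re-aggregate with per-cluster log-odds weights.
w = np.log(acc_hat / (1 - acc_hat))   # shape (n_workers, 2)
signed = 2 * labels - 1               # votes in {-1, +1}
est = ((signed * w[:, cluster].T).sum(axis=1) > 0).astype(int)

print("majority vote accuracy:   ", (mv == truth).mean())
print("difficulty-aware accuracy:", (est == truth).mean())
```

Because a worker who is reliable on easy tasks may be unreliable on hard ones, the per-cluster weights down-weight exactly the votes that a single global reliability estimate would overtrust.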
Industry & Applied Research
Amazon
During my 12 months as an Applied Scientist Intern, I developed and validated machine learning models for key business problems. My work spanned two areas: developing Large Language Models (LLMs) to improve semantic search ranking, and architecting predictive models from user behavioral signals to solve critical data sparsity issues in product quality analysis.
Description
Machine Learning Pipeline
I employ a versatile modeling strategy, combining custom deep learning (PyTorch, HuggingFace) with AutoML (AutoGluon), followed by rigorous offline benchmarking. This end-to-end process ensures the final deliverables are not only powerful but also robustly validated against business objectives.
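The offline-benchmarking step of such a pipeline reduces to a simple pattern: train several candidates, score each on the same held-out split with one agreed metric, and rank them. The sketch below illustrates that pattern only, on synthetic data with stand-in models; it is not Amazon code, and the least-squares "model" and majority baseline are assumptions for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic binary-classification data with a held-out evaluation split.
X = rng.standard_normal((1000, 5))
w_true = np.array([1.5, -2.0, 0.5, 0.0, 0.0])
y = (X @ w_true + 0.3 * rng.standard_normal(1000) > 0).astype(int)
X_train, X_test = X[:800], X[800:]
y_train, y_test = y[:800], y[800:]

def fit_linear(X, y):
    """Least-squares classifier: a stand-in for any trained candidate."""
    w, *_ = np.linalg.lstsq(X, 2 * y - 1, rcond=None)
    return lambda Z: (Z @ w > 0).astype(int)

# Candidate models are scored on the identical held-out split.
candidates = {
    "majority_baseline": lambda Z: np.full(len(Z), int(y_train.mean() > 0.5)),
    "linear_model": fit_linear(X_train, y_train),
}
leaderboard = {name: float((model(X_test) == y_test).mean())
               for name, model in candidates.items()}
for name, score in sorted(leaderboard.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In practice the candidates would be the PyTorch/HuggingFace and AutoGluon models, and the metric would be the business-aligned one, but the harness shape is the same.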
VUNO Inc.
In my three years at VUNO, I evolved from a Researcher to a Research Team Lead, solving problems that arise when developing deep learning models for medical imaging. I led research on ML methodologies that resulted in first-author publications at premier AI venues (NeurIPS, AAAI) and contributed to the analysis of models across various modalities.
Description
Detecting abnormalities in Chest X-rays
Many diseases, such as tumors or infections, manifest as subtle visual patterns in chest X-ray images, including localized shadows, nodules, or irregular textures. Deep learning models can be trained to recognize these abnormalities. By detecting and localizing such patterns, these models help scale diagnostic support in medical imaging.
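As a toy illustration of localization (not the deployed models, which are deep networks trained on real radiographs), the sketch below plants a bright nodule-like blob in a synthetic image and finds it with a sliding-window intensity score. The image size, blob position, and window size are all assumptions; a trained detector would replace the heuristic scorer.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stand-in for a chest X-ray: noisy low-intensity background
# plus one bright Gaussian blob mimicking a nodule at (row=40, col=22).
img = rng.normal(0.2, 0.05, size=(64, 64))
yy, xx = np.mgrid[:64, :64]
img += 0.4 * np.exp(-(((yy - 40) ** 2 + (xx - 22) ** 2) / (2 * 3.0**2)))

# Score each pixel by the local mean over a k x k window and take the peak.
k = 7
pad = k // 2
padded = np.pad(img, pad, mode="edge")
windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
heatmap = windows.mean(axis=(2, 3))            # same 64x64 shape as img
peak = np.unravel_index(heatmap.argmax(), heatmap.shape)
print("detected abnormality near:", peak)
```

A detection model produces exactly this kind of spatial score map, with the heuristic local mean replaced by learned features that respond to shadows, nodules, and texture irregularities.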

Relevant Publications
Developed an active learning algorithm to select the most informative samples for labeling.
Proposed a method to train a neural network for out-of-distribution detection.
*Denotes equal contribution.
Developed a self-supervised learning method for arrhythmia detection using ECG data.
*Denotes equal contribution.