Research
Investigating the Foundations of Learning Algorithms
My research is centered on understanding the fundamental limits and capabilities of machine learning algorithms.
Core Research Topics
Central Limit Theorem for Stochastic Optimization
Relevant Publications
This paper establishes the first non-asymptotic CLT for Polyak-Ruppert averaging in two-time-scale stochastic approximation, providing precise error characterizations.
Sampling Error Bounds for Diffusion Models
Relevant Publications
Sampling Error and Score Matching for Diffusion Models
In Progress
This paper derives sharp error bounds for diffusion model sampling, quantifying the deviation of sampled outputs from the true distribution.
Crowdsourcing and Label Aggregation
Relevant Publications
Spectral Clustering for Crowdsourcing with Inherently Distinct Task Types
Under Review
*Denotes equal contribution.
Industry & Applied Research
Amazon
- Search Relevance (Palo Alto, CA): Fine-tuned a listwise LLM for search reranking using PyTorch + HuggingFace on AWS SageMaker (multi-GPU).
- Impact: Significantly improved ranking quality as measured by nDCG@1.
- Product Quality (Seattle, WA): Developed a predictive GBDT/NN ensemble model for pre-purchase customer satisfaction using Scala Spark, Polars, and AutoGluon.
- Impact: Generated dense defect probability scores to address log sparsity, enabling downstream applications in search ranking and fault attribution.
VUNO Inc.
- Developed deep learning architectures for analyzing chest X-rays and other medical images using PyTorch.
- Designed an active learning algorithm that significantly reduced labeling costs (published at NeurIPS 2022).
- Implemented out-of-distribution (OOD) detection modules for production diagnostic software to prevent silent failures on anomalous data (published at AAAI).
Relevant Publications
Developed an active learning algorithm for selecting which data to label.
Proposed a method to train a neural network for out-of-distribution detection.
*Denotes equal contribution.
Developed a self-supervised learning method for arrhythmia detection using ECG data.
