Work Experience
Machine Learning Engineer at SAIC
May 2024 - Present
- Engineered a collision prediction system analyzing flight trajectory data with Python and PostgreSQL, implementing automated alerts with avoidance recommendations.
- Led architecture and migration of an AI research department demo dashboard for national conferences, reducing cloud computing costs by $2,000 monthly.
- Enhanced a RAG chatbot system using Amazon Bedrock in air-gapped environments, optimizing retrieval mechanisms and improving security compliance.
- Developed backend validation methods using structured formats and LLMs, increasing model accuracy by 20% and improving edge case handling.
- Optimized the CAGE code validation process through web scraping and algorithmic improvements, reducing runtime by 85%.
- Implemented monitoring systems for distributed training jobs, reducing issue resolution time by 50% and improving system reliability.
Research Assistant at Stanford University
August 2023 - May 2024
- Collaborated on large-scale medical data analysis, processing 1.5+ TB of multi-modal datasets with NumPy, pandas, and SciPy.
- Applied Bayesian and causal inference models to heart health studies, contributing key insights to a Nature paper submission.
- Implemented parallel processing workflows, reducing model training time by 40% and enabling larger dataset ingestion.
Research Contributor at GAEIA
March 2024 - Present
- Member of the 2024 cohort of the Global Alliance on Ethics and Impact of Advanced Technologies, founded by Stanford University, participating in monthly sessions with industry professionals and PhD students on present-day ethical AI issues and cases.
- Collaborating with the UNHCR to propose AI tools for humanitarian aid, developing strategic approaches to leverage machine learning for refugee support systems and crisis response.
Software Engineer Intern at Spectrum
May 2023 - August 2023
- Architected and implemented a CI/CD pipeline using Docker and Kubernetes, streamlining development environment updates for 200+ engineers.
- Optimized system performance through containerization and orchestration, reducing environment update time from 10+ hours to 5 minutes.
Projects and Publications
The Observed Availability of Data and Code in Earth Science and Artificial Intelligence
- Published research in the Bulletin of the American Meteorological Society (BAMS) examining the accessibility of scientific data and code across Earth Science and AI journals, with a focus on reproducibility and innovation in scientific research.
- Analyzed data availability statements across multiple journals, finding that roughly 75% of articles with availability statements made at least some data publicly available, while code availability was less frequent in three out of four journals examined.
- Identified key barriers to open science, including dataset size limitations and restrictions from non-co-author entities, contributing to the broader discussion on scientific reproducibility.
Memory Hierarchy and Loop Optimization in PDN
- Explored code optimization techniques focusing on memory hierarchy and data layout, implementing loop transformations including unswitching, splitting, fission, and interchange to improve computational efficiency.
- Analyzed and implemented different forms of parallelism (SIMD, OpenMP, MPI, and instruction-level parallelism) in both local and supercomputer environments, demonstrating significant performance gains through combined parallelization strategies.
HackAnalyzer
- Created an AI-driven platform to help hackathon judges and participants assess project originality and impact, winning 2nd Place Overall and Most Creative Use of GitHub at HackHarvard 2023.
- Led prompt engineering efforts and UI/UX design, integrating OpenAI's API to analyze Devpost project descriptions and generate meaningful insights about project uniqueness and innovation potential.
- Developed the project architecture and frontend components using React and Next.js, implementing features for project similarity analysis and automated metrics generation.