CV
Applied Scientist at Microsoft with expertise in ML for information retrieval, computer vision, and NLP. IIT Jodhpur alumnus (2023).
Contact Information
| Name | Akarsh Upadhyay |
| Professional Title | Applied Scientist |
| Email | akarshupadhyayabc@gmail.com |
Professional Summary
Applied Scientist at Microsoft working on ML for ad retrieval and monetization. IIT Jodhpur alumnus with experience in information retrieval, computer vision, and NLP.
Experience
2024 - present, India
Applied Scientist
Microsoft (MSAN - Microsoft Audience Network)
- Intent-Based Retrieval: Led development of next-generation encoder models for ad retrieval. Improved P@100 by 14.32 absolute points over the production baseline while using 30x less training data.
- A/B Testing Impact: Delivered +0.61% CTR and +0.51% revenue lift in the NA region. The methodology was adopted as standard by the US and India MSAN teams.
- Unified Evaluation Framework: Designed a single source-of-truth evaluation framework across the IDC, STCA, and US MSAN teams, supporting encoder-based, Single/Multi-Intent, Generative Retrieval, and ANN variants.
- High-Impact Index for Product Ads: ML-driven retrieval system that identifies high-impact product offers; replaced the legacy system, delivering a 2.77% revenue lift.
2024, India
Machine Learning Engineer
Zomato
- Image Quality Score: Fine-tuned ResNet-50 for food image quality classification (F1: 90%). It is now a critical component that evaluates nearly every food image shown to users.
- Ads Creation: Designed an automated ad creation system using generative models for background generation and brand-specific styling.
- Photo Cake: Built a real-time image overlay system using OpenCV; launched on Mother’s Day, resulting in 3,000+ photo cake sales across India.
2023, India
ML Research Intern
Enterpret
- Text Similarity at Scale: Led a project to assess semantic similarity between sentences based on business value. Scaled the solution to 1M+ texts, achieving an F1 score of 85%.
- Explored prompt engineering, transformer fine-tuning, novel loss functions, and LLMs to improve results.
- Wrote comprehensive data preparation guidelines (20+ pages) and supervised the annotation team.
Education
2019 - 2023, Jodhpur, India
B.Tech
Indian Institute of Technology (IIT) Jodhpur
Electrical Engineering
- Core member of the Robotics Club; participated in AI/ML competitions and hackathons.
- Research on Document Layout Understanding under Dr. Santanu Chaudhary.
- Project on Medical Visual Question Answering using NLP + Computer Vision.
Publications
2024 GDP: Generic Document Pretraining to Improve Document Understanding
18th International Conference on Document Analysis and Recognition (ICDAR)
A generic document pretraining approach that improves document understanding across various downstream tasks.
Projects
Document Layout Understanding
Multi-modal transformer (DocFormer) for Visual Document Understanding (VDU) in English and multilingual settings. Guided by Dr. Santanu Chaudhary, IIT Jodhpur.
- Applied DocFormer architecture for document understanding tasks.
- Extended to multilingual document settings.
Medical Visual Question Answering
VQA system for X-ray/MRI scans (brain, kidney, lungs) using NLP and Computer Vision (TensorFlow). Achieved ~90% accuracy on the test set.
- Fused visual attention with NLP for multi-modal QA.
- Trained on medical imaging datasets, reaching ~90% accuracy on the test set.
Skills
Machine Learning & AI (Expert): Encoder Models, Representation Learning, Information Retrieval, Transformers, LLMs, Prompt Engineering, Fine-tuning, Generative AI
Computer Vision (Advanced): Image Classification, ResNet, CNN, OpenCV, Image Quality Assessment, Visual Document Understanding
Tools & Frameworks (Expert): PyTorch, TensorFlow, Python, Git, GPT-4
Languages
English: Fluent
Hindi: Native
Interests
Research Interests: Information Retrieval, Representation Learning, Data Quality, Document Understanding, Multimodal AI