CV | Akarsh Upadhyay

Contact Information

Name	Akarsh Upadhyay
Professional Title	Applied Scientist
Email	akarshupadhyayabc@gmail.com

Professional Summary

Applied Scientist at Microsoft working on ML for ad retrieval and monetization. IIT Jodhpur alumnus with experience in information retrieval, computer vision, and NLP.

Experience

2024 - present

India
Applied Scientist

Microsoft (MSAN - Microsoft Audience Network)
- Intent-Based Retrieval: Led development of next-generation encoder models for ad retrieval. Improved P@100 by 14.32 absolute points over the production baseline while using 30x less training data.
- A/B Testing Impact: +0.61% CTR and +0.51% revenue lift in the NA region. Methodology adopted as the standard by the US and India MSAN teams.
- Unified Evaluation Framework: Designed single source-of-truth evaluation framework across IDC, STCA, and US MSAN teams, supporting encoder-based, Single/Multi-Intent, Generative Retrieval, and ANN variants.
- High-Impact Index for Product Ads: ML-driven retrieval system identifying high-impact product offers; replaced legacy system, delivering 2.77% revenue lift.
2024 - 2024

India
Machine Learning Engineer

Zomato
- Image Quality Score: Fine-tuned ResNet-50 for food image quality classification (F1: 90%). Now a critical component evaluating nearly every food image shown to users.
- Ads Creation: Designed automated ad creation system using generative models for background generation and brand-specific styling.
- Photo Cake: Built real-time image overlay system using OpenCV; launched on Mother’s Day, resulting in 3,000+ photo cake sales across India.
2023 - 2023

India
ML Research Intern

Enterpret
- Text Similarity at Scale: Led a project to assess semantic similarity between sentences based on business value. Scaled the solution to 1M+ texts, achieving an F1 score of 85%.
- Explored prompt engineering, transformer fine-tuning, novel loss functions, and LLMs for optimal results.
- Created comprehensive data preparation guidelines (20+ pages) and supervised the annotation team.

Education

2019 - 2023

Jodhpur, India
B.Tech

Indian Institute of Technology (IIT) Jodhpur

Electrical Engineering
- Core Member of the Robotics Club - participated in AI/ML competitions and hackathons.
- Research on Document Layout Understanding under Dr. Santanu Chaudhury.
- Project on Medical Visual Question Answering using NLP + Computer Vision.

Publications

2024

GDP: Generic Document Pretraining to Improve Document Understanding

18th International Conference on Document Analysis and Recognition (ICDAR)

A generic document pretraining approach that improves document understanding across various downstream tasks.

Projects

Document Layout Understanding

Multi-modal transformer (DocFormer) for Visual Document Understanding (VDU) in English and multilingual settings. Guided by Dr. Santanu Chaudhury, IIT Jodhpur.
- Applied DocFormer architecture for document understanding tasks.
- Extended to multilingual document settings.
Medical Visual Question Answering

VQA system for X-ray/MRI scans (brain, kidney, lungs) using NLP and Computer Vision (TensorFlow). Achieved ~90% accuracy on the test set.
- Fused visual attention with NLP for multi-modal QA.
- Trained and evaluated on medical imaging datasets.

Skills

Machine Learning & AI (Expert): Encoder Models, Representation Learning, Information Retrieval, Transformers, LLMs, Prompt Engineering, Fine-tuning, Generative AI

Computer Vision (Advanced): Image Classification, ResNet, CNN, OpenCV, Image Quality Assessment, Visual Document Understanding

Tools & Frameworks (Expert): PyTorch, TensorFlow, Python, Git, GPT-4

Languages

English: Fluent

Hindi: Native

Interests

Research Interests: Information Retrieval, Representation Learning, Data Quality, Document Understanding, Multimodal AI
