I am a Pre-Doctoral Researcher at Google DeepMind, where I work with Dr. Dheeraj Nagaraj, Dr. Karthikeyan Shanmugam, and Dr. Prateek Jain. My current research focuses primarily on diffusion models and on improving the efficiency of large language models.
Before DeepMind, I was a Research Fellow (AI Resident) at Microsoft Research Lab - India with Dr. Manik Varma. My work focused on developing large-scale machine learning models that are parameter-efficient and capable of generalizing to new classes and out-of-distribution data.
Keywords: Efficiency in Large Language Models, Generative Modeling, Large-Scale Machine Learning, Information Retrieval
I graduated with a B.Tech. in Computer Science from the Indian Institute of Technology (IIT) Gandhinagar in 2022. For more details about my background, refer to my CV. The best way to contact me is via email.
Our work on mitigating missing labels in large-scale retrieval systems was presented at SIGKDD 2025 in Toronto, Canada. [Session link]
The Gemini 2.5 technical report is now available. Gemini 2.5 Pro is our most capable model yet, achieving state-of-the-art performance on frontier coding and reasoning benchmarks.
Our work on Gibbs-style generative modeling for discrete-continuous data, which does not assume a factorized denoised distribution, has been accepted to the ICLR DeLTa Workshop 2025.
Our work on mitigating missing labels in large-scale retrieval systems by leveraging world knowledge from LLMs has been accepted at SIGKDD 2025.
Joined Google DeepMind in the Machine Learning and Optimization (MLO) Team.
Presented our recent work on zero-shot retrieval at SIGKDD 2024 in Barcelona, Spain. (Oral Presentation)
Released the benchmark ORCAS-800K, a dataset mapping user queries on the Bing search engine to the relevant subset of 800K web URLs.
Our work addressing the semantic gap and data paucity issues in Siamese encoder training was accepted at SIGKDD 2023.
Joined Microsoft Research in the Extreme Classification Team.
Graduated with a bachelor's degree in Computer Science and Engineering from IIT Gandhinagar.
Gemini 2.5 Technical Report
Publication
Gemini Team (including Sachin Yadav), Google DeepMind
arXiv, 2025 PDF
Interleaved Gibbs Diffusion for Constrained Generation
Publication
Gautham Govind Anil, Sachin Yadav, Dheeraj Nagaraj, Karthikeyan Shanmugam, Prateek Jain
ICLR DeLTa Workshop 2025 Website PDF
On the Necessity of World Knowledge for Mitigating Missing Labels in Extreme Classification
Publication
Jatin Prakash, Anirudh Buvanesh, Bishal Santra, Deepak Saini, Sachin Yadav, Jian Jiao, Yashoteja Prabhu, Amit Sharma, Manik Varma
SIGKDD 2025 PDF
Retrieval of novel keywords for search
Patent (Filed)
Deepak Saini, Jian Jiao, Sachin Yadav, Bhawna Paliwal, Anirudh Buvanesh, Manik Varma
US Patent App. 18/608,061, 2025 Page
Extreme Meta-Classification for Large-Scale Zero-Shot Retrieval
Publication
Sachin Yadav*, Deepak Saini*, Anirudh Buvanesh*, Bhawna Paliwal, Kunal Dahiya, Siddarth Asokan, Yashoteja Prabhu, Jian Jiao, and Manik Varma
SIGKDD 2024 PDF Slides
ORCAS-800K
Benchmark
Dataset mapping user queries on the Bing search engine to the relevant subset of 800K web URLs
Sachin Yadav*, Deepak Saini* Website
Deep Encoders with Auxiliary Parameters for Extreme Classification
Publication
Kunal Dahiya, Sachin Yadav, Sushant Sondhi, Deepak Saini, Sonu Mehta, Jian Jiao, Sumeet Agarwal, Purushottam Kar, and Manik Varma
SIGKDD 2023 Website PDF
tinygp: the tiniest of Gaussian Process libraries
Library
Dan Foreman-Mackey, Sachin Yadav, Andrew Fowlie, René Tronsgaard, Steve Schmerler, Thomas Killestein
GitHub
Deep Gaussian Processes for Air Quality Inference: Extended Abstract
Publication
Aadesh Desai*, Eshan Gujarathi*, Saagar Parikh*, Sachin Yadav*, Zeel Patel, and Nipun Batra
CODS-COMAD 2023 PDF
Gemini Diffusion
Gemini 2.5
Microsoft Bing
Pre-Doctoral Researcher at Google DeepMind
Research Fellow at Microsoft Research India
Research Intern at Samsung Research
B.Tech. at IIT Gandhinagar
High School at Mount Abu Public School