I am a Pre-Doctoral Researcher at Google DeepMind, where I work with Dr. Nitish Gupta, Dr. Karthikeyan Shanmugam, and Dr. Prateek Jain. My current research focuses primarily on diffusion models and on improving the efficiency of large language models.

Before DeepMind, I was a Research Fellow (AI Resident) at Microsoft Research Lab - India, where I worked with Dr. Manik Varma. My work focused on developing large-scale machine learning models that are parameter-efficient and capable of generalizing to new classes and out-of-distribution data.

Keywords: Efficiency in Large Language Models, Generative Modeling, Large-Scale Machine Learning, Information Retrieval

I graduated with a B.Tech. in Computer Science from the Indian Institute of Technology (IIT) Gandhinagar in 2022. For more details about my background, refer to my CV. The best way to contact me is via email.

Updates

Nov 2024

Our work on mitigating missing labels in large-scale retrieval systems by leveraging world knowledge from LLMs has been accepted at SIGKDD 2025.

Oct 2024

Joined Google DeepMind in the Machine Learning and Optimization (MLO) Team.

Jul 2024

Presented our recent work on zero-shot retrieval at SIGKDD 2024 in Barcelona, Spain. (Oral Presentation)

Sep 2023

Released the benchmark ORCAS-800K, a dataset mapping user queries on the Bing search engine to the relevant subset of 800K web URLs.

May 2023

Our work addressing the semantic gap and data paucity issues in Siamese encoder training was accepted at SIGKDD 2023.

Jul 2022

Joined Microsoft Research in the Extreme Classification Team.

Jul 2022

Graduated with a bachelor's degree in Computer Science and Engineering from IIT Gandhinagar.

Publications, Benchmarks and Libraries

On the Necessity of World Knowledge for Mitigating Missing Labels in Extreme Classification Publication
Jatin Prakash, Anirudh Buvanesh, Bishal Santra, Deepak Saini, Sachin Yadav, Jian Jiao, Yashoteja Prabhu, Amit Sharma, Manik Varma
SIGKDD 2025   PDF

Extreme Meta-Classification

Extreme Meta-Classification for Large-Scale Zero-Shot Retrieval Publication
Sachin Yadav*, Deepak Saini*, Anirudh Buvanesh*, Bhawna Paliwal, Kunal Dahiya, Siddarth Asokan, Yashoteja Prabhu, Jian Jiao, and Manik Varma
SIGKDD 2024   PDF Slides

ORCAS-800K Benchmark
Dataset mapping user queries on the Bing search engine to the relevant subset of 800K web URLs
Website

Deep Encoders with Auxiliary Parameters

Deep Encoders with Auxiliary Parameters for Extreme Classification Publication
Kunal Dahiya, Sachin Yadav, Sushant Sondhi, Deepak Saini, Sonu Mehta, Jian Jiao, Sumeet Agarwal, Purushottam Kar, and Manik Varma
SIGKDD 2023   Website PDF

TinyGP

tinygp: the tiniest of Gaussian Process libraries Library
Dan Foreman-Mackey, Sachin Yadav, Andrew Fowlie, René Tronsgaard, Steve Schmerler, Thomas Killestein
GitHub

Deep Gaussian Processes

Deep Gaussian Processes for Air Quality Inference: Extended Abstract Publication
Aadesh Desai*, Eshan Gujarathi*, Saagar Parikh*, Sachin Yadav*, Zeel Patel, and Nipun Batra
CODS-COMAD 2023   PDF

Present and Past Affiliations

Pre-Doctoral Researcher at Google DeepMind

Research Fellow at Microsoft Research India

Research Intern at Samsung Research

B.Tech. at IIT Gandhinagar

High School at Mount Abu Public School