I am Kshitij Ambilduke, a first-year master’s student in Computer Science (AI Track) at Université Paris-Saclay. My research interests center on Natural Language Processing (NLP), and I am particularly excited about extending NLP to other modalities such as speech and vision.
Before my master’s, I worked on extending LLMs to speech as a Research Assistant under the guidance of Prof. André F. T. Martins.
I completed my bachelor’s in Electronics and Communication at Visvesvaraya National Institute of Technology (VNIT), India. I did my bachelor’s thesis on image captioning with Dr. Anamika Singh. During the summer of my pre-final year, I worked on interpretable Visual Question Answering under the supervision of Prof. André F. T. Martins and Prof. Bruno Martins.
I was also fortunate to be a part of IvLabs, the AI and Robotics lab of VNIT, where I developed most of my research interests thanks to the guidance of **Prof. Shital Chiddarwar**.
Résumé
LinkedIn: KshitijAmbilduke
Google Scholar: KshitijAmbilduke
GitHub: KshitijAmbilduke
$$ \color{337ea9}\rule{630px}{5px} $$
From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM
Kshitij Ambilduke, Ben Peters, Sonal Sannigrahi, Anil Keshwani, Tsz Kin Lam, Bruno Martins, Marcely Zanon Boito, André F. T. Martins
Under review, Preprint: https://arxiv.org/abs/2503.10620
Attending to Transforms: A Survey on Transformer-based Image Captioning
Kshitij Ambilduke, Thanmay Jayakumar, Luqman Farooqui, Himanshu Padole, Anamika Singh
Paradigm Shifts in Communication, Embedded Systems, Machine Learning and Signal Processing (PCEMS 2023)
Enhancing Context through Contrast
Kshitij Ambilduke, Aneesh Shetye, Diksha Bagade, Rishika Bhagwatkar, Khurshed Fitter, Prasad Vagdargi, Shital Chiddarwar
NeurIPS 2021 Workshop on Pre-registration in Machine Learning