About me
I’m Neeraja Kirtane, a recent MSCS graduate from the University of Illinois Urbana-Champaign (UIUC), where I was advised by Prof. Hao Peng and Prof. Dilek Hakkani-Tür.
My research focuses on building trustworthy and interpretable NLP systems. At UIUC, I worked on:
- FactCheckmate: Preemptively detecting and mitigating hallucinations in large language models (LLMs) by analyzing their hidden states.
- Jailbreaking LLMs: Designing scientific-sounding prompts that elicit biased or toxic responses from LLMs, revealing vulnerabilities in instruction following.
Currently, I am:
- Collaborating with Prof. Kuan-Hao Huang (Texas A&M University) on probing reasoning in multilingual LLMs using interpretability tools.
- Interning at MathGPT.ai, where I’m developing a benchmark to evaluate the reasoning robustness of state-of-the-art models and fine-tuning small language models (SLMs) for AI tutoring applications.
Previously, I was a Post-Baccalaureate Research Fellow at the Robert Bosch Centre for Data Science and AI (RBCDSAI) at IIT Madras, advised by Prof. Balaraman Ravindran, where I built intelligent tools to generate Wikipedia biographies for women in STEM.
I’m currently seeking PhD opportunities or full-time research roles in AI safety, robustness, and interpretability.
If you’re interested in my work or would like to collaborate, feel free to reach out via Email or LinkedIn. You can also view my CV for more details.