Publications

FactCheckmate: Preemptively Detecting and Mitigating Hallucinations in LMs

Published in EMNLP Findings, 2025

Deema Alnuhait*, Neeraja Kirtane*, Muhammad Khalifa, Hao Peng (* - equal contribution)

Download here

Evaluation of Large Language Models’ Robustness to Linguistic Variations in Mathematical Reasoning

Under review, 2025

Neeraja Kirtane*, Yuvraj Khanna*, Peter Relan (* - equal contribution)

Download here

LLMs are Vulnerable to Malicious Prompts Disguised as Scientific Language

Published in SoLAR workshop at COLM, 2025

Yubin Ge*, Neeraja Kirtane*, Hao Peng, Dilek Hakkani-Tür (* - equal contribution)

Download here

Efficient Gender Debiasing of Pre-trained Indic Language Models

Published in Deployable AI workshop, AAAI, 2023

Neeraja Kirtane, Aditya Kane, V Manushree

Quantified gender bias in the Hindi language model MuRIL. Fine-tuned the model efficiently by unfreezing less than 1 percent of its parameters. Results showed that debiasing reduced the measured bias.

Download here

Hidden Voices: Reducing gender data gap, one Wikipedia article at a time

Published in Wiki Workshop, 2023

Neeraja Kirtane, Anuraag Shankar, Chelsi Jain, Ganesh Katrapati, Senthamizhan V, Raji Baskaran, Balaraman Ravindran

Wikipedia is the most widely available structured repository of information on the Internet. However, a gender disparity has been observed in Wikipedia articles, and it is a major issue. We aim to tackle this problem by using machine learning methods to generate wiki-like biographies for notable women on Wikipedia. We present Hidden Voices, a project that will assist wiki editors and enthusiasts in writing more biographies about women, thereby increasing their representation on Wikipedia.

Download here

ReGrAt: Regularization in graphs using attention mechanism to handle class imbalance

Published in GCLR workshop, AAAI, 2023

Neeraja Kirtane, Jeshuren Chelladurai, Balaraman Ravindran, Ashish Tendulkar

Used an attention mechanism to tackle class imbalance in graphs. Built a custom loss function by adding a regularizer that handles the imbalance. Achieved better results than existing methods.

Download here

Mitigating gender stereotypes in Hindi and Marathi

Published in GeBNLP, NAACL, 2022

Neeraja Kirtane and Tanvi Anand

Created a dataset of occupations and emotions in Hindi and Marathi. Proposed methods to quantify the bias in word embeddings. Used existing methods to debias the embeddings.

Download here

Transformer based ensemble for emotion detection

Published in WASSA, ACL, 2022

Aditya Kane, Shantanu Patankar, Sahil Khose, Neeraja Kirtane

Developed an ensemble-based solution consisting of multiple ELECTRA and BERT models. Proposed methods for synthetically generating datasets to mitigate class imbalance. Studied the behaviour of our models on various raw and synthetically generated datasets.

Download here

Occupational Gender Stereotypes in Indian Languages

Published in WiNLP, EMNLP, 2021

Neeraja Kirtane and Tanvi Anand

Devised a metric to calculate bias in gendered languages such as Hindi and Marathi. Applied this metric to the ULMFiT language model and quantified the bias present.

Download here