Hidden Voices: Reducing gender data gap, one Wikipedia article at a time

Published in Wiki Workshop, 2023

Neeraja Kirtane, Anuraag Shankar, Chelsi Jain, Ganesh Katrapati, Senthamizhan V, Raji Baskaran, Balaraman Ravindran

Wikipedia is the most widely available structured repository of information on the Internet. However, gender disparity has been observed in its articles, and it remains a major issue. We aim to tackle this problem by using machine learning methods to generate wiki-like biographies for notable women on Wikipedia. We present Hidden Voices, a project that will assist wiki editors and enthusiasts in writing more biographies about women, thereby increasing their representation on Wikipedia.

Download here

Mitigating gender stereotypes in Hindi and Marathi

Published in GeBNLP, NAACL, 2022

Neeraja Kirtane and Tanvi Anand

Created a dataset of occupations and emotions in Hindi and Marathi. Proposed methods to quantify bias in word embeddings and applied existing methods to debias them.

Download here

Transformer based ensemble for emotion detection

Published in WASSA, ACL, 2022

Aditya Kane, Shantanu Patankar, Neeraja Kirtane, Sahil Khose

Developed an ensemble-based solution consisting of multiple ELECTRA and BERT models. Proposed methods for synthetically generating datasets to mitigate class imbalance. Studied the behaviour of the models on various raw and synthetically generated datasets.

Download here

Occupational Gender Stereotypes in Indian Languages

Published in WiNLP, EMNLP, 2021

Neeraja Kirtane and Tanvi Anand

Devised a metric to calculate bias in gendered languages such as Hindi and Marathi. Applied this metric to the ULMFiT language model to quantify the bias present.

Download here