Track: General: Ethics
This talk attempts to measure gender bias in some of the most popular language models and proposes a way to reduce that bias, with the aim of promoting social inclusion and diversity. We cover debiasing techniques for both contextual and non-contextual word embeddings, and we compare the biases in different models such as Flair, BERT and GloVe. The dataset consists of Winograd-schema style sentences in which the entities are people referred to by their occupation (e.g. the nurse, the doctor, the carpenter). The use of AI in sensitive areas, including hiring, criminal justice and healthcare, makes it all the more important to look under the hood for bias and fairness. AI is being shaped by flawed societal biases, and the underlying data, rather than the algorithm itself, is most often the main source of the issue. We also look at how fine-tuning and projection methods can be used to overcome those biases in models. There have been several cases where Google Translate or other language models have given racially biased or gender-biased results. For example, when a gender-neutral language like Finnish is translated into English, the results are male-biased. Word embeddings trained on news articles may exhibit the gender stereotypes found in society. We have fine-tuned a model and have tried debiasing non-contextual embeddings.
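To make the idea of a projection method concrete, here is a minimal sketch, not the speakers' implementation. It assumes that "projection" means estimating a gender direction from a few definitional word pairs and removing that component from occupation vectors. The 4-dimensional vectors, the `EMB` table and the function names are all made up for illustration; real GloVe vectors would be loaded from a file instead.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def gender_direction(pairs):
    """Average the (male - female) differences and normalize the result."""
    dim = len(pairs[0][0])
    mean_diff = [
        sum(m[i] - f[i] for m, f in pairs) / len(pairs) for i in range(dim)
    ]
    return normalize(mean_diff)

def project_out(vec, direction):
    """Remove the component of vec that lies along the unit vector direction."""
    scale = dot(vec, direction)
    return [a - scale * d for a, d in zip(vec, direction)]

# Hypothetical toy embeddings (assumption: not taken from any real model).
EMB = {
    "he": [1.0, 0.2, 0.0, 0.3],
    "she": [-1.0, 0.2, 0.0, 0.3],
    "man": [0.9, 0.1, 0.4, 0.0],
    "woman": [-0.9, 0.1, 0.4, 0.0],
    "nurse": [-0.6, 0.5, 0.1, 0.7],
    "carpenter": [0.5, 0.3, 0.8, 0.1],
}

g = gender_direction([(EMB["he"], EMB["she"]), (EMB["man"], EMB["woman"])])

# Debias only the occupation words; definitional words keep their gender.
debiased = {w: project_out(EMB[w], g) for w in ("nurse", "carpenter")}

for word in ("nurse", "carpenter"):
    before = dot(EMB[word], g)
    after = dot(debiased[word], g)
    print(f"{word}: bias before={before:+.2f}, after={after:+.2f}")
```

The signed dot product with the gender direction serves here as a simple bias score: negative values lean towards the female pole, positive values towards the male pole. After projection the score is zero by construction, which is why this kind of approach only applies directly to non-contextual embeddings, where each word has a single fixed vector.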
Recorded at the PyConDE & PyData Berlin 2022 conference, April 11-13 2022. https://2022.pycon.de More details at the conference page: https://2022.pycon.de/program/HXCMKR Twitter: https://twitter.com/pydataberlin Twitter: https://twitter.com/pyconde