
Women in Data Science: Humans are Biased, not Technology

27 May 2024


The Women in Data Science (WiDS) conference in Ghent, held on May 17, is part of a global initiative designed to inspire and educate data professionals worldwide, regardless of gender, and to support women in the field. Two of our colleagues, Rossana Della Ghelfa (Smart Data Consultant) and Cin Vermeiren (Smart Data Specialised Sales), attended this first-ever WiDS conference held in Belgium, and they are happy to share some insights with you. 

Why have a conference specifically for women in data science? 

Rossana & Cin

Rossana (left): The conference offered a fantastic opportunity to interact and exchange ideas with other female data science professionals from both academia and industry, which can be particularly valuable for women who may feel somewhat isolated in their workplaces. However, there may be a slight misunderstanding about WiDS conferences: while the focus is on empowering women and the speakers are female, WiDS conferences are typically open to everyone, regardless of gender. The conference highlighted the importance of diversity and inclusion in data science, including bridging the gender gap. So, in my opinion, the key messages were valuable not only for women, but for men as well.  

Cin (right): One of the reasons for having the WiDS conference is to address gender bias in data science, and more particularly in data models, which are still predominantly ‘white male’. In her session, Malvina Nissim, a Computational Linguistics professor at the University of Groningen, zoomed in on bias and highlighted several examples of gender bias found in language models such as Google Translate and ChatGPT. One common example is the tendency to associate nurses with women and doctors or professors with men, which reinforces existing gender stereotypes. That is how the model works by default today: if the data samples used to train language models are not representative of the entire population, certain groups, including women, will be underrepresented. The conference in Ghent aimed to raise awareness of this issue and to promote diversity and inclusion within the field. 

Gender bias in Google Translate

Rossana: One of the ways to achieve this is by encouraging more participation from women and ensuring the data used to build models comes from diverse groups. This enriches the data and helps create more accurate and unbiased models, which is crucial for ensuring accuracy, impartiality, and not least the responsible use of technology such as AI.  

So, actually, humans are biased, not technology? 

Rossana: One of the key takeaways for me was that we need to focus on removing prejudice from society, not from technology itself. Technology can be incredibly powerful; there is virtually no limit to its possibilities. However, data models and algorithms often reflect the biases present in the data they are trained on: data generated by humans, who can make mistakes and hold unconscious biases. This, at the same time, raises another concern. Can we overcorrect or overcompensate? What if, in our best efforts to minimize or even remove bias, the pendulum swings too far in the other direction? 

Overcompensation

Cin: That was the case in February, when Google's AI model Gemini was accused of anti-white bias after it generated images such as an Asian female pope. This highlights the need for a nuanced approach: while ensuring diversity in data and data modeling is crucial, it is important to strive for factual accuracy as well. 

Censorship

Rossana: During her session, Malvina Nissim also challenged us to think about the relevance of censorship in ChatGPT. The example she provided was that if you ask ChatGPT to make a joke about women, the AI will state that it’s inappropriate; if asked to make one about men, however, it will deliver the joke, which raised a few eyebrows. For me, data is capital and AI holds immense potential, but we are not there yet. There are still many aspects to consider, especially on sensitive topics. 

Cin: Similarly, in my personal opinion, the complex and sometimes conflicting relationship between AI and sustainability also requires our careful consideration. That said, I must admit that, in her keynote, Jenny Ambukiyenyi Onya brought a very positive story on data and sustainable economics, talking about Halisi Livestock, which uses advanced AI and biometrics to help farmers and financial institutions improve capital access for underserved populations, thereby transforming the agri-fintech landscape in Africa.  

Data science is everywhere in the news these days, especially with the rise of AI. Is it purely a technical story, or does it go beyond that?  

Cin: Technology, and AI in particular, undoubtedly improves productivity and efficiency, especially for repetitive, dangerous, or dirty tasks. However, the focus should shift towards leveraging AI to augment human capabilities, not to replace them. Imagine us becoming ‘superhumans,’ with technology enhancing our thinking and acting. Think about AI-powered implants that can enable people to hear again or walk after paralysis. In this respect, I would like to share a personal recommendation and favorite of mine: "The Digital Dilemma", a documentary series on VRT MAX by journalist Tim Verheyden that explores the impact of technological innovation on people and society.  

What is the added value of women in data science?  

Cin: Our brains work differently, and diversity of thought is crucial in data science and business translation. People from different backgrounds bring unique perspectives that can lead to innovative solutions. At KBC, a high number of data scientists are women. This success is due to a combination of factors: STEM educational initiatives, the strong math skills and academic achievements often found in women, and their potential for different approaches. 

Rossana: A more diverse group will always deliver more diverse solutions. We are now focusing on women, but it also applies to other minorities, such as other nationalities, ages, or people with a disability. For me, it is a classic chicken-and-egg situation, or a catch-22 if you want: we need more women with different backgrounds and mindsets in data science to reduce bias, but the lack of diversity might discourage some women from entering the field. Conferences like the one in Ghent can play a crucial role in raising awareness and breaking those boundaries. 

Do you have more insights to share with us?  

Cin: What struck me was the session on the invisibility of female data in scientific research. It highlighted a critical issue: the lack of female data in scientific research can cause problems such as ineffective medication or adverse drug reactions, exposing women to overdosing. There is even a lack of data during pregnancy, often motivated by the desire to avoid risks, which leaves women with limited options for managing various health conditions during their pregnancy. So, let us please close the gap.  

Female Data

Thank you both very much for sharing and helping us break the boundaries! 

Do you have a passion for data science or ICT and want to make a real impact?  

Then come and join us at Inetum! We offer a diverse and inclusive work environment with mentorship and professional development opportunities, where you can contribute to innovative solutions that drive our success and shape the unbiased future of technology. If you are looking for a place where you can grow and make a significant impact, do reach out.