Genie

Inspired by the Gender Gap Tracker developed at Simon Fraser University, we at LA NACION's Machine Learning Lab set out to develop an AI model that tracks the gender gap in Spanish-language news articles.

🔗 View Project


The model tracks the gender gap in Spanish-language news articles by identifying whether the sources and people quoted in original content are feminine, masculine, or non-binary.

We built the model in Python using Natural Language Processing (NLP), a branch of Artificial Intelligence. For person recognition, we used the spaCy library with Spanish-language support.
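As a rough illustration of the person-recognition step, the sketch below collects person names from a document processed by spaCy. It is not the project's actual code: the model name `es_core_news_sm` and the helper names `load_spanish_pipeline` and `extract_people` are assumptions made for this example. It relies on the fact that spaCy's Spanish pipelines tag person entities with the label `PER`.

```python
from typing import List


def load_spanish_pipeline(model_name: str = "es_core_news_sm"):
    """Load a spaCy Spanish pipeline.

    The specific model is an assumption; it must be installed separately,
    e.g. with `python -m spacy download es_core_news_sm`.
    """
    import spacy  # imported lazily so extract_people works without spaCy

    return spacy.load(model_name)


def extract_people(doc) -> List[str]:
    """Return the unique person names in a processed document, in order.

    `doc` is expected to expose `.ents`, where each entity has `.text`
    and `.label_`, as a spaCy `Doc` does.
    """
    seen = set()
    people = []
    for ent in doc.ents:
        name = ent.text.strip()
        # Spanish spaCy models label people as "PER"
        if ent.label_ == "PER" and name and name not in seen:
            seen.add(name)
            people.append(name)
    return people


# Example usage (requires spaCy and the Spanish model to be installed):
# nlp = load_spanish_pipeline()
# doc = nlp("La ministra María Pérez habló con Juan Gómez en Buenos Aires.")
# print(extract_people(doc))
```

Deduplicating by exact string is a simplification; a production pipeline would likely also need to merge partial mentions (a surname alone) with the full name seen earlier in the article.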

Gender is determined by consulting several databases and APIs that map names to genders, such as Wikidata and RENAPER (Argentina's National Registry of Persons), which provided 50 million records in response to a request for access to public information (FOIA). In addition to determining each person's gender, the tool identifies quotations and the verbs used.
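The lookup described above could be sketched as a two-stage fallback: first check a table of known individuals, then fall back to first-name statistics. Everything in this example is hypothetical, including the tiny in-memory tables (toy stand-ins for person-level sources like Wikidata and registry-style name counts like RENAPER), the function names, and the 0.9 confidence threshold. It does not reflect the real schemas or the project's actual logic.

```python
import unicodedata
from typing import Dict, Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace for stable lookups."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


# Hypothetical person-level table (a stand-in for a source such as Wikidata).
# Person-level records are the only place a non-binary label could come from,
# since it cannot be inferred from first-name statistics.
KNOWN_PEOPLE: Dict[str, str] = {
    "ana garcia": "feminine",
}

# Hypothetical first-name counts (a stand-in for registry data); toy numbers.
FIRST_NAME_COUNTS: Dict[str, Dict[str, int]] = {
    "maria": {"feminine": 990, "masculine": 10},
    "juan": {"feminine": 5, "masculine": 995},
    "alex": {"feminine": 400, "masculine": 600},
}


def gender_from_first_name(
    first_name: str,
    counts: Dict[str, Dict[str, int]],
    min_share: float = 0.9,
) -> Optional[str]:
    """Return the dominant gender for a first name, or None if ambiguous."""
    tally = counts.get(first_name)
    if not tally:
        return None
    total = sum(tally.values())
    if total == 0:
        return None
    label, top = max(tally.items(), key=lambda kv: kv[1])
    return label if top / total >= min_share else None


def resolve_gender(full_name: str) -> str:
    """Try the person-level table first, then first-name statistics."""
    key = normalize(full_name)
    if not key:
        return "unknown"
    if key in KNOWN_PEOPLE:
        return KNOWN_PEOPLE[key]
    guess = gender_from_first_name(key.split()[0], FIRST_NAME_COUNTS)
    return guess or "unknown"


print(resolve_gender("María López"))  # feminine
print(resolve_gender("Alex Ruiz"))    # unknown (below the 0.9 threshold)
```

Returning "unknown" for ambiguous names, instead of forcing a guess, keeps low-confidence cases out of the aggregate counts.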

My role

As the product owner, I coordinated the overall development and was also actively involved in building the AI model. I worked closely with data scientists and developers to design and implement the Natural Language Processing algorithms, using the spaCy library for Spanish-language processing. My role combined overseeing the project's progress with contributing directly to the model, with a focus on its accuracy in determining gender and extracting relevant data from articles.

Impact

The Genie project has had a significant impact in fostering gender awareness within the newsroom. Since its launch, the tool has been integrated into several editorial sections, including Community, Society, and Culture. Having processed more than 66,000 articles and analyzed around 136,000 individuals by gender, it gives journalists insight into the gender balance of news coverage and helps them make data-driven decisions to improve representation and inclusivity in their reporting.