By Mara Tanelli and various authors
Recent technological advances are making it possible to collect an extraordinary amount of data on how individuals behave, react, express themselves, move, and interact with each other. These data can be stored and processed with “intelligent” algorithms to extract knowledge and hidden information and, often, to achieve a deeper understanding of the context. Such a technology-driven perspective is of great interest in the social sciences, as technology supports the analysis of large data sets and the development of tools that can bring the results to a wide public.
In this article, we discuss an example of such cross-fertilization among economics, linguistics and machine learning, applied to analysing the impact of gender-biased language in corporate documents. The work has been the object of a joint research project between Politecnico di Milano and NTT-Data, with the linguistic support of Prof. Cristina Mariotti of the Università degli Studi di Pavia.
To understand how language affects the perpetuation and diffusion of gender biases and stereotypes, it is worth noting that the phenomenon goes well beyond the corporate world. For example, Tenenbaum et al. (2003) showed that scientific discourse between parents and children has inherent gender-related differences from early childhood, with repercussions on the children's later study choices. In general, the same study found that parents were more likely to believe that science was less interesting and more difficult for daughters than for sons, and fathers tended to use more cognitively demanding speech with sons than with daughters when performing science tasks.
Building on these early signs, a body of literature investigates the dichotomy between so-called “agentic” and “communal” traits, which are a hallmark of gender bias in behavioural psychology; see Abele (2003). These studies conceptualize both the gender stereotype and the gender self-concept as the distinction between more “masculine” and more “feminine” traits. The former are agentic-instrumental traits (e.g., active, decisive). The latter are communal-expressive ones (e.g., caring, emotional).
Such language traits, and their attribution to women and men, can have a concrete impact, including on individuals' careers. Madera et al. (2009) show that differences in the agentic and communal characteristics used in letters of recommendation to describe male and female candidates for academic positions can influence selection decisions. Such results are particularly important, as letters of recommendation continue to be heavily weighted and commonly used as selection tools. Along these lines, a stream of research analysing job postings shows that their language has a strong impact on the hiring process at all levels (Gaucher et al., 2011).
A quantitative assessment of (conscious and unconscious) gender biases in documents and communications can be of great help in raising awareness of these important issues. To build a language model able to spot the absence of gender neutrality and/or to highlight discrimination, two major approaches can be taken: semantic and pragmatic. The former is based on: a) looking for target words and exploiting the recognition of agentic-instrumental traits vs communal-expressive traits; and b) looking for keywords (e.g., man/woman, male/female) both in isolation and in compound phrases to detect the presence of non-neutral expressions (e.g., gentlemen’s agreement). The latter, more complex approach is to reason pragmatically, i.e., to exploit reference and co-reference associations. These may unveil unneeded overextensions of the masculine forms to generic referents.
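The semantic approach, steps a) and b), can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the tiny word lists below only reuse the examples quoted in this article, and the names `semantic_features`, `AGENTIC`, `COMMUNAL`, `GENDERED_KEYWORDS` and `GENDERED_COMPOUNDS` are hypothetical. A real lexicon would be curated by linguists and would be far larger.

```python
import re
from collections import Counter

# Illustrative lexicons only (assumption): they contain just the examples
# mentioned in the article, not a validated linguistic resource.
AGENTIC = {"active", "decisive"}
COMMUNAL = {"caring", "emotional"}
GENDERED_KEYWORDS = {"man", "woman", "male", "female"}
GENDERED_COMPOUNDS = ["gentlemen's agreement"]


def tokenize(text):
    """Lowercase the text, normalise typographic apostrophes, extract words."""
    text = text.lower().replace("’", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text)


def semantic_features(text):
    """Count trait words, gendered keywords and gendered compound phrases."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    # Re-join tokens so compounds are matched regardless of punctuation/spacing.
    normalised = " ".join(tokens)
    return {
        "agentic": sum(counts[w] for w in AGENTIC),
        "communal": sum(counts[w] for w in COMMUNAL),
        "gendered_keywords": sum(counts[w] for w in GENDERED_KEYWORDS),
        "gendered_compounds": sum(normalised.count(c) for c in GENDERED_COMPOUNDS),
        "n_tokens": len(tokens),
    }


if __name__ == "__main__":
    sample = "We need an active, decisive man; a gentlemen’s agreement was caring."
    print(semantic_features(sample))
```

Matching whole tokens rather than substrings keeps “woman” from being counted as “man”. The pragmatic approach is out of reach for such a sketch, since it requires resolving which referent each pronoun or noun phrase points to.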
Thanks to data analytics tools, one can in principle automate these linguistic analyses and assess the presence of gender stereotypes through text mining. Once the desired features are extracted from the data, and with deep context knowledge, one can develop specific scoring methods to compactly evaluate the company’s attitude towards gender equality and inclusion as a whole.
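As a purely illustrative sketch of what such a scoring step might look like, the Python snippet below turns per-document counts into two compact indicators and pools them over a corpus. The indicator names (`agentic_share`, `gendered_per_1000`), the input layout and the pooling rule are assumptions made for this example; they are not the scoring method developed in the project.

```python
def document_kpis(agentic, communal, gendered_terms, n_tokens):
    """Compute two hypothetical indicators from raw counts for one document."""
    trait_total = agentic + communal
    return {
        # 0.5 means agentic and communal words are equally frequent;
        # it is also used as the neutral value when no trait word occurs.
        "agentic_share": agentic / trait_total if trait_total else 0.5,
        # Gendered keywords and compounds per 1000 tokens.
        "gendered_per_1000": 1000 * gendered_terms / n_tokens if n_tokens else 0.0,
    }


def corpus_kpis(documents):
    """Pool counts over all documents, so that longer documents weigh more.

    `documents` is a list of (agentic, communal, gendered_terms, n_tokens)
    tuples, one per document.
    """
    totals = [sum(doc[i] for doc in documents) for i in range(4)]
    return document_kpis(*totals)


if __name__ == "__main__":
    docs = [(3, 1, 2, 400), (1, 3, 0, 600)]
    for doc in docs:
        print(document_kpis(*doc))
    print(corpus_kpis(docs))
```

Pooling counts before computing ratios is only one possible design choice; averaging per-document scores would instead give every document the same weight, whatever its length.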
A first attempt to build such a system has given rise to the GeNTLE (Gender Neutrality Tool for Language Exploration) project, in which a prototype tool was developed. This tool takes corporate documents as input and, thanks to an inner layer of linguistics-informed automatic text analysis, outputs scores along significant gender-related KPIs, together with the possibility of “deep diving” into the original documents to interpret and understand the aspects highlighted by the software analysis.
We believe that such an automatic analysis tool can help companies become aware of their true corporate culture on gender equality matters and of their written production, which is a powerful means of shaping the firm’s identity on these matters.
The tool can provide useful recommendations on how to manage any gender-related communication bias hidden in company documents, but it is intended neither to pass direct judgment on the resulting scores nor to automatically enact the mitigation actions needed. These activities must be performed by dedicated specialists, working jointly with the firm’s executives. This creates a virtuous synergy between technology-enabled analysis and the human-centric stage of discussion, analysis and reasoning, which is indispensable to initiate the cultural process that will eradicate the deepest roots of gender bias and stereotypes.
Tenenbaum, R. et al. (2003). “Parent–Child Conversations About Science: The Socialization of Gender Inequities?” Developmental Psychology.
Abele, A.E. (2003). “The Dynamics of Masculine-Agentic and Feminine-Communal Traits: Findings from a Prospective Study”. Journal of Personality and Social Psychology.
Madera, J.M. et al. (2009). “Gender and Letters of Recommendation for Academia: Agentic and Communal Differences”. Journal of Applied Psychology.
Gaucher, D., Friesen, J., & Kay, A.C. (2011). “Evidence That Gendered Wording in Job Advertisements Exists and Sustains Gender Inequality”. Journal of Personality and Social Psychology.
Ndobo, A. (2013). “Discourse and attitudes on occupational aspirations and the issue of gender equality: What are the effects of perceived gender asymmetry and prescribed gender role?” European Review of Applied Psychology.
Soon, W.M. et al. (2001). “A Machine Learning Approach to Co-reference Resolution of Noun Phrases”. MIT Press.