RT Report
T1 Gender Distribution across Topics in the Top 5 Economics Journals: A Machine Learning Approach
A1 Conde Ruiz, José Ignacio
A1 Ganuza, Juan-José
A1 García, Manu
A1 Puch, Luis A.
AB We analyze all the articles published in the top five (T5) Economics journals between 2002 and 2019 in order to find gender differences in their research approach. We implement an unsupervised machine learning algorithm, the Structural Topic Model (STM), so as to incorporate gender document-level metadata into a probabilistic text model. This algorithm jointly characterizes the set of latent topics that best fits our data (the set of abstracts) and how the documents/abstracts are allocated to each latent topic. Latent topics are mixtures over words, where each word has a probability of belonging to a topic after controlling for journal name and publication year (the metadata). Thus, the topics may capture research fields but also other, more subtle characteristics related to the way in which the articles are written. We find that females are unevenly distributed across the estimated latent topics, using only data-driven methods. This finding relies on “automatically” generated built-in data given the contents in the abstracts of the articles in the T5 journals, without any arbitrary allocation of texts to particular categories (such as JEL codes or research areas).
SN 2341-2356
YR 2021
FD 2021-07
LK https://hdl.handle.net/20.500.14352/5765
UL https://hdl.handle.net/20.500.14352/5765
LA eng
NO Bagues, Manuel and Pamela Campa, “Can Gender Quotas in Candidate Lists Empower Women? Evidence from a Regression Discontinuity Design,” 2017, (12149).
Bayer, Amanda and Cecilia E. Rouse, “Diversity in the Economics Profession: A New Attack on an Old Problem,” Journal of Economic Perspectives, November 2016, 30 (4), 221–42.
Beneito, P., J. E. Boscá, J. Ferri, and M.
García, “Women across Subfields in Economics: Relative Performance and Beliefs,” Fedea WP, June 2018, (2018-06).
Blei, David M., Andrew Y. Ng, and Michael I. Jordan, “Latent Dirichlet Allocation,” J. Mach. Learn. Res., March 2003, 3, 993–1022.
Boustan, Leah and Andrew Langan, “Variation in Women’s Success across PhD Programs in Economics,” Journal of Economic Perspectives, February 2019, 33 (1), 23–42.
Buckley, Chris, “Implementation of the SMART Information Retrieval System,” Technical Report, USA 1985.
Cabrales, A., M. García, and L. A. Puch, “Gendered Language in the British Press,” Mimeo COSME Gender, at 2018 Meetings of the Spanish Economic Association, 2018.
Card, David and Stefano DellaVigna, “Nine Facts about Top Journals in Economics,” Journal of Economic Literature, March 2013, 51 (1), 144–61.
Card, David, Stefano DellaVigna, Patricia Funk, and Nagore Iriberri, “Are Referees and Editors in Economics Gender Neutral?,” The Quarterly Journal of Economics, November 2019, 135 (1), 269–327.
Chari, Anusha and Paul Goldsmith-Pinkham, “Gender Representation in Economics Across Topics and Time: Evidence from the NBER Summer Institute,” Working Paper 23953, National Bureau of Economic Research October 2017.
Chevalier, Judy, “The 2020 Report of the Committee on the Status of Women in the Economics Profession,” 2020.
Conde-Ruiz, J. Ignacio, Juan-José Ganuza, and Paola Profeta, “Statistical Discrimination and the Efficiency of Quotas,” Fedea Working Papers, 2017.
Conde-Ruiz, J.
Ignacio, Juan-José Ganuza, and Paola Profeta, “Statistical Discrimination and Committees,” Fedea Working Papers, February 2021, (2021-06).
Dolado, Juan, Florentino Felgueroso, and Miguel Almunia, “Are men and women-economists evenly distributed across research fields? Some new empirical evidence,” SERIEs: Journal of the Spanish Economic Association, September 2012, 3 (3), 367–393.
Hansen, Stephen, Michael McMahon, and Andrea Prat, “Transparency and Deliberation Within the FOMC: A Computational Linguistics Approach,” The Quarterly Journal of Economics, October 2017, 133 (2), 801–870.
Heckman, James J. and Sidharth Moktan, “Publishing and Promotion in Economics: The Tyranny of the Top Five,” Journal of Economic Literature, June 2020, 58 (2), 419–70.
Hengel, E., “Publishing while Female. Are women held to higher standards? Evidence from peer review,” Cambridge Working Papers in Economics 1753, Faculty of Economics, University of Cambridge December 2020.
Hengel, Erin and Eunyoung Moon, “Gender and quality at top economics journals,” Working Papers 202001, University of Liverpool, Dept. of Economics February 2020.
Lundberg, Shelly and Jenna Stearns, “Women in Economics: Stalled Progress,” Journal of Economic Perspectives, February 2019, 33 (1), 3–22.
Mimno, David, Hanna Wallach, Edmund Talley, Miriam Leenders, and Andrew McCallum, “Optimizing Semantic Coherence in Topic Models,” 2011, pp. 262–272.
Roberts, Margaret E., Brandon M. Stewart, and Dustin Tingley, “stm: An R Package for Structural Topic Models,” Journal of Statistical Software, 2019, 91 (2), 1–40.
Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand, “Structural Topic Models for Open-Ended Survey Responses,” American Journal of Political Science, 2014, 58 (4), 1064–1082.
Siniscalchi, Marciano and Pietro Veronesi, “Self-image Bias and Lost Talent,” December 2020, (28308).
Tang, Cong, Keith Ross, Nitesh Saxena, and Ruichuan Chen, “What’s in a Name: A Study of Names, Gender Inference and Gender Behavior in Facebook,” in “Xu J., Yu G., Zhou S., Unland R. (eds) Database Systems for Advanced Applications, Lecture Notes in Computer Science, vol 6637,” Springer Berlin Heidelberg, 2011, pp. 344–356.
NO We thank Antonio Cabrales, Pedro Delicado and Nagore Iriberri for helpful comments, and Elvira Alonso for excellent research assistance. We also thank the Editor and two anonymous referees for their suggestions, as well as session participants at the Computing in Economics & Finance Conference, Tokyo (virtual) 2021. José Ignacio Conde-Ruiz, and Manu García and Luis Puch, acknowledge financial support from the Spanish Ministry of Science and Innovation through projects PID2019-105499GB-I00 and PID2019-107161GB-C32, respectively. Juan-José Ganuza gratefully acknowledges financial support from the Spanish Agencia Estatal de Investigación, through the Severo Ochoa Programme for Centres of Excellence in R&D (CEX2019-000915-S), and from the Spanish Ministry of Education and Science through project ECO2017-89240-P. Corresponding author: Juan-José Ganuza, Universitat Pompeu Fabra, Ramon Trias Fargas 27, 08005, Spain; E-mail: juanjo.ganuza@gmail.com
NO Ministerio de Ciencia e Innovación (MICINN)
NO Centro de Excelencia Severo Ochoa
NO Ministerio de Educación
DS Docta Complutense
RD 3 May 2024