Individual differences in the associative meaning of a word through the lens of the language model and semantic differential
Many scholars recognize that word semantics varies between individuals. However, establishing and describing such differences is a complex scientific task that involves labor-intensive semantic annotation and is inevitably affected by the researcher's subjectivity. In this study, we propose a method for identifying individual differences in word semantics based on automatically calculated estimates of the associative meaning of a word on semantic differential scales. Using a word2vec distributional semantic model trained on a multimillion corpus of texts, together with the Concept Mover's Distance method, we obtained scores on 18 semantic differential scales for each associative series (only series with the same number of elements, 5 words, were selected). This method has been widely used in recent studies that treat text as data, mostly within computational social science; in our study it is applied for the first time to associative series in order to describe individual differences in the semantics of words. The study material is a specially designed dataset containing associative reactions to stimuli (high-frequency Russian words), data on the respondents' psychological characteristics (Big Five traits), and their emotional state at the time of testing. Using a set of methods for analyzing multidimensional data (principal component analysis, factor analysis, hierarchical clustering on principal components), we divided the stimulus words into groups according to the degree of individual differences in their semantics. We also established a connection between the respondents' psychological characteristics and the automatically calculated estimates of the associative meaning of the stimulus words on semantic differential scales.
The described technique can be used to obtain estimates of associative series (as well as of contexts of word use in texts) for any semantic opposition and is proposed as a complement to traditional methods of identifying the psychologically real meaning of a word. The dataset used in the study and the R code for reproducing the results are available to the wider research community.
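The idea of scoring an associative series on a semantic differential scale can be illustrated with a minimal Python sketch. It is not the authors' R code and it does not implement Concept Mover's Distance: it is a simplified stand-in that builds a scale direction as the difference between two pole vectors and averages the cosine similarity of each associate with that direction. The toy three-dimensional vectors, the example words, the "good"/"bad" scale, and the helper names `scale_direction` and `series_score` are all assumptions made for illustration; a real analysis would use trained word2vec vectors.

```python
import math

# Toy 3-dimensional vectors standing in for word2vec embeddings.
# Every word and number below is invented for illustration only.
EMBEDDINGS = {
    "good":    [0.9, 0.1, 0.0],
    "bad":     [-0.9, 0.1, 0.0],
    "sun":     [0.7, 0.5, 0.1],
    "holiday": [0.8, 0.2, 0.3],
    "friend":  [0.6, 0.3, 0.2],
    "warm":    [0.5, 0.6, 0.1],
    "sea":     [0.4, 0.7, 0.2],
    "war":     [-0.8, 0.3, 0.1],
    "pain":    [-0.7, 0.2, 0.3],
    "fear":    [-0.6, 0.4, 0.2],
    "cold":    [-0.4, 0.6, 0.1],
    "loss":    [-0.5, 0.3, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def scale_direction(positive_pole, negative_pole):
    """Hypothetical helper: scale direction as the difference of pole vectors."""
    pos = EMBEDDINGS[positive_pole]
    neg = EMBEDDINGS[negative_pole]
    return [p - n for p, n in zip(pos, neg)]


def series_score(series, direction):
    """Hypothetical helper: mean cosine similarity of the associates
    with the scale direction (positive values lean to the positive pole)."""
    sims = [cosine(EMBEDDINGS[word], direction) for word in series]
    return sum(sims) / len(sims)


good_bad = scale_direction("good", "bad")

# Two invented associative series of 5 words each, as in the study design.
series_a = ["sun", "holiday", "friend", "warm", "sea"]
series_b = ["war", "pain", "fear", "cold", "loss"]

score_a = series_score(series_a, good_bad)
score_b = series_score(series_b, good_bad)
print(f"series A: {score_a:.3f}, series B: {score_b:.3f}")
```

Repeating this for each of the 18 scales would give every associative series a vector of 18 scores, which could then be passed to multivariate methods such as principal component analysis.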
Litvinova, T. A. and Panicheva, P. V. (2024). Individual differences in the associative meaning of a word through the lens of the language model and semantic differential, Research Result. Theoretical and Applied Linguistics, 10 (1), 61-93. DOI: 10.18413/2313-8912-2024-10-1-0-5
The research was carried out at Voronezh State Pedagogical University with the support of the Russian Science Foundation, Grant No. 21-78-10148 “Modeling the Meaning of a Word in Individual Linguistic Consciousness Based on Distributive Semantics”.