<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2313-8912</journal-id><journal-title-group><journal-title>Research Result. Theoretical and Applied Linguistics</journal-title></journal-title-group><issn pub-type="epub">2313-8912</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2313-8912-2025-11-4-0-1</article-id><article-id pub-id-type="publisher-id">3979</article-id><article-categories><subj-group subj-group-type="heading"><subject>EDITORIAL</subject></subj-group></article-categories><title-group><article-title>Revealing Cultural Meaning with Trilingual Embeddings: A New Audit of LLM Multilingual Behavior</article-title><trans-title-group xml:lang="en"><trans-title>Revealing Cultural Meaning with Trilingual Embeddings: A New Audit of LLM Multilingual Behavior</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Litvinova</surname><given-names>Tatiana A.</given-names></name><name xml:lang="en"><surname>Litvinova</surname><given-names>Tatiana A.</given-names></name></name-alternatives><email>centr_rus_yaz@mail.ru</email><xref ref-type="aff" rid="aff1" /></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Dekhnich</surname><given-names>Olga V.</given-names></name><name xml:lang="en"><surname>Dekhnich</surname><given-names>Olga V.</given-names></name></name-alternatives><email>dekhnich@bsu.edu.ru</email><xref ref-type="aff" rid="aff2" /></contrib></contrib-group><aff id="aff2"><institution>Belgorod State National Research University, Russia</institution></aff><aff id="aff1"><institution>Voronezh State Pedagogical University, Russia</institution></aff>
<pub-date pub-type="epub"><year>2025</year></pub-date><volume>11</volume><issue>4</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/linguistics/2025/4/Лингвистика_411-3-22.pdf" /><abstract xml:lang="ru"><p>Large Language Models (LLMs) are increasingly regarded as authoritative mediators of multilingual meaning; however, their ability to preserve culturally grounded lexical distinctions remains uncertain. This issue is especially critical for the core lexicon – high-frequency, culturally salient words that constitute the conceptual foundation of linguistic cognition within a community. If these foundational meanings are distorted, the resulting semantic shifts can propagate through downstream tasks, interpretations, and educational applications. Despite this risk, robust methods for evaluating LLM fidelity to culturally embedded lexical semantics remain largely undeveloped. This editorial introduces a novel diagnostic approach based on trilingual aligned word embeddings for Russian, Lingala, and French. By aligning embeddings into a shared distributional space, we obtain an independent semantic reference that preserves the internal structure of each language. French serves as a high-resource pivot, enabling comparisons without forcing the low-resource language into direct competition with English or Russian embedding geometries.

We examine several culturally central lexical items – including kinship and evaluative terms – to illustrate how an aligned manifold can reveal potential points of semantic tension between LLM outputs and corpus-grounded meanings. While our case studies do not claim to expose fully systematic biases, they demonstrate how the proposed framework can uncover subtle discrepancies in meaning representation and guide a more comprehensive investigation.

We argue that embedding-based diagnostics provide a promising foundation for auditing the behavior of multilingual LLMs, particularly for low-resource languages whose semantic categories risk being subsumed under English-centric abstractions. This work outlines a research trajectory rather than a completed map and calls for deeper, community-centered efforts to safeguard linguistic and cultural specificity in the age of generative AI.
</p></abstract><trans-abstract xml:lang="en"><p>Large Language Models (LLMs) are increasingly regarded as authoritative mediators of multilingual meaning; however, their ability to preserve culturally grounded lexical distinctions remains uncertain. This issue is especially critical for the core lexicon – high-frequency, culturally salient words that constitute the conceptual foundation of linguistic cognition within a community. If these foundational meanings are distorted, the resulting semantic shifts can propagate through downstream tasks, interpretations, and educational applications. Despite this risk, robust methods for evaluating LLM fidelity to culturally embedded lexical semantics remain largely undeveloped. This editorial introduces a novel diagnostic approach based on trilingual aligned word embeddings for Russian, Lingala, and French. By aligning embeddings into a shared distributional space, we obtain an independent semantic reference that preserves the internal structure of each language. French serves as a high-resource pivot, enabling comparisons without forcing the low-resource language into direct competition with English or Russian embedding geometries.

We examine several culturally central lexical items – including kinship and evaluative terms – to illustrate how an aligned manifold can reveal potential points of semantic tension between LLM outputs and corpus-grounded meanings. While our case studies do not claim to expose fully systematic biases, they demonstrate how the proposed framework can uncover subtle discrepancies in meaning representation and guide a more comprehensive investigation.

We argue that embedding-based diagnostics provide a promising foundation for auditing the behavior of multilingual LLMs, particularly for low-resource languages whose semantic categories risk being subsumed under English-centric abstractions. This work outlines a research trajectory rather than a completed map and calls for deeper, community-centered efforts to safeguard linguistic and cultural specificity in the age of generative AI.
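
The alignment into a shared space described above is typically computed as an orthogonal Procrustes mapping onto the pivot language (cf. Xing et al., 2015; Artetxe et al., 2018). The following minimal sketch uses synthetic vectors as stand-ins for real Russian/Lingala/French embeddings; all data and dimensions here are illustrative assumptions, not the editorial's actual setup:

```python
import numpy as np

# Sketch of orthogonal Procrustes alignment of one language's word vectors
# into a pivot language's space, given a small seed dictionary of word pairs.
# Synthetic data only: "fr" plays the French pivot, "ln" the low-resource side.
rng = np.random.default_rng(0)
dim, n_pairs = 50, 200

# Unit-normalized pivot vectors for the seed dictionary.
fr = rng.standard_normal((n_pairs, dim))
fr /= np.linalg.norm(fr, axis=1, keepdims=True)

# Simulate the other language as a rotated copy of the pivot plus small noise.
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # random orthogonal matrix
ln = fr @ q.T + 0.01 * rng.standard_normal((n_pairs, dim))

# Procrustes solution: the orthogonal map minimizing the Frobenius norm of
# (ln @ omega - fr) is omega = u @ vt, where svd(ln.T @ fr) = u, s, vt.
u, _, vt = np.linalg.svd(ln.T @ fr)
w = (u @ vt).T          # store as a mapping matrix; w is orthogonal
aligned = ln @ w.T      # low-resource vectors expressed in the pivot space

# Mean cosine between aligned vectors and their pivot counterparts
# should be close to 1.0 when the alignment succeeds.
cos = np.sum(aligned * fr, axis=1) / (
    np.linalg.norm(aligned, axis=1) * np.linalg.norm(fr, axis=1)
)
print(round(float(cos.mean()), 3))
```

Auditing an LLM would then amount to embedding its output for a target word and measuring its distance to the corpus-grounded vector of that word in this shared space; large distances flag candidate points of semantic tension.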
</p></trans-abstract><kwd-group xml:lang="ru"><kwd>Large Language Models</kwd><kwd>Trilingual Embeddings</kwd><kwd>Cultural Semantics</kwd><kwd>Low-Resource Languages</kwd><kwd>Multilingual NLP</kwd><kwd>Semantic Drift</kwd><kwd>Cross-Lingual Alignment</kwd><kwd>Linguistic Cognition</kwd><kwd>Multilingual AI Audit</kwd><kwd>Distributional Semantics</kwd></kwd-group><kwd-group xml:lang="en"><kwd>Large Language Models</kwd><kwd>Trilingual Embeddings</kwd><kwd>Cultural Semantics</kwd><kwd>Low-Resource Languages</kwd><kwd>Multilingual NLP</kwd><kwd>Semantic Drift</kwd><kwd>Cross-Lingual Alignment</kwd><kwd>Linguistic Cognition</kwd><kwd>Multilingual AI Audit</kwd><kwd>Distributional Semantics</kwd></kwd-group></article-meta></front><back><ack><p>The research of Tatiana A. Litvinova was supported by the Ministry of Education of the Russian Federation within the framework of the state assignment in the field of science (topic number QRPK-2025-0013). Olga V. Dekhnich received no financial support for the research, authorship, and publication of this article.</p></ack><ref-list><title>References</title>
<ref id="B1"><mixed-citation>Artetxe, M., Labaka, G. and Agirre, E. (2018). A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL), 789–798. https://doi.org/10.18653/v1/P18-1073 (In English).</mixed-citation></ref>
<ref id="B2"><mixed-citation>Bird, S. (2020). Decolonising speech and language technology, in Proceedings of the 28th International Conference on Computational Linguistics, 3504–3519, Barcelona, Spain (Online). International Committee on Computational Linguistics. https://doi.org/10.18653/v1/2020.coling-main.313 (In English).</mixed-citation></ref>
<ref id="B3"><mixed-citation>Blasi, D. E., Anastasopoulos, A. and Neubig, G. (2022). Systematic inequalities in language technology performance across the world’s languages, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, 5486–5505, May 22-27, 2022. https://doi.org/10.18653/v1/2022.acl-long.376 (In English).</mixed-citation></ref>
<ref id="B4"><mixed-citation>Goddard, C. and Wierzbicka, A. (2014). Words and meanings: Lexical semantics across domains, languages, and cultures, Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199668434.001.0001 (In English).</mixed-citation></ref>
<ref id="B5"><mixed-citation>Guo, Y., Conia, S., Zhou, Z., Li, M., Potdar, S. and Xiao, H. (2024). Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs, Annual Meeting of the Association for Computational Linguistics. https://doi.org/10.48550/arXiv.2410.15956 (In English).</mixed-citation></ref>
<ref id="B6"><mixed-citation>Joshi, P., Santy, S., Budhiraja, A., Bali, K. and Choudhury, M. (2020). The state and fate of linguistic diversity in the NLP world, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online. Association for Computational Linguistics, 6282–6293. https://doi.org/10.18653/v1/2020.acl-main.560 (In English).</mixed-citation></ref>
<ref id="B7"><mixed-citation>Li, C., Chen, M., Wang, J., Sitaram, S. and Xie, X. (2024). CultureLLM: incorporating cultural differences into large language models, in Proceedings of the 38th International Conference on Neural Information Processing Systems (NIPS '24), Vol. 37, Curran Associates Inc., Red Hook, NY, USA, Article 2693, 84799–84838. https://doi.org/10.52202/079017-2693 (In English).</mixed-citation></ref>
<ref id="B8"><mixed-citation>Litvinova, T. A., Mikros, G. K. and Dekhnich, O. V. (2024). Writing in the era of large language models: a bibliometric analysis of research field, Research Result. Theoretical and Applied Linguistics, 10 (4), 5–16. https://doi.org/10.18413/2313-8912-2024-10-4-0-1 (In English).</mixed-citation></ref>
<ref id="B9"><mixed-citation>Liu, H., Cao, Y., Wu, X., Qiu, C., Gu, J. et al. (2025). Towards realistic evaluation of cultural value alignment in large language models: Diversity enhancement for survey response simulation, Information Processing and Management, 62 (4). https://doi.org/10.1016/j.ipm.2025.104099 (In English).</mixed-citation></ref>
<ref id="B10"><mixed-citation>Malt, B. C. and Majid, A. (2013). How thought is mapped into words, Wiley Interdisciplinary Reviews: Cognitive Science, 4 (6), 583–597. https://doi.org/10.1002/wcs.1251 (In English).</mixed-citation></ref>
<ref id="B11"><mixed-citation>Masoud, R., Liu, Z., Ferianc, M., Treleaven, P. C. and Rodrigues, M. (2025). Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede’s Cultural Dimensions, in Proceedings of the 31st International Conference on Computational Linguistics, 8474–8503, Abu Dhabi, UAE, Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.567/ (In English).</mixed-citation></ref>
<ref id="B12"><mixed-citation>Mikolov, T., Chen, K., Corrado, G. and Dean, J. (2013). Efficient estimation of word representations in vector space, in 1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings. https://doi.org/10.48550/arXiv.1301.3781 (In English).</mixed-citation></ref>
<ref id="B13"><mixed-citation>Mirko, F. and Lavazza, A. (2025). English in LLMs: The Role of AI in Avoiding Cultural Homogenization, in Philipp Hacker (ed.), Oxford Intersections: AI in Society (Oxford, online edn, Oxford Academic, 20 Mar. 2025). https://doi.org/10.1093/9780198945215.003.0140 (In English).</mixed-citation></ref>
<ref id="B14"><mixed-citation>Pistilli, G., Leidinger, A., Jernite, Y., Kasirzadeh, A., Luccioni, A. S. and Mitchell, M. (2024). CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models, Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7 (1), 1132–1144. https://doi.org/10.1609/aies.v7i1.31710 (In English).</mixed-citation></ref>
<ref id="B15"><mixed-citation>Qin, L., Chen, Q., Zhou, Y., Chen, Z., Li, Y., Liao, L., Li, M., Che, W. and Yu, P. S. (2025). A survey of multilingual large language models, Patterns, 6 (1), 101118. https://doi.org/10.1016/j.patter.2024.101118 (In English).</mixed-citation></ref>
<ref id="B16"><mixed-citation>Ruder, S., Vulić, I. and Søgaard, A. (2019). A survey of cross-lingual word embedding models, Journal of Artificial Intelligence Research, 65, 569–631. https://doi.org/10.1613/jair.1.11640 (In English).</mixed-citation></ref>
<ref id="B17"><mixed-citation>Wendler, C., Veselovsky, V., Monea, G. and West, R. (2024). Do Llamas Work in English? On the Latent Language of Multilingual Transformers, in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 15366–15394, Bangkok, Thailand. Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.820 (In English).</mixed-citation></ref>
<ref id="B18"><mixed-citation>Wierzbicka, A. (1996). Semantics: Primes and Universals, Oxford University Press, UK. (In English).</mixed-citation></ref>
<ref id="B19"><mixed-citation>Xing, C., Wang, D., Liu, C. and Lin, Y. (2015). Normalized word embedding and orthogonal transform for bilingual word translation, in Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1006–1011, Denver, Colorado. Association for Computational Linguistics. https://doi.org/10.3115/v1/N15-1104 (In English).</mixed-citation></ref>
</ref-list></back></article>