2313-8912

Research Result. Theoretical and Applied Linguistics

10.18413/2313-8912-2026-12-1-0-4

4105

APPLIED LINGUISTICS

<strong>Automatic keyphrase extraction and annotation: modern theoretical approaches and practical solutions for text and speech</strong>

Guseva

Daria D.

daria.guseva@spbu.ru

Mitrofanova

Olga A.

o.mitrofanova@spbu.ru

Saint-Petersburg State University, Saint-Petersburg, Russia

2026

12100

The exponential growth of textual and audiovisual information has made the task of automatic keyphrase extraction (KE) increasingly significant. This article provides a comprehensive analysis of contemporary theoretical approaches and practical solutions for KE across both text and speech modalities. The primary contribution of this work is its systematic synthesis of these often-disparate research strands into a unified analytical framework, highlighting the evolution of the field from statistical methods towards large language models (LLMs) and end-to-end speech processing. We examine the stages of KE, the characteristics of keyphrases in written and spoken language, and terminological nuances. Various methods for automatic KE are discussed and analyzed in detail: statistical, hybrid, machine learning-based, and structural. The review dedicates substantial attention to emerging paradigms, including keyphrase generation using LLMs, and provides a detailed overview of methodologies and challenges in automatic corpus annotation. Furthermore, we specifically analyze current directions and inherent difficulties in KE for spoken language, comparing transcript-based and end-to-end acoustic approaches. This synthesis leads us to conclude that the field is moving towards a more integrated, context-aware paradigm. Future progress will depend on addressing key challenges such as data scarcity for low-resource languages, effective multimodal fusion, and the nuanced evaluation of generative KE systems.

Automatic keyphrase extraction; Spoken language processing; Speech summarization; Automatic annotation; Computational linguistics; Corpus linguistics

This research was supported by Saint-Petersburg State University, project No. 123042000068-8.

References

Abramov, E. G. (2011). Selection of keywords for a scientific article, Nauchnaya periodika: problemy i resheniya, 2, 35–40. (In Russian)

Abrosimov, K. I. and Mosyagina, A. G. (2022). Sodner for Russian nested named entity recognition, Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue”, Moscow, Russia, June 15–18, 2022, 1–7. (In English)

Antipina, E. S. and Prokhorenkova, S. A. (2020). Modeling of creative linguistic personality on the material of romances on A. S. Pushkin’s poem “Do not sing, beauty, in my presence…”, Prepodavatel XXI vek, 2 (2). (In Russian)

Augenstein, I., Das, M., Riedel, S., Vikraman, L., and McCallum, A. (2017). SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 546–555, Vancouver, Canada. Association for Computational Linguistics. (In English)

BN, S., Shing, H.-C., Xu, L., Strong, M., Burnsky, J., Ofor, J., Mason, J.R., Chen, S., Srinivasan, S., Shivade, C., Moriarty, J., Cohen, J.P. (2025). Fact-Controlled Diagnosis of Hallucinations in Medical Text Summarization, Proceedings Interspeech 2025, Rotterdam, The Netherlands, August 17–21, 2025, 3070–3074. (In English)

Bolshakova, E. I., Klyshinskiy, E. S., Lande, D. V., Noskov, A. A., Peskova, O. V. and Yagunova, E. V. (2011). Avtomaticheskaya obrabotka tekstov na estestvennom jazyke i kompyuternaya lingvistika [Automatic processing of texts in natural language and computational linguistics: textbook], MIEM, Moscow, Russia. (In Russian)

Boudin, F. and Aizawa, A. (2025). An Analysis of Datasets, Metrics and Models in Keyphrase Generation. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), 973–973, Vienna, Austria and virtual meeting. Association for Computational Linguistics. (In English)

Brezina, V., McEnery, T. and Wattam, S. (2015). Collocations in context: A new perspective on collocation networks, International Journal of Corpus Linguistics, 20 (2). (In English)

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). YAKE! Keyword Extraction from Single Documents using Multiple Local Features, Information Sciences, 509, 257–289. (In English)

Chen, P. I. and Lin, S. J. (2010). Automatic keyword prediction using Google similarity distance, Expert Systems with Applications, 37(3), 1928–1938. (In English)

Chen, W., Chan, H. P., Li, P. and King, I. (2020). Exclusive Hierarchical Decoding for Deep Keyphrase Generation, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, July 2020, 1095–1105. (In English)

Chen, Y.-N., Huang, Y., Lee, H.-Y. and Lee, L.-S. (2012). Unsupervised two-stage keyword extraction from spoken documents by topic coherence and support vector machine, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, March 25–30, 2012, 5041–5044. (In English)

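Devlin, J., Chang, M.-W., Lee, K. and Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics. (In English)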

Dostal, M. (2011). Automatic Keyphrase Extraction Based on NLP and Statistical Methods, Proceedings of the Dateso 2011: Annual International Workshop on Databases, Texts, Specifications and Objects, Pisek, Czech Republic, April 20, 2011, 140–145. (In English)

Dubinina, E. Yu. (2020). Extracting keywords of a scientific article text in the process of creating an automatic abstract, Vestnik VGU. Seriya: Filologiya. Zhurnalistika, 1, 26–28. (In Russian)

Evert, S. (2022). Measuring keyness, Digital Humanities 2022: Conference Abstracts, Tokyo, Japan, 25–29 July 2022, 202–205. (In English)

Freisinger, S., Seeberger, P., Ranzenberger, T., Bocklet, T. and Riedhammer, K. (2025). Towards Multi-Level Transcript Segmentation: LoRA Fine-Tuning for Table-of-Contents Generation, Proceedings Interspeech 2025, Rotterdam, The Netherlands, August 17–21, 2025, 276–280. (In English)

Gabrielatos, C. (2018). Keyness Analysis: nature, metrics and techniques, Corpus Approaches to Discourse: A critical review, Routledge, Oxford. (In English)

Gallina, Y., Boudin, F. and Daille, B. (2019). KPTimes: A Large-Scale Dataset for Keyphrase Generation on News Documents, Proceedings of the 12th International Conference on Natural Language Generation, Association for Computational Linguistics, Tokyo, Japan, 130–135. (In English)

Glazkova, A., Morozov, D. (2024). Exploring fine-tuned generative models for keyphrase selection: A case study for Russian. DAMDID-2024. https://doi.org/10.1007/978-3-032-03997-2_7 (In English)

Glazkova, A., Morozov, D., Garipov, T. (2025). Key algorithms for keyphrase generation: Instruction-based LLMs for Russian scientific keyphrases. Analysis of Images, Social Networks and Texts, Springer Nature Switzerland, Cham, 107–119. (In English)

Glazkova, A., Morozov, D., Mitrofanova, O., Savchuk, S. (2024). Generation of keywords for regional media texts using large language models [Generatsiya klyuchevyh slov dlja tekstov regionalnyh SMI s pomoshchyu bolshih yazykovyh modeley]. International scientific conference dedicated to the 20th anniversary of the Russian National Corpus, The V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences, Moscow, Russia, 37–39. (In English)

Gong, Z., Ai, L., Deshpande, H., Johnson, A., Phung, E., Wu, Z., Emami, A. and Hirschberg, J. (2025). Comparison-Based Automatic Evaluation for Meeting Summarization, Proceedings Interspeech 2025, Rotterdam, The Netherlands, August 17–21, 2025, 291–295. (In English)

Grineva, M. and Grinev, M. (2009). Analysis of text documents for extracting thematically grouped key terms, Trudy Instituta sistemnogo programmirovaniya RAN, 16, 155–165. (In Russian)

Grootendorst, M. (2020). KeyBERT: Minimal Keyword Extraction with BERT [Online], available at: http://doi.org/10.5281/zenodo.4461265 (Accessed 11.10.2025). (In English)

Grudeva, E. V. and Churilina, L. N. (2019). Retelling as a secondary text: linguistic and methodological potential, Magnitogorskiy gosudarstvennyy tekhnicheskiy universitet im. G.I. Nosova, Magnitogorsk, Russia. (In Russian)

Grudeva, E. V. and Gubushkina, A. A. (2020). Selection of keywords and oral retelling as secondary texts (based on the secondary speech activity of 6th grade students), Vestnik Cherepovetskogo gosudarstvennogo universiteta, 2 (95). (In Russian)

Gulyaev, O. V. and Lukashevich, N. V. (2013). Automatic classification of texts based on section heading, Novyye informatsionnyye tekhnologii v avtomatizirovannykh sistemakh, 16, 238–244. (In Russian)

Guseva, D., Mitrofanova, O. and Dolgushin, M. (2025). Human and Machine Keyphrase Perception in Russian Text and Speech, Speech and Computer: 26th International Conference, SPECOM 2024, Belgrade, Serbia, November 25–28, 2024, 265–280. (In English)

Hasan, K. and Ng, V. (2014). Automatic Keyphrase Extraction: A Survey of the State of the Art, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, June 23–25, 2014, Baltimore, Maryland, USA, 1262–1273. (In English)

Hulth, A. (2003). Improved Automatic Keyword Extraction Given More Linguistic Knowledge. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 216–223. (In English)

Jacquemin, C. and Bourigault, D. (2003). Term extraction and automatic indexing, Handbook of Computational Linguistics, Oxford University Press, 599–615. (In English)

Jones, K. S. (1972). A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation, 28 (1), 11–21. (In English)

Kamshilova, O. N. (2013). Small forms of scientific text: keywords and abstract (informational aspect), Izvestiya Rossiyskogo Gosudarstvennogo pedagogicheskogo universiteta im. A.I. Gertsena, 156, 106–117. (In Russian)

Kano, T., Ogawa, A., Delcroix, M., Fukuda, R., Chen, W., Watanabe, S. (2025). Pick and Summarize: Integrating Extractive and Abstractive Speech Summarization, Proceedings Interspeech 2025, Rotterdam, The Netherlands, August 17–21, 2025, 281–285. (In English)

Kilgarriff, A. (2009). Simple maths for keywords, Proceedings of the Corpus Linguistics 2009 Conference, Liverpool, UK, July 20–23, 2009, 1–6. (In English)

Kodzasov, S. V. and Krivnova, O. F. (2001). Obshchaya fonetika [General phonetics], Moscow, Russia. (In Russian)

Koloski, B., Pollak, S., Škrlj, B. and Martinc, M. (2021). Extending Neural Keyword Extraction with TF-IDF tagset matching, Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, April, 2021, 22–29. (In English)

Kotey, S., Dahyot, R. and Harte, N. (2023). Query Based Acoustic Summarization for Podcasts, Proc. Interspeech 2023, Dublin, Ireland, August 20–24, 2023, 1483–1487. (In English)

Krapivin, M., Autayeu, A., Marchese, M., Blanzieri, E., Segata, N. (2010). Keyphrases Extraction from Scientific Documents: Improving Machine Learning Approaches with Natural Language Processing. In: Chowdhury, G., Koo, C., Hunter, J. (eds) The Role of Digital Libraries in a Time of Global Change, ICADL 2010, Lecture Notes in Computer Science, vol. 6102, Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13654-2_12 (In English)

Krasavina, V. D. and Mirzagitova, A. R. (2015). Optimization of search in LeadScanner system using automatic extraction of keywords and word combinations, Trudy mezhdunarodnoy konferentsii “Korpusnaya lingvistika–2015”, St. Petersburg, Russia. (In Russian)

Kroll, M. and Kraus, K. (2024). Optimizing the role of human evaluation in LLM-based spoken document summarization systems, Proc. Interspeech 2024, Kos Island, Greece, September 1–5, 2024, 1935–1939. (In English)

Le-Duc, K., Nguyen, K.-N., Vo-Dang, L. and Hy, T.-S. (2024). Real-time Speech Summarization for Medical Conversations, Proc. Interspeech 2024, Kos Island, Greece, September 1–5, 2024, 1960–1964. (In English)

Lee, H.-y., Shiang, S.-R., Yeh, Ch.-F., Chen, Y.-N., Huang, Y., Kong, S.-y. and Lee, L.-S. (2014). Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22, 883–898. (In English)

Lee, W., Chun, M., Jeong, H., Jung, H. (2023). Toward keyword generation through large language models. Companion Proceedings of the 28th International Conference on Intelligent User Interfaces, 37–40. https://doi.org/10.1145/3581754.3584126 (In English)

Litvak, M. (2013). DegExt: A language-independent keyphrase extractor, Journal of Ambient Intelligence and Humanized Computing, 4, 377–387. (In English)

Luhn, H.P. (1957). A Statistical Approach to Mechanized Encoding and Searching of Literary Information, IBM Journal of Research and Development, 1 (4), 309–317. (In English)

Martínez‑Cruz, R., López‑López, A. J., and Portela, J. (2023). ChatGPT vs state‑of‑the‑art models: a benchmarking study in keyphrase generation task. arXiv preprint, arXiv:2304.14177. (In English)

Martínez-Cruz, R., López-López, A. J., Portela, J. (2025). ChatGPT vs state-of-the-art models: A benchmarking study in keyphrase generation task, Applied Intelligence, 55 (1), 50. (In English)

Marujo, L., Gershman, A., Carbonell, J., Frederking, R., and Neto, J. P. (2013). Supervised topical key phrase extraction of news stories using crowdsourcing, light filtering and co‑reference normalization. arXiv preprint, arXiv:1306.4886. (In English)

Maskey, S. and Hirschberg, J. (2005). Comparing lexical, acoustic/prosodic, structural and discourse features for speech summarization, Proc. Interspeech 2005, Lisbon, Portugal, September 4–8, 2005, 621–624. (In English)

Matsuura, K., Ashihara, T., Moriya, T., Mimura, M., Kano, T., Ogawa, A. and Delcroix, M. (2024). Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation, Proc. Interspeech 2024, Kos Island, Greece, September 1–5, 2024, 1945–1949. (In English)

Matsuura, K., Ashihara, T., Moriya, T., Tanaka, T., Kano, T., Ogawa, A. and Delcroix, M. (2023). Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization, Proc. Interspeech 2023, Dublin, Ireland, August 20–24, 2023, 2943–2947. (In English)

Meng, R., Zhao, S., Han, S., He, D., Brusilovsky, P. and Chi, Y. (2017). Deep Keyphrase Generation, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Vancouver, Canada, 582–592. (In English)

Mihalcea, R. and Tarau, P. (2004). TextRank: Bringing order into texts, Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), 404–411. (In English)

Mijić, J., Dalbelo Bašić, B., Šnajder, J. (2010). Robust Keyphrase Extraction for a Large-scale Croatian News Production System, in Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP-2010), 59–99. (In English)

Mitrofanova, O. A. and Gavrilik, D. A. (2022). Experiments on automatic extraction of key expressions in stylistically diverse corpora of Russian texts, Terra Linguistica, 13 (4), 22–40. (In Russian)

Morozov, D. A., Glazkova, A. V., Tyutyul’nikov, M. A. and Iomdin, B. L. (2023). Generation of keywords for annotations of Russian scientific articles, Vestnik NSU. Seriya: Lingvistika i mezhkul’turnaya kommunikatsiya, 1. (In Russian)

Moskvina, A. D., Erofeeva, A. R., Mitrofanova, O. A. and Kharabet, Ya. K. (2017). Automatic extraction of keywords and word combinations from Russian corpus of texts using RAKE algorithm, Trudy Mezhdunarodnoy konferentsii “Korpusnaya lingvistika–2017” (Sankt-Peterburg, 27–30 iyunya 2017 g.), Izd-vo SPbGU, Russia, 268–275. (In Russian)

Moskvina, A., Sokolova, E. and Mitrofanova, O. (2018). KeyPhrase extraction from the Russian corpus on Linguistics by means of KEA and RAKE algorithm, Data Analytics and Management in Data Intensive Domains: XX International Conference DAMDID/RCDL’2018: Conference Proceedings, Moscow, Russia, October 9–12, 2018, 369–372. (In English)

Moskvitina, T. N. (2009). Keywords and their functions in scientific text, Vestnik ChGPU, 11, 270–283. (In Russian)

Moskvitina, T. N. (2018). Methods for extracting keywords when abstracting a scientific text, Vestnik Tomskogo gosudarstvennogo universiteta, 8, 45–50. (In Russian)

Nguyen, T. D., Kan, M-Y. (2007). Keyphrase Extraction in Scientific Publications. In: Goh, D. H-L., Cao, T. H., Sølvberg, I. T., Rasmussen, E. (eds) Asian Digital Libraries. Looking Back 10 Years and Forging New Frontiers, ICADL 2007, Lecture Notes in Computer Science, vol. 4822, Springer, Berlin, Heidelberg, 317–326. https://doi.org/10.1007/978-3-540-77094-7_41 (In English)

Papusha, I. S. (2008). Complex syntactic whole: keywords or herms, Vestnik Assotsiatsii VUZov turizma i servisa, 3, 48–54. (In Russian)

Piotrovskiy, R. G., Bektaev, K. B. and Piotrovskaya, A. A. (1977). Matematicheskaya lingvistika: ucheb. posobiye dlya ped. institutov [Mathematical linguistics: textbook for pedagogical institutes], Vysshaya shkola, Moscow, Russia. (In Russian)

Popova, S. V. and Danilova, V. V. (2014). Representation of documents in the task of clustering scientific text annotations, Nauchno-tekhnicheskiy vestnik informatsionnykh tekhnologiy, mekhaniki i optiki, 1 (89), 99–107. (In Russian)

Riedhammer, K., Favre, B. and Hakkani-Tur, D. (2010). Long story short–global unsupervised models for keyphrase based meeting summarization, Speech Communication, 52 (10), 801–815. (In English)

Rose, S. J., Cowley, W. E., Crow, V. L. and Cramer, N. O. (2009). Rapid Automatic Keyword Extraction for Information Retrieval and Analysis, Text Mining: Applications and Theory, 1–20. (In English)

Ryu, S., Do, H., Kim, Y., Lee, G.G. and Ok, J. (2024). Key-Element-Informed sLLM Tuning for Document Summarization, Proc. Interspeech 2024, Kos Island, Greece, September 1–5, 2024, 1940–1944. (In English)

Sakharnyy, L. V. (1982). Actual division and text compression (on the use of informatics methods in psycholinguistics), Teoreticheskiye aspekty derivatsii, Perm, Russia. (In Russian)

Sakharnyy, L. V. and Shtern, A. S. (1988). Selection of keywords as a type of text, Leksicheskiye aspekty v sisteme professional’no-orientirovannogo obucheniya inoyazychnoy rechevoy deyatel’nosti, Perm, Russia, 34–51. (In Russian)

Shang, H., Li, Z., Guo, J., Li, S., Rao, Z., Luo, Y., Wei, D. and Yang, H. (2024). An End-to-End Speech Summarization Using Large Language Model, Proc. Interspeech 2024, Kos Island, Greece, September 1–5, 2024, 1950–1954. (In English)

Shao, L., Zhang, L., Peng, M., Ma, G., Yue, H., Sun, M., and Su, J. (2024). One2set+ large language model: Best partners for keyphrase generation, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 11140–11153. (In English)

Shekhtman, N. A. (2005). Ponimaniye rechevogo proizvedeniya i gipertekst [Understanding of a speech work and hypertext], Izd-vo OGPU, Orenburg, Russia. (In Russian)

Sheremetyeva, S. O. and Osminin, P. G. (2015). Methods and models of automatic keyword extraction, Vestnik YUrGU. Seriya “Lingvistika”, 12 (1), 76–81. (In Russian)

Sokolova, E. V. and Mitrofanova, O. A. (2018). Automatic extraction of keywords and word combinations from Russian texts using KEA algorithm, Kompyuternaya lingvistika i vychislitel’nyye ontologii, 157–165. (In Russian)

Song, M., Geng, X., Yao, S., Lu, S., Feng, Y., Jing, L. (2023). Large language models as zero-shot keyphrase extractor: A preliminary empirical study. arXiv preprint arXiv:2312.15156 (In English)

Song, M., Jiang, H., Shi, S., Yao, S., Lu, S., Feng, Y., Liu, H., and Jing, L. (2023). Is ChatGPT a good keyphrase generator? A preliminary study. arXiv preprint, arXiv:2303.13001. (In English)

Sterckx, L., Demeester, T., Deleu, J., et al. (2018). Creation and evaluation of large keyphrase extraction collections with multiple opinions. Language Resources & Evaluation, 52, 503–532. (In English)

Su, J., Zhang, L., Hassanzadeh, H. R. and Schaaf, T. (2022). Extract and Abstract with BART for Clinical Notes from Doctor-Patient Conversations, Proc. Interspeech 2022, Incheon, Korea, September 18–22, 2022, 2488–2492. (In English)

Svetozarova, N. D. and Shtern, A. S. (1989). Key and phonematically highlighted words of the text, Eksperimentalnaya fonetika, Moscow, Russia, 157–170. (In Russian)

Troshina, A. (2025). Text Preprocessing for Keyword and Key Phrase Extraction. In: Bakaev, M., et al. Internet and Modern Society. Human-Computer Communication. IMS 2024. Communications in Computer and Information Science, vol 2534. Springer, Cham, 105–112. https://doi.org/10.1007/978-3-031-96177-9_9 (In English)

Tsarfaty, R., Seddah, D., Kübler, S. and Nivre, J. (2013). Parsing Morphologically Rich Languages: Introduction to the Special Issue, Computational Linguistics, 39 (1), 15–22. (In English)

Umair, M., Sultana, T., Lee, Y. K. (2024). Pre-trained language models for keyphrase prediction: A review. ICT Express, 10 (4), 871–890. https://doi.org/10.1016/j.icte.2024.05.015 (In English)

Vanyushkin, A. S. and Grashchenko, L. A. (2016). Methods and algorithms for extracting keywords, Novyye informatsionnyye tekhnologii v avtomatizirovannykh sistemakh, 19, 85–93. (In Russian)

Vanyushkin, A. S. and Grashchenko, L. A. (2017). Evaluation of keyword extraction algorithms: tools and resources, Novyye informatsionnyye tekhnologii v avtomatizirovannykh sistemakh, 20. (In Russian)

Vanyushkin, A. S. and Grashchenko, L. A. (2018). On the marking of text corpora with keywords, Novyye informatsionnyye tekhnologii v avtomatizirovannykh sistemakh, 21, 207–211. (In Russian)

Vartakavi, A. and Garg, A. (2020). Podsumm: Podcast audio summarization [Online], available at: https://arxiv.org/pdf/2009.10315 (Accessed 3.11.2025). (In English)

Vasileva, V. V. and Kon’kov, V. I. (2015). Ustnaya rech: praktikum [Spoken speech: workshop], S.-Peterb. gos. un-t, St. Petersburg, Russia. (In Russian)

Vinogradova, N. V. and Ivanov, V. K. (2016). Modern methods of automated extraction of keywords from text, Informatsionnyye resursy Rossii, 4, 13–18. (In Russian)

Wan, X. and Xiao, J. (2008). Single document keyphrase extraction using neighborhood knowledge. In Proceedings of the 23rd National Conference on Artificial Intelligence, volume 2, AAAI Press, 855–860. (In English)

Wang, J. (2022). ESSumm: Extractive Speech Summarization from Untranscribed Meeting, Proc. Interspeech 2022, Incheon, Korea, September 18–22, 2022, 3243–3247. (In English)

Wang, S., Dai, S., and Jiang, J. (2024). Thinking like an author: A zero‑shot learning approach to key phrase generation with large language model. In A. Bifet, J. Davis, T. Krilavičius, M. Kull, E. Ntoutsi, and I. Žliobaitė (Eds.), Machine Learning and Knowledge Discovery in Databases. Research Track. Springer Nature Switzerland, Cham, 335–350. (In English)

Wienecke, Y. (2020). Automatic Keyphrase Extraction from Russian-Language Scholarly Papers in Computational Linguistics, University Honors Theses, Portland State University. (In English)

Wilson, A. (2013). Embracing Bayes factors for key item analysis in corpus linguistics, New approaches to the study of linguistic variability. Language Competence and Language Awareness in Europe, 4, 3–11. (In English)

Xiong, L., Hu, C., Xiong, C., Campos, D. and Overwijk, A. (2019). Open Domain Web Keyphrase Extraction Beyond Language Modeling. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 5175–5184. (In English)

Yagunova, E. V. (2004). The role of keywords in the perception of spoken and written text (based on the Russian language), Chelovek pishushchiy i chitayushchiy: problemy i nablyudeniya: Materialy mezhdunarodnoy konferentsii 14–16 marta 2002 g., Sankt-Peterburg, Izd-vo SPbGU, Russia, 197–204. (In Russian)

Yagunova, E. V. (2010). Experiment and calculations in the analysis of keywords of a fictional text, Filosofiya yazyka. Lingvistika. Lingvodidaktika, 1, 83–89. (In Russian)

Zakharov, V. P. and Khokhlova, M. V. (2014). Extraction of terminological phrases from special texts based on various association measures, XVII Vserossiyskaya obyedinennaya konferentsiya “Internet i sovremennoye obshchestvo” (IMS-2014), St. Petersburg, Russia. (In Russian)

Zhang, C., Wang, H., Liu, Y., Wu, D., Liao, Y. and Wang, B. (2008). Automatic keyword extraction from documents using conditional random fields, Journal of Computational Information Systems, 4(3), 1169–1180. (In English)