Qualitative and quantitative specificity of corpus representation of part-of-speech categoriality in Russian and Chinese
Relevance. This article analyzes corpus-based representations of parts of speech in Russian and Chinese as an objective and relevant resource for linguistic research. Generalizing a wide range of data and metadata is relevant both for understanding the theoretical foundations of language description and for practice-oriented use in applied interdisciplinary research, where it can be implemented productively and developed further. The scientific significance of the work lies in the fact that a corpus-verified comparison of the part-of-speech systems of typologically distant languages makes it possible not only to clarify the statistical parameters of their functioning, but also to address the linguistic validity of debated categories: the predicative in Russian and the differentiating word in Chinese.
Problems. Refining linguistic categories is equally relevant for Russian and Chinese, as evidenced by the modernization of the part-of-speech systems developed for their corpus representation. Such data make it possible to interpret part of speech as a feature that identifies and qualifies metalinguistic practice.
Methods. Data obtained through the software tools integrated into linguistic corpora make it possible to overcome the commonplace and self-reflective character of traditional linguistic descriptions, using material from a fundamentally stable system of parts of speech. The study employed a comprehensive methodological approach combining qualitative and quantitative analysis with corpus-based techniques.
Results. The feasibility of treating part-of-speech categoriality metalinguistically, as a linguistic universal, is confirmed. Using data from the Russian National Corpus and the Online Corpus of Chinese, the article examines the qualitative and quantitative features of how linguistic units function depending on their part-of-speech affiliation.
Conclusions. The corpus data on how stereotypically the units of a particular part-of-speech cluster function in real speech in typologically distinct languages convincingly demonstrate the feasibility of treating part-of-speech classification as a metalinguistic tool. In turn, the development of part-of-speech systems in Russian and Chinese is no less pressing linguistically, as evidenced by the need to identify new part-of-speech categories and by their actual incorporation into corpus annotation. In this regard, it is significant that the linguistic validity of such categories as the predicative in Russian and the differentiating word in Chinese was confirmed on the basis of corpus representation. The knowledge gained relates primarily to the high-frequency segment of the vocabulary of both languages. The results can be applied in corpus annotation, comparative grammar, and the teaching of Russian and Chinese as foreign languages.

















