<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.2" xml:lang="ru" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><front><journal-meta><journal-id journal-id-type="issn">2313-8912</journal-id><journal-title-group><journal-title>Research Result. Theoretical and Applied Linguistics</journal-title></journal-title-group><issn pub-type="epub">2313-8912</issn></journal-meta><article-meta><article-id pub-id-type="doi">10.18413/2313-8912-2024-10-3-0-6</article-id><article-id pub-id-type="publisher-id">3545</article-id><article-categories><subj-group subj-group-type="heading"><subject>APPLIED LINGUISTICS</subject></subj-group></article-categories><title-group><article-title>ASAWEC: towards a corpus of Arab scholars’ academic written English</article-title><trans-title-group xml:lang="en"><trans-title>ASAWEC: towards a corpus of Arab scholars’ academic written English</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Sanosi</surname><given-names>Abdulaziz B</given-names></name><name xml:lang="en"><surname>Sanosi</surname><given-names>Abdulaziz B</given-names></name></name-alternatives><email>a.assanosi@psau.edu.sa</email><xref ref-type="aff" rid="aff1" /></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="ru"><surname>Mohammed</surname><given-names>Abuelgasim Sabah Elsaid</given-names></name><name xml:lang="en"><surname>Mohammed</surname><given-names>Abuelgasim Sabah Elsaid</given-names></name></name-alternatives><email>a.ibrahim@psau.edu.sa</email><xref ref-type="aff" rid="aff1" /></contrib></contrib-group><aff id="aff1"><institution>College of Science and Humanities, Prince Sattam bin Abdulaziz University, Hawtat Bani Tamim, Saudi Arabia</institution></aff><pub-date 
pub-type="epub"><year>2024</year></pub-date><volume>10</volume><issue>3</issue><fpage>0</fpage><lpage>0</lpage><self-uri content-type="pdf" xlink:href="/media/linguistics/2024/3/ВТиПЛ_2024_3_116-134.pdf" /><abstract xml:lang="ru"><p>Linguistic corpora have been widely used in recent years. Different types of linguistic analyses of both spoken and written discourse are being conducted using the corpus linguistics approach. Among these, academic writing has received considerable attention. Corpus linguistics has provided insights into the academic writing of both native and non-native English language learners and writers in general. Nevertheless, relatively few studies have investigated this topic in the Arab EFL setting. Consequently, there is a relative paucity of corpora of academic written English by Arab speakers. To address this gap, we compiled the Arab Scholars’ Academic Written English Corpus (ASAWEC), which is a specialized corpus of Arab scholars’ academic written English. We collected the corpus texts according to specific criteria, and then normalized and cleaned the data. The texts were then tokenized and tagged, and the corpus underwent initial tests, which yielded insightful findings on Arab scholars’ academic written English, such as low lexical diversity and the use of various discourse techniques. The present paper introduces the corpus, provides details on its compilation, presents initial results and statistics, and discusses potential limitations and future perspectives for updating the corpus. It is envisaged that this project will encourage the use of the ASAWEC and help launch similar initiatives to advance research in Arab corpus linguistics.</p></abstract><trans-abstract xml:lang="en"><p>Linguistic corpora have been widely used in recent years. 
Different types of linguistic analyses of both spoken and written discourse are being conducted using the corpus linguistics approach. Among these, academic writing has received considerable attention. Corpus linguistics has provided insights into the academic writing of both native and non-native English language learners and writers in general. Nevertheless, relatively few studies have investigated this topic in the Arab EFL setting. Consequently, there is a relative paucity of corpora of academic written English by Arab speakers. To address this gap, we compiled the Arab Scholars’ Academic Written English Corpus (ASAWEC), which is a specialized corpus of Arab scholars’ academic written English. We collected the corpus texts according to specific criteria, and then normalized and cleaned the data. The texts were then tokenized and tagged, and the corpus underwent initial tests, which yielded insightful findings on Arab scholars’ academic written English, such as low lexical diversity and the use of various discourse techniques. The present paper introduces the corpus, provides details on its compilation, presents initial results and statistics, and discusses potential limitations and future perspectives for updating the corpus. It is envisaged that this project will encourage the use of the ASAWEC and help launch similar initiatives to advance research in Arab corpus linguistics.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>Specialized corpus</kwd><kwd>Corpus compiling</kwd><kwd>Academic writing</kwd><kwd>Corpus linguistics</kwd><kwd>L2 writing</kwd></kwd-group><kwd-group xml:lang="en"><kwd>Specialized corpus</kwd><kwd>Corpus compiling</kwd><kwd>Academic writing</kwd><kwd>Corpus linguistics</kwd><kwd>L2 writing</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="B1"><mixed-citation>Akeel, E. S. (2014). 
A corpus-based study of modal verbs in academic writing of English native speakers and Saudis: Thesis in pursuit of the academic degree of Master’s in Applied Linguistics, Reading, 72 p.</mixed-citation></ref><ref id="B2"><mixed-citation>Allan, R., Shaw, I. and Shaw, M. (2023). Building a corpus of written tasks of Swedish national tests in English: Motivation, method, and research applications, Nordic Journal of English Studies, 22(2), 128–154. https://doi.org/10.35360/njes.821</mixed-citation></ref><ref id="B3"><mixed-citation>Almohizea, M. (2017). The compilation process of (COLTLC): A learner corpus, International Journal of Language and Linguistics, 4(4), 223–231.</mixed-citation></ref><ref id="B4"><mixed-citation>Alotaibi, H. (2017). Arabic-English parallel corpus: A new resource for translation training and language teaching, Arab World English Journal, 8(3), 319–337. https://dx.doi.org/10.24093/awej/vol8no3.21</mixed-citation></ref><ref id="B5"><mixed-citation>Anthony, L. (2022). AntFileConverter (Version 2.0.2) [Computer Software], Waseda University, Japan, available at: https://www.laurenceanthony.net/software/antfileconverter/ (Accessed 07 October 2023)</mixed-citation></ref><ref id="B6"><mixed-citation>Anthony, L. (2023). AntConc (Version 4.2.4) [Computer Software], Waseda University, Japan, available at: https://www.laurenceanthony.net/software (Accessed 07 October 2023)</mixed-citation></ref><ref id="B7"><mixed-citation>Atkins, S., Clear, J. and Ostler, N. (1992). Corpus design criteria, Literary and Linguistic Computing, 7(1), 1–16. https://doi.org/10.1093/llc/7.1.1</mixed-citation></ref><ref id="B8"><mixed-citation>Baker, P. (2010). 
Sociolinguistics and corpus linguistics, Edinburgh University Press, Edinburgh, Scotland.</mixed-citation></ref><ref id="B9"><mixed-citation>Baker, P., Hardie, A. and McEnery, T. (2006). A glossary of corpus linguistics, Edinburgh University Press, Edinburgh, Scotland.</mixed-citation></ref><ref id="B10"><mixed-citation>Bird, S., Klein, E. and Loper, E. (2009). Natural language processing with Python, O’Reilly Media, Inc., Sebastopol, CA, USA.</mixed-citation></ref><ref id="B11"><mixed-citation>Blecha, J. (2012). Building specialized corpora: Thesis in pursuit of the academic degree of Master’s in English Language and Literature, Masaryk, 159 p.</mixed-citation></ref><ref id="B12"><mixed-citation>Bodell, M., Magnusson, M. and Mutzel, S. (2022). From documents to data: A framework for corpus quality, Socius: Sociological Research for a Dynamic World, 8, 1–15. https://doi.org/10.1177/23780231221135523</mixed-citation></ref><ref id="B13"><mixed-citation>Brezina, V. (2018). Statistics in corpus linguistics: A practical guide, Cambridge University Press, Cambridge, UK.</mixed-citation></ref><ref id="B14"><mixed-citation>Brezina, V. and Platt, W. (2023). #LancsBox X [Computer Software], Lancaster University, available at: https://lancsbox.lancs.ac.uk/ (Accessed 07 October 2023)</mixed-citation></ref><ref id="B15"><mixed-citation>Collins, L. (2019). Corpus linguistics for online communication, Routledge, London, UK.</mixed-citation></ref><ref id="B16"><mixed-citation>Cox, C. and Newman, J. (2020). Corpus annotation, in Paquot, M. and Gries, S. (eds.), A practical handbook of corpus linguistics, Springer, Cham, Switzerland, 25–49.</mixed-citation></ref><ref id="B17"><mixed-citation>Crawford, W. and Csomay, E. (2016). 
Doing corpus linguistics, Routledge, London, UK.</mixed-citation></ref><ref id="B18"><mixed-citation>Darģis, R., Auziņa, I., Levāne-Petrova, K. and Kaija, I. (2020). Quality-focused approach to a learner corpus development, Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), Marseille, France, 392–396. DOI: 10.13140/RG.2.2.13826.43207</mixed-citation></ref><ref id="B19"><mixed-citation>Dunn, J. (2022). Natural language processing for corpus linguistics, Cambridge University Press, Cambridge, UK.</mixed-citation></ref><ref id="B20"><mixed-citation>Fuentes, A. (2009). A case study corpus for academic English written by NNS authors, in Gómez, P. and Pérez, A. (eds.), A survey of corpus-based research, Spanish Association of Corpus Linguistics (AELINCO), Madrid, Spain, 1101–1114.</mixed-citation></ref><ref id="B21"><mixed-citation>Gilquin, G. (2020). Learner corpora, in Paquot, M. and Gries, S. (eds.), A practical handbook of corpus linguistics, Springer, Cham, Switzerland, 283–304.</mixed-citation></ref><ref id="B22"><mixed-citation>Guerra, J. and Smirnova, E. (2023). How complex is professional academic writing? A corpus-based analysis of research articles in ‘hard’ and ‘soft’ disciplines, Vigo International Journal of Applied Linguistics (20), 149–184. DOI: 10.35869/vial.v0i20.4357</mixed-citation></ref><ref id="B23"><mixed-citation>Hunston, S. (2002). Corpora in applied linguistics, Cambridge University Press, Cambridge, UK. https://doi.org/10.1017/CBO9781139524773</mixed-citation></ref><ref id="B24"><mixed-citation>Jamalzadeh, M. and Tabrizi, H. (2020). Academic vocabulary in tourism research articles: A corpus-based study, Journal of Language and Discourse Practice, 1(2), 23–42. 
DOI: 10.14744/ldpj.2020</mixed-citation></ref><ref id="B25"><mixed-citation>Kubler, S. and Zinsmeister, H. (2015). Corpus linguistics and linguistically annotated corpora, Bloomsbury Publishing, London, UK.</mixed-citation></ref><ref id="B26"><mixed-citation>Lemmenmeier-Batinić, D. (2023). Spoken language corpora: Approaches for facilitating linguistic research: Dissertation in pursuit of the degree of Doctor of Linguistics, Zurich, 39 p.</mixed-citation></ref><ref id="B27"><mixed-citation>Liu, D. (2022). Using corpora for learning academic writing: A systematic review, The thirty-first International Symposium on English Language Teaching, English Teachers’ Association-Republic of China (ETA-ROC), Taipei, Taiwan.</mixed-citation></ref><ref id="B28"><mixed-citation>McEnery, T. and Wilson, A. (2001). Corpus linguistics: An introduction, Edinburgh University Press, Edinburgh, Scotland.</mixed-citation></ref><ref id="B29"><mixed-citation>Meyer, C. (2023). English corpus linguistics: An introduction, Cambridge University Press, Cambridge, UK.</mixed-citation></ref><ref id="B31"><mixed-citation>Sanosi, A. B. (2022). The use and development of lexical bundles in Arab EFL writing: A corpus-driven study, Journal of Language and Education, 8(2), 108–123. https://doi.org/10.17323/jle.2022.10826</mixed-citation></ref><ref id="B32"><mixed-citation>Sanosi, A. B. and Mohammed, A. (2024). A corpus-based analysis of Arab scholars’ use of interactional metadiscourse markers, 
International Journal of English Language and Literature Studies, 13(2), 188–200. https://doi.org/10.55493/5019.v13i2.5006</mixed-citation></ref><ref id="B33"><mixed-citation>Sinclair, J. (1991). Corpus, concordance, collocation, Oxford University Press, Oxford, UK.</mixed-citation></ref><ref id="B34"><mixed-citation>Stefanowitsch, A. (2020). Corpus linguistics: A guide to the methodology, Language Science Press, Berlin, Germany.</mixed-citation></ref><ref id="B35"><mixed-citation>Toriida, M.-C. (2016). Steps for creating a specialized corpus and developing an annotated frequency-based vocabulary list, TESL Canada Journal, 34(11), 87–105. http://dx.doi.org/10.18806/tesl.v34i1.1255</mixed-citation></ref><ref id="B36"><mixed-citation>Utkina, T. (2021). Teaching academic writing in English to students of economics through conceptual metaphors, The Journal of Teaching English for Specific and Academic Purposes, 9(4), 587–599. https://doi.org/10.22190/JTESAP2104587U</mixed-citation></ref></ref-list></back></article>