Development dynamics and cognitive-semantic parameters of English ditransitive construction: verification from the perspective of corpus linguistics


The relevance of the paper lies in its analysis of utterances built on the template of the ditransitive construction using the methods of cognitive and corpus linguistics. The issue raised by the authors is the need for valid reasons and sufficient evidence that the analytical tools of a linguistic corpus are adequate for singling out the true phenomena of language. The major objective of the research is to explore the potential of corpus linguistics technologies on a representative sample of empirical data through quantitative, statistical and collexeme analysis of utterances with the ditransitive construction, which correlate with its propositions and the scenarios derived from them. These methods of corpus-based analysis shed light on the primary scenarios that motivate the semantic representations of the ditransitive construction; the dynamics of its semantic extension from a diachronic perspective; the way the construction develops in the synchronic aspect; the quantitative index of verbs accepted by the construction; and the semantic parameters of the lexical units that objectivize its agent, recipient and patient arguments. Relying on the results obtained by processing extensive linguistic data from diachronic and synchronic perspectives, the authors demonstrate the potential of corpus linguistics technologies and substantiate the advantages of their use as a means of verifying and extending information about language facts established within the theoretical and methodological framework of cognitive linguistics.


Corpus linguistics developed into a separate discipline within the framework of Language Studies during the last decades of the previous century. Its major objective is to collect and analyze texts in order to create a corpus of some natural language with the appropriate set of analytical tools for the implementation of quantitative and qualitative analysis of the linguistic data which it contains.

Let us turn to the characteristics of corpus linguistics in order to substantiate the validity and relevance of this methodology in relation to the empirical material used in the current research. The linguistic corpus can be defined as “a collection of texts assembled in accordance with clearly formulated principles and possibly annotated at some level of linguistic analysis” (Sharov, 2003: 11). J. Sinclair describes a corpus as “a collection of excerpts from texts in electronic form, selected according to some external criteria for better representation of language and its variations. Corpus functions as a data source for linguistic research” (Sinclair, 1991). T. McEnery and A. Wilson define a corpus as a maximally exhaustive nonrandom collection of linguistic utterances compiled in a way that makes it possible to highlight the peculiarities of a certain language, the variety of its literary styles, types of texts, etc. (McEnery, Wilson, 2001: 75). According to D. Biber, the representativeness of a sample is the degree to which the sample reflects the variability of the population, i.e. a sample is considered representative if the data obtained from the analysis of its contents can be extrapolated to the population as a whole (Biber, 1993: 243). This makes the sample nothing more and nothing less than a "reduced version of a large plurality" (McEnery, Wilson, 2001: 19), because it has the same properties and proportions as the larger population.

Thus, corpora are finite samples, limited both in size and in the purpose of creation. Strictly speaking, no corpus can fully represent the phenomena of a language. This is the reason for the criticism addressed to corpus linguistics by some researchers. However, the method of linguistic introspection is far from sufficient either, especially when it comes to the verification of research hypotheses. As Greenbaum states, a linguist relying only on intuition is equally unable to produce an exhaustive selection of relevant examples (Greenbaum, Eckman, 1977: 128). Moreover, one should not forget that “a linguistic theory that can explain examples of a person's knowledge of a language is preferable to one that is not able to do this” (Wasow, 2002: 130).

Consequently, the main advantage of corpus linguistics is that it has freed linguists from reliance on their own imperfect and incomplete linguistic intuition as the only source of linguistic information. Within a relatively short period of time, a large number of authentic, systematically organized examples of language use have become available (Ozon, 2009). A further advantage of corpus linguistics is that its technologies allow researchers to analyze linguistic material both diachronically and synchronically, comparing and contrasting the obtained results. For example, T. Fanego (Fanego, 1996; Fanego, 1997) and T. Egan (Egan, 2003), using the methods of corpus linguistics, carried out a quantitative analysis of the distribution of gerund and infinitive forms in diachronic and synchronic aspects. Two constructions, [remember + to + have + V-ed] and [remember + V-ing], were chosen for the analysis; the material was selected from several corpora of the English language, including The Collins Cobuild Corpus (CCB). It was found that during the period under review (from 1770 to the present) the gerund totally replaced the infinitive (Fanego, 1996). Interestingly, the period from 1780 to 1850 was marked by the prevailing use of retrospective verbs followed by the infinitive form of perception verbs, but by the beginning of the 20th century these had been completely displaced by collocations containing the gerund (Dzhandubaeva, 2015).

At the current stage, corpus linguistics has made it possible to verify the results of linguistic research and draw conclusions relying on a vast array of empirical data (Rykov, 2012). The applied value of linguistic corpora is also determined by the variety of sophisticated tools which make it possible not only to save time gathering the required data but also to process it from different standpoints and visualize the obtained observations.

Thus, the technologies included in the toolkit of an average corpus allow researchers to (1) measure the representativeness of the linguistic units under analysis; (2) carry out graphemic analysis of the material, its normalization and lemmatization (compilation of lists of units in which the grammatical forms of a word are shown as one word); (3) view all contextual actualizations of a linguistic unit across the whole corpus, using various options for sorting words to the right or to the left of the given speech unit (concordance); (4) carry out other quantitative studies of the material: determine the number of word forms (types) and word usages (tokens); calculate the average sentence length, the number of sentences and their possible distributions; estimate the exclusivity index (percentage of words that were used only once) and the index of constancy (percentage of the most frequent words); (5) compare linguistic units with respect to a certain key or distinctive feature; (6) systematize the data under analysis in accordance with its genre classification; (7) select and analyze linguistic units via semantic markup (which assigns to language units one or more features expressed through semantic primitives such as “thing”, “event”, “space”, etc.) and / or syntactic markup (which distinguishes sentence constituents and derivational dependencies in order to resolve the problem of grammatical homonymy).
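Several of the measures listed under (4) are simple to compute. The following Python sketch is illustrative only and does not reproduce the toolkit of any particular corpus. The function name `corpus_profile`, the regular-expression tokenizer and sentence splitter, and the `top_n` cut-off are assumptions introduced here. The text does not say whether the exclusivity index is taken over types or tokens, nor how many "most frequent words" the index of constancy covers; the sketch assumes the share of types that occur once and the share of tokens covered by the `top_n` most frequent types.

```python
import re
from collections import Counter


def corpus_profile(text, top_n=10):
    """Return basic quantitative measures for a raw text sample."""
    # Naive sentence split on ., ! or ? (an assumption; real corpora use better splitters).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens or not sentences:
        raise ValueError("the sample contains no tokens")
    freq = Counter(tokens)
    # Word forms that occur exactly once in the sample.
    hapaxes = [w for w, n in freq.items() if n == 1]
    # Number of tokens covered by the top_n most frequent word forms.
    top_tokens = sum(n for _, n in freq.most_common(top_n))
    return {
        "tokens": len(tokens),
        "types": len(freq),
        "sentences": len(sentences),
        "avg_sentence_length": len(tokens) / len(sentences),
        "exclusivity_index": 100 * len(hapaxes) / len(freq),   # assumed: % of types
        "constancy_index": 100 * top_tokens / len(tokens),     # assumed: % of tokens
    }


# Toy sample, not taken from any of the corpora discussed in the paper.
sample = "They gave him money. They gave her water. He told them tales."
profile = corpus_profile(sample, top_n=2)
```

On a real corpus the same counts would normally be taken over lemmatized and annotated text rather than over raw strings.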

It is also quantitative corpus analysis that allows a linguist to generalize findings to a larger population, "to determine which phenomena are most likely a true reflection of the language or its variant, and which are just coincidences" (McEnery, Wilson, 2001: 76). Various statistical techniques are used to conduct rigorous research into complex data. According to K. Johnson (Johnson, 2008: 3), quantitative analysis is carried out for the following purposes: (1) information processing: summarizing trends and identifying common aspects of a set of observations, such as the mean, the average deviation and the interdependence among variables; (2) inference: generalizing from a representative set of observations to a larger set of possible observations using hypothesis tests such as Student's t-test or analysis of variance (ANOVA); (3) discovery of relationships: finding descriptive or causal patterns in the data that can be described with multiple regression models or factor analysis; (4) study of processes that may have a probabilistic basis: theoretical modeling, for example in information theory, or practical applications, for example probabilistic parsing of sentences.

Materials and Methodology

The major objective of the paper is to uncover the potential of corpus linguistics technologies and to confirm the justification of their use in the cognitive-semantic analysis of language facts using the examples of speech units with the ditransitive construction.

Academic papers written within the paradigm of cognitive linguistics (Talmy, 2007; Kubriakova, 2012; Manerko, 2017), the theory of construction grammar (Fillmore, Kay, 1999; Goldberg, 2010; Jackendoff, 2015; Naumenko et al., 2021; Rakhilina, 2000; 2010; 2017; Tishchenko, 2004; 2016; Klepikova, 2008; Dobrovolsky, 2016; Makoeva, 2018) and corpus linguistics (McEnery, Wilson, 2001; Ozon, 2009; Sharov, 2003; Rykov, 2012) serve as the theoretical and methodological framework of the current research.

The cognitive-semantic analysis of speech units with the ditransitive construction presented by D. Makoeva (Makoeva, 2018) established that, at the conceptual level, the basic proposition for all nonprepositional ditransitive constructions can be expressed as [X INTERACTS WITH Y VIA Z]. This scheme conceptualizes interaction between two animate entities via some physical (usually inanimate) object, as a result of which (in most cases) this object is transferred and / or moved to the recipient. The corpus of utterances with the ditransitive construction falls into several subgroups, which are associated with the above proposition but have a more specific character.

The event of material object transfer or of control transfer (OBJECT / CONTROL TRANSFER), along with metaphorical instantiations of physical object transfer (e.g. construction-based units of speech describing the transfer of information), is verbalized via utterances with the proposition [X CAUSES Y TO RECEIVE Z]: (1) They give you furniture, too (COCA) (OBJECT TRANSFER); (2) She gives them assembly halls, sleeping quarters... (TM) (CONTROL TRANSFER); (3) Well, they promised us coverage in Panama (COCA) (FUTURE TRANSFER); (4) Goebel goes on to describe a luncheon at which he read her his letter (COCA) (COMMUNICATION).

The transfer of an action (ACTION TRANSFER) is expressed in speech through utterances which belong to the semantic class of CAUSATION and can have the proposition [X CAUSES Y BECOME Z]: (5) This ... gives Gorbachev the option to “move quickly toward a market economy” (TM); (6) Robertson gives him increasing license to preach as well as plan (TM), or the proposition [X CAUSES Y FACE / DEAL WITH Z]: (7) He also gives James (winningly played by Paul Terry) a mission (TM). The transfer of action is also objectivized in speech units from the semantic class BENEFICIAL ACTIVITY with the proposition [X PERFORMS ACTIVITY (Z) FOR BENEFIT OF Y]: (8) She cooked them lasagna (COCA); (9) We offer you childbirth without pain, stretch marks, and morning sickness (COCA), and from the semantic class SCHEMATIC INTERACTION with the proposition [X DIRECTS ACTION (Z) AT Y]: (10) Clinton gives him a bear hug (COCA); (11) Safiy shot her an anxious look (COCA).

As mentioned above, corpus linguistics offers a set of effective tools for verifying the results of introspective linguistic analysis. That is why in the current research we rely on a complex approach to the analysis of utterances based on the nonprepositional ditransitive construction. This approach builds on the theories and methods of cognitive semantics and construction grammar, enhanced by the technologies of corpus linguistics. The empirical data for the research (31,066 examples) were obtained from a diachronic corpus (Early English Books Online) and corpora of modern English (The Corpus of Contemporary American English, The Time Magazine Corpus, The British National Corpus). In this paper, we argue that the "symbiosis" of corpus-based and cognitive-semantic analysis makes it possible to find out which conceptual scenarios underlying and determining the semantics of ditransitive construction instantiations are the primary ones, what linguistic means are used for their objectivation, and how the dynamics of their representation in the language have changed.

Results and Discussion

The toolkit of cognitive semantics and construction grammar has proved to be an effective method for the conceptualization and categorization of lexical and grammatical units of any natural language, and the ditransitive construction is no exception. At the same time, the cognitive linguistics paradigm does not by itself make it possible to trace how the meaning of the construction was extended across different time periods. That is why a diachronic corpus-based analysis of 8172 utterances from The Early English Books Online Corpus (EEBO) is employed in the current study to ascertain when the cognitive scenarios of the ditransitive construction entered the verbal communication of native speakers. The EEBO corpus covers the Early Modern English period (1470–1690) and contains 755 million words.

The empirical data selected from the corpus were ditransitive construction based utterances with the syntactic template [Subj (Subject Pronouns) + V (Past Tense) + Obj (Object Pronouns) + Obj2 (Noun Phrase)]. The position of the verb in the utterances which went under our scrutiny was filled by object-spatial verbs, the semantics of which implies the movement of the transferred object in space (gave, brought, handed, passed); verbs signifying the event of control transfer (left); verbs describing some activity performed by the giver for the recipient (made, poured, won); speech (communicative) verbs objectivizing information delivery or the prospect of providing somebody with something (offered, promised, told). The outcome of the corpus-based quantitative analysis is shown in Table 1:


Table 1. Quantitative analysis of semantic representations of ditransitive construction in the Early English period

Among speech units with nonprepositional ditransitive construction registered in The Early English Books Online Corpus covering the literary heritage of Great Britain within the period from 1470 (Middle English) to 1690 (New English), the most recurrent units were the utterances of the semantic class COMMUNICATION – 3150 cases of use (38.6%): I give thee thankes o father (EEBO).

The total number of speech units with the ditransitive construction of the BENEFICIAL ACTIVITY semantic class is 2707 (33.1%): it brings him money and honour (EEBO).

Speech units of the semantic class MATERIAL OBJECT / CONTROL TRANSFER rank third in frequency, with 1627 instantiations (20%): …did he bring them water (EEBO).

They are followed by speech units belonging to the semantic class CAUSATION – 353 (4.3%): … they give them head and suffer (EEBO).

Utterances based on the FUTURE TRANSFER scenario account for 300 instantiations (3.6%): God promised them peace (EEBO); … he offered people slavery (EEBO).

The least frequent scenario is SCHEMATIC INTERACTION – 35 (0.4%): … we bring you arms offensive and defensive (EEBO):

Fig. 1. Semantic representations of ditransitive construction in the Early English period
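
The shares quoted above follow directly from the raw counts. As a quick check, the Python sketch below recomputes them; the counts are those reported in the text, while the variable names are illustrative. Where a recomputed value differs from the reported one in the last digit, the difference is presumably due to rounding.

```python
# Raw counts per semantic class as reported in the text (EEBO sample).
counts = {
    "COMMUNICATION": 3150,
    "BENEFICIAL ACTIVITY": 2707,
    "MATERIAL OBJECT / CONTROL TRANSFER": 1627,
    "CAUSATION": 353,
    "FUTURE TRANSFER": 300,
    "SCHEMATIC INTERACTION": 35,
}

# The six classes together should give the 8172 utterances of the sample.
total = sum(counts.values())

# Percentage share of each class, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Print the classes from most to least frequent.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:36s} {counts[name]:5d} {share:5.1f}%")
```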


In addition to the quantitative analysis, a statistical analysis of the obtained corpus data was carried out. Its results are presented below in the graph “Summary statistics of ditransitive construction extension trend”. To visualize the dynamics of the semantic development of the ditransitive construction within the revealed scenarios during the period in question, the analytical function of trend extension is used. R² (the coefficient of determination) indicates how closely a trend line fits the observed data and thus how reliable the visualized trend is. The lines of the graph show how the number of construction-based utterances varied across the time periods and what changes in their distribution might be expected. Each line corresponds to one of the scenarios listed in the legend on the right.

The function allows us to trace the frequency of specific construction instantiations and predict which scenarios of ditransitive construction are more likely to be widely used in the future or, on the contrary, which scenarios might drop out of the language:


Fig. 2. Summary statistics of ditransitive construction extension trend
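
A trend line of the kind described above, together with its R², can be obtained by an ordinary least-squares fit. The Python sketch below is a minimal illustration that assumes a linear trend; the source does not state which trend type was used, and the frequencies per period are invented placeholders, not data from the study. The function name `linear_trend` is likewise hypothetical.

```python
def linear_trend(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                # slope of the trend line
    a = mean_y - b * mean_x      # intercept
    # R^2 = 1 - SS_res / SS_tot: share of the variance explained by the line.
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot


# Hypothetical counts of one scenario in five consecutive periods
# (placeholders for illustration, not study data).
periods = [1, 2, 3, 4, 5]
counts = [10, 14, 21, 24, 31]

a, b, r2 = linear_trend(periods, counts)
forecast = a + b * 6  # extrapolated value for the next period
```

An R² close to 1 means that the line tracks the observed counts closely. A low value, such as the 0.17 reported for one of the scenarios, means that the trend explains little of the variation, so any extrapolation from it should be treated with caution.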


The outcomes of the corpus-based and statistical analysis of the representativeness and meaning incrementation of the ditransitive construction in the Early English period show that nonprepositional ditransitives with the semantics of material object transfer reached their tipping point (R² = 0.908 ≈ 91%) only in the middle of the 17th century. During the decades prior to this period, the construction with the preposition to seems to have been the only means of objectivizing this typical situation in the English language. The transfer of information, by contrast, has been described by the nonprepositional ditransitive construction since 1470, and the recurrence of these speech units tended to increase (R² = 0.9141 ≈ 91%).

The construction-based utterances with the meaning of transfer in the future were used only occasionally until the beginning of the 17th century (R² = 0.7727 ≈ 77%). The scenario of providing somebody with an opportunity entered the language at the beginning of the 16th century, and by the end of the 17th century the number of such speech units had become the largest among the construction instantiations within the BENEFICIAL ACTIVITY scenario (R² = 0.8933 ≈ 89%).

Utterances of the CAUSATION semantic class (this applies to both scenarios, change of state and problem solution) were far from frequent among native English speakers. Growth in their representativeness is observed only at the end of the 17th century (Change of State scenario: R² = 0.9275 ≈ 93%; Problem Solution scenario: R² = 0.8082 ≈ 81%).

BENEFICIAL ACTIVITY construction instantiations within the Material Object Creation / Obtaining scenario were used only occasionally in the period under review (R² = 0.1718 ≈ 17%), whereas speech units objectivizing the service delivery scenario (Favor) were not identified at all. Taking into account the results of the corpus-based analysis of the empirical data, we can conclude that utterances with the semantics of rendering a service to someone appeared in English after the 17th century.

The same conclusion can be drawn with regard to speech units from the SCHEMATIC INTERACTION semantic class: they were barely represented in the language during the period under consideration (Physical Contact scenario: R² = 0.341 ≈ 34%; Non-verbal Communication scenario: R² = 0.3321 ≈ 33%).

For the collexeme analysis of ditransitive construction verbalization in the Early Modern English period we selected utterances with the syntactic structures [Subj (Subject Pronoun) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)], [Subj (Noun Phrase) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)] and [Subj (Noun Phrase) + V (Past Tense) + Obj (Noun Phrase) + Obj2 (Noun Phrase)]. The study revealed that between 1470 and 1690 the verbs give, bring, tell, and offer had the highest constancy index. Nominations of humans were among the nouns most frequently "attracted" to the subject position and associated with the agent argument. As a rule, these were nouns signifying people who had power or the gift of creativity: man, author, savior, father, prophet, poet, king. The subject could also designate higher powers or abstractions associated with them: God, occasion, Christ, opportunity, Scripture, angel, lord, spirit, time. The recipient position was also most often filled with lexical items denoting a person: men, God, Christ, people. The thematic argument was, as a rule, objectivized by nouns with abstract semantics signifying such phenomena as liberty, truth, power, reverence. Nouns designating physical objects, especially food (bread, meat), drinks (water, beer, wine) and various types of assets (money, land, cattle), could also fill the slot of the thematic argument. Lexical units specifying information (promises, thanks, words, tales, tidings) were also recorded among the verbalizers of the thematic argument.
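
The text does not specify which association measure underlies the collexeme analysis. One widely used option in collostructional analysis is the Fisher exact test on a 2×2 table that cross-tabulates a verb against the construction. The Python sketch below illustrates that option only and should not be read as the procedure used in the study; the function name `fisher_attraction_p` and all counts are hypothetical.

```python
from math import comb


def fisher_attraction_p(verb_in_cx, verb_total, cx_total, corpus_total):
    """One-sided Fisher exact p-value: probability that the verb occurs in the
    construction at least `verb_in_cx` times if verb and construction are independent."""
    # Hypergeometric tail: `cx_total` construction slots are drawn from
    # `corpus_total` verb tokens, `verb_total` of which are the verb of interest.
    upper = min(verb_total, cx_total)
    denom = comb(corpus_total, cx_total)
    tail = sum(
        comb(verb_total, k) * comb(corpus_total - verb_total, cx_total - k)
        for k in range(verb_in_cx, upper + 1)
    )
    return tail / denom


# Hypothetical toy counts: a verb with 10 tokens in a sample of 100 verb tokens,
# 20 of which occur in the construction (expected co-occurrence by chance: 2).
p_attracted = fisher_attraction_p(8, 10, 20, 100)  # verb seen 8 times in the construction
p_chance = fisher_attraction_p(2, 10, 20, 100)     # verb seen 2 times in the construction
```

The smaller the p-value, the more strongly the verb is "attracted" to the construction; ranking verbs by this value is one way to arrive at a list of the most strongly associated collexemes.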

The synchronic corpus-based analysis of the ditransitive construction was carried out on an empirical base of 11316 speech units generated on the syntactic templates [Subj (Subject Pronoun) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)], [Subj (Noun Phrase) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)] and [Subj (Noun Phrase) + V (Past Tense) + Obj (Noun Phrase) + Obj2 (Noun Phrase)]. The empirical data were obtained from The Corpus of Contemporary American English (COCA), The TIME Magazine Corpus (TIME) and The British National Corpus (BNC). The most common variants in this sample are sentences with verbs in the Past Simple form (gave, brought, handed, offered, passed, made, poured, won, left, told, promised).
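
The first of these templates can be approximated with a simple token filter. The Python sketch below is a rough illustration under strong assumptions: it works on raw text with closed word lists instead of the part-of-speech annotation that corpus interfaces provide, and it does not check that the material after the object pronoun is actually a noun phrase. The function name `find_candidates` and the pronoun lists are introduced here for illustration; the verb list is the one given in the text.

```python
import re

SUBJECT_PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}
OBJECT_PRONOUNS = {"me", "you", "him", "her", "it", "us", "them"}
# Past-tense verb forms named in the text.
PAST_VERBS = {"gave", "brought", "handed", "offered", "passed", "made",
              "poured", "won", "left", "told", "promised"}


def find_candidates(sentence):
    """Return (subject, verb, object, rest) tuples for pronoun + verb + pronoun hits."""
    tokens = re.findall(r"[A-Za-z']+", sentence)
    hits = []
    # Stop three tokens before the end so that at least one token is left for Obj2.
    for i in range(len(tokens) - 3):
        subj, verb, obj = (t.lower() for t in tokens[i:i + 3])
        if subj in SUBJECT_PRONOUNS and verb in PAST_VERBS and obj in OBJECT_PRONOUNS:
            # The remainder is only a candidate for Obj2; it is not parsed as a noun phrase.
            hits.append((subj, verb, obj, " ".join(tokens[i + 3:])))
    return hits


hits = find_candidates("She cooked them lasagna, and she handed him money.")
```

A filter of this kind overgenerates (for example, "she left him alone" or "he gave her book away" would match), so the hits still need manual or annotation-based checking before they are counted as instances of the construction.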

The most frequently used ditransitive speech units are currently utterances from the OBJECT TRANSFER category, which account for 31% of the selected utterances. Communicative interaction (information transfer) is objectivized by 26% of speech units. The least frequent units belonging to the category under analysis are the utterances objectivizing FUTURE TRANSFER scenarios, which make up only 2% of the total number of examples. BENEFICIAL ACTIVITY event verbalizers are most frequent when they objectivize the scenario of creating conditions (24% of the total number of examples); next comes the creation of a material object for the recipient (3.5%), while favour scenarios are exceedingly few (only 0.5%). In the CAUSATION semantic class, the most common subgroup comprises expressions describing a change in the physical and emotional state of the recipient (11%); the scenario of motivating the recipient to overcome difficulties accounts for 2% of the empirical data. The SCHEMATIC INTERACTION semantic class is the least representative and makes up only 1% of the corpus data.

For the collexeme analysis of ditransitive construction verbalization from the synchronic perspective we took utterances with the syntactic structures [Subj (Subject Pronoun) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)], [Subj (Noun Phrase) + V (Past Tense) + Obj (Object Pronoun) + Obj2 (Noun Phrase)] and [Subj (Noun Phrase) + V (Past Tense) + Obj (Noun Phrase) + Obj2 (Noun Phrase)]. The research shows that the verbs give, tell, bring and offer have the highest constancy index in the construction. The agent argument of ditransitive utterances from almost all semantic classes is objectivized via lexical units with the conceptual parameter [human being] (people, man), most of them with a salient component of social status: professional (critic, waiter, waitress, teacher) or marital/family (mother, father, parent, wife). The objectivizers of the agent argument also include nouns which are metonymic representations of people (community, company, organization, senate, government, etc.). In Modern English, there are examples of the ditransitive construction with abstract words filling the slot of the agent argument (life, death, charm, chance, incident, law, source, etc.). The recipient argument is in most cases objectivized by nominations of human beings (student(s), people, children, visitors, kids, readers, patients, customers, viewers, etc.). The thematic argument, which correlates with the transferred object, can be verbalized by quite a wide range of abstract nouns standing for emotional aspects of human life (hope, joy, peace, comfort, protection, etc.), personal qualities (strength, hospitality, confidence, encouragement, etc.) and ontological abstractions (time, fame, disgrace, insights, etc.), as well as by nouns which designate physical objects of some value (money, land, food, fruit, water, coffee, tea, etc.).

It should also be noted that some verbs combine only with nouns which designate physical objects, while others collocate mainly with abstract lexical units. For example, artifacts (glasses, pen, photographs, handkerchief, money, things, paper, books, tissues, etc.) are the objects of transfer in the ditransitive construction with the verbs pass and hand: She passed him sugar and cream (COCA); He laughed, as if amazed. As they turned on 76th Street, she handed him money, told him to demand a receipt, and kissed his cheek which was salty (COCA). For the verb win, by contrast, the most recurrent collocations are with abstract nouns related to attractive aspects of social interaction (approval, friendship, play, popularity, respect, praise, fans, friends, invitation, election, etc.): Irish Catholic candidate that was the focus of attention, it was his enormous popularity and well-financed campaign that won him the election (COCA).


The results of the research presented in the paper can be generalized as a number of conclusions. Firstly, the technologies of corpus linguistics, combined with the data obtained as a result of the cognitive-semantic analysis of speech units with English ditransitive construction, made it possible to verify the dynamics of its development and meaning incrementation from the 1470s up to the present moment.

Secondly, the employment of corpus linguistics toolkit provided conditions for tracking the stages when construction semantics extended and new meanings emerged.

Thirdly, corpus-based research laid the groundwork for empirically substantiated categorization of ditransitive construction semantic representations with regard to their conceptual structure. 

Finally, the results of the collexeme analysis can be used in building the system of rules and restrictions which facilitate the selection of most regular and adequate verbalizers of a ditransitive construction within the framework of its syntactic and argument structure.


