Can gestures convey information about the time of the described events? The perception of temporal gestures of a companion robot
Abstract
In this article we investigate whether temporal gestures pointing to different events in time can convey information about the relative time of these events. We proceed from the idea that in many cultures time is metaphorically represented as an axis oriented from left to right, i.e. recent or future events occupy the rightmost position compared to events of the past. In our study we test the hypothesis that temporal gestures can communicate information about the time of the described events to the addressee. We conducted an experiment (N = 22, mean age 21.6 years, 17 female) in which two companion robots told different stories. Each story included events of two types – present and past. Each statement was accompanied by (a) a left-directed or symmetrical gesture (first condition) or (b) a right-directed or symmetrical gesture (second condition). The task of the participants was to attribute each event to a time – to determine whether the event was recent or distant. As a result of this study, the hypothesis that the orientation of temporal gestures can communicate the time of the denoted events was partially confirmed. The results correspond to a concept of time oriented for the narrator from left to right, where past events are in the center and later events (recent events, or ‘today’) are located on the right. This conceptual orientation was interpreted by the human listeners in the experiment, even though, from their point of view, the “recent events” are located on the left side of the metaphorical axis.
Keywords: multimodal communication, temporal gestures, human-machine interaction, companion robot, distribution of events in time
Introduction
In our study we examine whether temporal gestures can communicate the time of the described events. While deictic gestures can be directed towards concrete objects in the environment and thus signify these objects, temporal gestures iconically represent abstract events in the environment. Temporal gestures represent an interesting domain in the natural language system of reference to real and imaginary objects (Kibrik, 2011), and may also be considered a symbolic embodiment (Barsalou, 1999, 2005) of the time domain. A person may nonverbally refer to current events as located in front of him/her and represent past events as located behind him/her – in this case, the sagittal axis is exploited. In this study we examine temporal gestures located on the lateral axis, where past events are placed relatively to the left, and current or future events relatively to the right. In studying gestures on the lateral axis, we intend to investigate whether the relative position of a gesture can serve as a marker in a natural semiotic system, conveying information about event time to the addressee. Although certain tendencies can be observed in a narrator’s behavior (e.g., he may point to future events as located to the right), it cannot be taken for granted that the addressee treats this tendency as a signifier – that is, that the addressee understands the speaker’s right-directed gestures as an indication of future events. Moreover, as speakers face each other, their temporal axes are inverted: the speaker’s right-directed gestures appear to the addressee as pointing to the left.
We test the hypothesis that in the semiotic system of temporal gestures on the lateral axis there exists an opposition between right-oriented and left-oriented gestures (relative to each other, not necessarily relative to the human body), where right-oriented gestures indicate a later point in time than relatively left-oriented gestures. Thus, our hypothesis does not simply correspond to the opposition between past and future. Rather, we consider the assumption that if some events in a story are temporally related to each other (one event happened earlier than the other), then the use of temporal gestures on the lateral axis allows the addressee to understand the relative position of the events in time.
To test the hypothesis, we designed an experiment in which each participant interacts with two companion robots: the robot tells stories, accompanying them with different kinds of temporal gestures. This procedure allows us to simulate interaction in a real environment and to replicate the storytelling exactly for each participant, which is important for the analysis of deictic gestures. It also allows us to combine a story with different modes of temporal gesticulation: the robot can tell a given story with one or the other orientation of temporal gestures. These conditions cannot be satisfied in real human interaction, so a companion robot is preferred, although it may lack some features of natural communication.
Temporal gesturing
The nature of the temporal gestures to be examined in the experiment is significant for the linguistic theory of nonverbal communication. Individual movements in communication range from purely physiological actions, like breathing and licking parched lips, to intentional signs, like deictic gestures and licking one’s lips to express the meaning ‘it is delicious’. Hence, it is important to evaluate whether temporal gestures constitute intentional signs, able to communicate information, or purely adaptive movements accompanying speech. According to the classification by Grishina (Grishina et al., 2012; Grishina, 2012), gestures are characterized by: (a) the presence of accompanying speech, (b) the presence of meaning, (c) their contribution to the utterance meaning and the expression of the utterance structure. According to Kendon’s classification (Kendon, 1980), gestures are defined by three main parameters:
(a) necessity of accompanying speech – whether gestures can or cannot be reproduced without speech;
(b) the presence of structural linguistic features;
(c) the degree of regularity – whether a gesture is reproduced on a regular basis or created by the speaker ad hoc for a particular occasion.
Based on A. Kendon’s classification, David McNeill (McNeill, 1992) arranged gestures of different types along Kendon’s Continuum:
Gesticulation → Speech-framed gestures → Pantomime → Emblems → Sign languages
On this scale, from left to right: (a) the necessity of accompanying speech gradually decreases, (b) linguistic features (i.e. structural and arbitrary relations between signifier and signified) gradually increase, and (c) social regulation of the gesture performance gradually increases.
Temporal gestures on the lateral axis always accompany speech and cannot be used as a separate indication of time, but it is unclear whether they can add a temporal component of meaning to an utterance, like yesterday, long ago, now, etc. Temporal gestures can generally be characterized as deictic, and while spatial deictic gestures indicate the position of an object in the environment (or the position of an imaginary object in the “discourse space” in front of the speaker), temporal deictic gestures rely on the fundamental cognitive metaphor TIME IS SPACE (Lakoff and Johnson, 2003) and map a moment in time onto metaphoric spatial coordinates in the speaker’s environment. Although temporal gestures may reflect the speaker’s internal metaphoric representation of time, it remains to be tested whether they can communicate this representation to the listener – i.e. whether the opposition of present, past and future not only shapes temporal gestures, but can also communicate the time of the described events.
Deictic systems are generally divided into absolute and egocentric ones. Absolute deictic orientation is observed in Australian Aboriginal languages, where deictics, both gestural and lexical, are oriented in relation to the environment rather than to the position of the speaker (Haviland, 1993). In our study, we consider the egocentric orientation of deixis, referenced in relation to the speaker’s body. According to numerous studies (Cooperrider and Núñez, 2007; Núñez and Sweetser, 2006; Casasanto and Bottini, 2010), temporal statements indicating the past (long ago, yesterday) are accompanied by deictic gestures directed backwards or to the left; statements indicating the future (tomorrow, next week) usually direct gestures forward or to the right, while statements indicating the present (at the moment, now) direct gestures downwards, to the speaker’s feet. In cognitive science, deictic gestures are considered a reliable source of evidence about the structure of time representation in a particular culture (Fuhrman and Boroditsky, 2007; Ouellet et al., 2010). At the same time, while the arrangement of events on the sagittal axis is quite reliable (the future is usually in front, the past behind), the arrangement of events on the lateral axis is considered less definitive.
In English culture, a person looks behind when recalling negative events from the past and looks ahead when planning the future (Lakoff and Johnson, 2003). It has been shown that people not only talk about time with reference to the sagittal axis, but also tend to think about time in a similar way, i.e. the past is mentally behind and the future ahead (Boroditsky, 2000; Miles et al., 2010; Ulrich et al., 2012). This conceptualization is related to the metaphor of the walking man: the path he has traveled is the past, and the place he is going to is the future (Clark, 1973). At the same time, this arrangement is not characteristic of all languages. For example, in the Aymara language, spoken by an indigenous people of South America, temporal gesturing places the future behind, and the past in front of, the speaker (Núñez and Sweetser, 2006). This system is explained by the metaphor “to know is to see” or “the man sits motionless”: the future (the unknown) is located behind the person’s back and is not visible, while the past is in front – it is clear and well known. Some evidence suggests that Chinese speakers can conceptualize the temporal domain in the same way (Gu et al., 2019), although modern Mandarin speakers adopt the front-to-the-future orientation (Xiao et al., 2018).
Another approach to metaphoric time representation is the arrangement of events on the lateral axis: in front of the speaker, from left to right. A common idea is that this lateral orientation corresponds to the direction of writing in a given culture. The influence of writing direction on the distribution of events along the lateral axis has been demonstrated experimentally; moreover, subjects arrange events on the lateral axis in relation to each other, without taking their own position into account (Boroditsky et al., 2011; Fuhrman and Boroditsky, 2007; Fuhrman and Boroditsky, 2010). That is, relative time is indicated on the lateral axis as two or more sequential events. Since reading from left to right is characteristic of European cultures, earlier events are placed on the left and later events on the right.
A number of psycholinguistic experiments have demonstrated the presence of a vertical temporal axis in Chinese, where earlier events are located at the top and later events at the bottom. This orientation is also apparently related to the traditional direction of writing – from top to bottom – and can be traced both in lexical metaphors and in temporal gesticulation (Boroditsky et al., 2011; Casasanto and Bottini, 2010; Clark, 1973).
A rather complex structure of temporal gestures is observed in Russian language culture. Grishina analyzed the trajectories of gestures accompanying statements with time reference. When an informant refers to the future, gestures are usually directed upward along the vertical axis and to the left along the lateral (transversal) axis. If the informant uses removed affirmativeness, he/she, as a rule, points forward along the sagittal axis and to the right along the lateral axis. For the present continuous, historical and usual tenses, the situation is similar on the sagittal axis: the informant points forward, while on the lateral axis gestures are usually directed to the right, except for the present historical. When the informant refers to the past, usual or perfect tense, he/she points forward along the sagittal axis and to the left along the lateral axis (Grishina, 2012; 2018). These data demonstrate the following patterns:
a) the usual tenses, present and past, are usually placed behind the speaker; for these tenses it is not important to link the event to a precise moment, but merely to underline the fact of the event;
b) future tense and the present historical tense are placed in front of the speaker;
c) future tense is also located on the vertical axis, as well as present continuous and perfect;
d) on the lateral axis, the past and the future are not opposed at all, contrary to what was hypothesized earlier; the main opposition is between the usual past and utterances with removed affirmativeness.
Grishina also describes “sequences” – gestures accompanying statements with relative time: a statement may contain two gestures, where the first gesture accompanies one event and the second corresponds to the following event. According to her statistical results, preference is given to the left-to-right direction, which corresponds to our main hypothesis.
Human-robot interaction
As part of the study, we experimentally evaluated whether temporal gestures on the lateral axis can transmit information about the temporal correlation of the described events to the addressee. This requires that (a) the subject is located in the same space as the speaker, and (b) the gestures of the speaker are identical across interactions with all subjects. The solution to this problem is an experiment with companion robots. Firstly, the robot, unlike a video image, shares the same space with a person – a real space, not a virtual environment. Secondly, the robot is able to reproduce the story accurately many times with the same gestures when interacting with different subjects. When two robots are used in an experiment (as in our case), the difference between their gestures can be precisely controlled by the robot movement protocol.
It is worth noting that the perception of temporal gestures of robots on the lateral axis has not yet been comprehensively studied in experiments. Using a companion robot as a research tool is also a new approach in studies of natural communication. If a robot can transmit information about the temporal correlation of events using gestures, then this can become a promising function for enriching human-machine interaction.
According to recent studies, an anthropomorphic companion robot can be perceived by interlocutors as a full-fledged communication partner, which is a significant advantage of robots over virtual dialog agents. When people interact with each other face to face, they combine speech, gestures and gaze direction. The advantage of companion robots is that they can support complex multimodal interaction, which can increase human involvement in communication. According to some evidence, it is even impossible to achieve involvement in communication with a robot without gestures and gaze control (Sidner et al., 2004). Gestures of engagement greatly affect the behavior of people interacting with robots in situations of communication and cooperation (Nagai et al., 2003). People are more likely to direct their attention to a robot that uses gestures and find such interaction more productive (Zinina et al., 2022). As the gesticulating abilities of robots become more advanced, human-robot interaction will be perceived as more reliable, allowing robots to be deeply included in people’s daily lives. An effective combination of speech with nonverbal means can become a significant advantage of anthropomorphic robots over other types of interfaces.
Analysis of temporal gestures in the corpus
In order to design patterns of robot behavior for the experiment, it was necessary to study the features of temporal gestures in a corpus. We studied the nonverbal behavior of people in the multimodal REC corpus (Kotov and Zinina, 2015). We analyzed cases where participants produced oriented gestures while referencing events in time. This analysis was performed in order to design the behavioral patterns for the robot; a statistical analysis of the gestures was not our goal. In the following subset of examples, the informant (on the left) talks about her dance experience, addressing multiple events in time: the girl recalls stories from the past, accompanies them with cases from her present life, and talks about future plans. The annotation of this video allows us to associate hand positions with speech descriptions of events. The examples are presented in Table 1.
As the cases in the corpus show, temporal gestures may be mixed with spatial or conceptual representation, where the person describes his/her path from left to right, with earlier events and locations on the left, and later events and locations on the right. Only in cases of possible contradiction between these conceptualizations (like and then I return, where the event time is later, but the location is earlier on the path) can one assert that the temporal orientation prevails over the spatial one. The examples suggest that spatial movement should not be included in the experimental narratives, as the two conceptualizations can easily be confused. Whether lateral gestures can communicate the relative time of events to the addressee, as observed in the corpus, is tested in the following experiment.
Experimental procedure
The experiment tests the hypothesis that temporal gestures on the lateral axis allow the listener to understand the relative time of the described events. In each story the speaker describes two situations corresponding to different times, with the events of the two situations being intermixed. We suggest that temporal gestures can form a semiotic system in which the opposition of gestures in certain linguistic positions allows the addressee to reliably attribute an event to one of the situations based on the direction of the gesture. If temporal gestures do not constitute a sign system, the addressee will confuse the events, attributing a statement randomly to one of the two situations. Twenty-two participants (mean age 21.6 years, 17 female) took part in the experiment. The native language of the participants was Russian, and the experiment was conducted in Russian. Only right-handed people participated. The robot pronounced the text with the help of the Yandex speech API, a state-of-the-art speech synthesis solution. The synthesized texts were checked in order to correct speech synthesis mistakes, such as a misplaced stress.
Two F-2 companion robots were used in the experiment; these robots are designed for human-machine interaction research (Zinina et al., 2022). Each robot presented stories to a person while using temporal gesturing. One robot, labeled with a triangle, accompanied the events of the story with left-directional (relative to the robot) and symmetrical gestures. The other robot, labeled with a square, used right-directional and symmetrical gestures. For left-directional gestures, the robot with a triangle pointed to the left with its left hand, turned its head to the left, and rolled its eyes to the left to simulate looking to the left (Figure 1(a)). For the head and eyes we used gestures prototypical for a situation where the robot looks at some referent on the right or on the left; for the hand we used gestures typical for pointing out a referent on the right or left (Kotov and Zinina, 2015). As shown earlier, such gestures have a significant effect on the actions of the addressee: in a prior experiment these gestures induced the addressee to choose a real object (a game piece) located on the left or on the right, even without the person’s awareness (Zinina et al., 2019). Several packages in the BML format (Kopp et al., 2006; Vilhjálmsson et al., 2007) were developed to model the robot’s gestures of the three types and to diversify its performance (see examples in rows in Figure 1), so that the same pattern was not repeated consecutively. Each right-directional gesture was perfectly mirror-symmetrical to the corresponding left-directional gesture – Figure 1(a-c).
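To illustrate the format of such behavior packages, below is a minimal sketch (in Python) of a left-directed package and its mirroring into the corresponding right-directed one. The element names (<gaze>, <pointing>) and the namespace follow the published BML standard (Kopp et al., 2006; Vilhjálmsson et al., 2007), but the target identifiers, timings, and the mirroring helper are illustrative assumptions, not the actual F-2 control protocol.

    # A left-directed gesture package in BML: eyes, then head, then the left
    # hand are directed at a referent on the left. Targets and timings are
    # hypothetical placeholders.
    LEFT_PACKAGE = """\
    <bml id="left_point_1" xmlns="http://www.bml-initiative.org/bml/bml-1.0">
      <gaze id="g1" target="REFERENT_LEFT" influence="EYES" start="0.0"/>
      <gaze id="g2" target="REFERENT_LEFT" influence="HEAD" start="0.1"/>
      <pointing id="p1" target="REFERENT_LEFT" mode="LEFT_HAND" start="0.2" end="1.6"/>
    </bml>
    """

    def mirror_package(bml: str) -> str:
        """Build the mirror-symmetrical right-directed package from a left-directed one."""
        # A placeholder character keeps the LEFT->RIGHT and RIGHT->LEFT
        # substitutions from clobbering each other.
        return (bml.replace("LEFT", "\x00")
                   .replace("RIGHT", "LEFT")
                   .replace("\x00", "RIGHT"))

    RIGHT_PACKAGE = mirror_package(LEFT_PACKAGE)

Deriving each right-directional package mechanically from its left-directional counterpart guarantees the perfect symmetry between the two robots’ gesture sets described above.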
The use of two robots allows us to test the initial hypothesis: if for each robot the gesture pointing to the farther right point causes the addressee to interpret the event as later, then the position on the temporal axis is universal – the “more right” gesture points to a later event, regardless of the position of the gesture relative to the speaker’s body. If, on the contrary, only specific gestures indicate a later or earlier time of the event (e.g., only pointing to the right of one’s own body indicates a later event), it means that it is specific gestures, rather than relative positions, that indicate the time of events in the nonverbal semiotic system.
Two stories were designed for the experiment: How I passed my exams and How my car got broken. Each story contained descriptions of two situations. As the situations had to be described in the same grammatical tense, we decided to oppose the recent past (today) and the distant past (a few months or a few years ago). This corresponds to our hypothesis, in which we compare the relative time of situations rather than the opposition of past and future. At the beginning of each story, the two situations were explicitly announced by the robot: (1a) I am currently taking an exam session at the university and (1b) I took an exam in the 11th grade; (2a) my car did not start today and (2b) the first time I tried to drive a car, it did not start. At the stage of announcing the situations, the robot did not use gestures; it only slightly moved its arms symmetrically to mimic breathing. The absence of gestures at this stage permitted us to avoid a conventional arrangement of the events in discourse space. Each story contained three pairs of statements; a sample pair is my starter fuse blew and it’s bad when the battery runs out, and that’s exactly what happened. In each pair, one of the statements was accompanied by a symmetrical gesture and the other by a directional gesture (left-directed or right-directed, depending on the robot). The type of gesture for the utterances in each pair was selected randomly, with the constraint that the two utterances in a pair received different types of gestures. The selected sequence of gesture types was the same for the two stories. Lexical time markers were absent from the statements.
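A minimal sketch of this assignment constraint: the pair texts below are abridged placeholders rather than the full experimental stimuli, and the function is an illustration of the randomization scheme described above, not the actual experiment code.

    import random

    def assign_gestures(pairs, directional, seed=None):
        """In each pair, one statement gets the directional gesture
        ('left' or 'right', depending on the robot), the other the
        symmetrical gesture; which is which is decided randomly."""
        rng = random.Random(seed)
        plan = []
        for first, second in pairs:
            if rng.random() < 0.5:
                plan += [(first, directional), (second, "symmetrical")]
            else:
                plan += [(first, "symmetrical"), (second, directional)]
        return plan

    story_pairs = [
        ("my starter fuse blew", "it's bad when the battery runs out..."),
        # ... two more pairs per story
    ]
    for utterance, gesture in assign_gestures(story_pairs, directional="right", seed=1):
        print(f"{gesture:12s} | {utterance}")

Fixing the seed once and reusing the resulting sequence for both stories reproduces the design choice that the gesture-type sequence was the same across stories.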
The experiment followed a within-subject design: each participant listened to one robot’s story and then to the other robot’s story; the order of the conditions was randomized across participants. At the beginning of the experiment, a subject was randomly directed to one of the robots and listened to the story How I passed my exams. Thus, half of the subjects listened to this story from the robot with a triangle (it accompanied some of the statements with left-directional gestures and some with symmetrical gestures). The other half of the subjects listened to this story from the robot with a square (it accompanied some of the statements with right-directional gestures and some with symmetrical gestures). Next, each subject filled out a questionnaire and moved to the other robot to listen to the story How my car got broken. The robot with a triangle accompanied this story, too, with left-directional and symmetrical gestures, while the robot with a square used right-directional and symmetrical gestures. After the second story, subjects filled out a final questionnaire (Figure 2).
Thus, during the experiment, each subject listened to two stories – in each of the stories the narrator (robot) used different oppositions of gestures, accompanying some statements with a symmetric gesture and some statements with a directional gesture: one robot directed gestures to the left and the other to the right.
Results
After listening to the story, the subjects were asked to indicate to which time in the story each statement belonged, e.g. ‘the starter fuse blew’ – did it happen today or a long time ago? Table 2 shows the subjects’ estimates of the time of the event depending on the accompanying gesture.
Table 2. Generalized results on the attribution of a statement to time depending on the accompanying gesture
* color indicates statistically significant differences by the chi-square test:
- for the symmetrical gestures of the robot with a triangle: Chi Square = 7.529412, p = 0.006070;
- for the right-oriented gestures of the robot with a square: Chi Square = 3.595745, p = 0.057929.
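The reported values are consistent with one-degree-of-freedom goodness-of-fit tests over the two response categories (‘today’ vs ‘long ago’). A minimal sketch reproducing them, assuming per-cell counts of 25 vs 9 and 30 vs 17 – these totals are our reconstruction from the reported percentages and statistics, as only the aggregates appear in the text:

    from scipy.stats import chisquare

    # Robot with a triangle, symmetrical gestures:
    # 'long ago' vs 'today' responses (25/34 ≈ 74% attributed to the distant past).
    chi2, p = chisquare([25, 9])
    print(f"triangle, symmetrical:  chi2 = {chi2:.6f}, p = {p:.6f}")
    # -> chi2 = 7.529412, p = 0.006070

    # Robot with a square, right-oriented gestures:
    # 'today' vs 'long ago' responses (30/47 ≈ 64% attributed to the present).
    chi2, p = chisquare([30, 17])
    print(f"square, right-oriented: chi2 = {chi2:.6f}, p = {p:.6f}")
    # -> chi2 = 3.595745, p = 0.057929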
Table 2 shows that for the robot with a triangle, which used left-oriented and symmetrical gestures, subjects tended to interpret symmetrical gestures as referring to the distant past – in 74% of cases. However, there was no clear trend in the interpretation of this robot’s left-oriented gestures.
While interacting with the robot with a square, which used right-oriented and symmetrical gestures, subjects related the events accompanied by right-directed gestures to the present – in 64% of cases. No clear trend was observed in the interpretation of this robot’s symmetrical gestures. Despite the results obtained, the data require further clarification, since the result for the right-oriented robot is close to the significance threshold.
In the post-experimental survey, most of the participants (72.2%) indicated that they do use the metaphor of lateral time orientation and consider past events as located rather on the left and later/future events as located relatively on the right.
Discussion
The results show that when right-directed gestures are contrasted with symmetrical gestures in the narrator’s behavior, right-directed gestures indicate a later point in time. In the present experiment, this was also the event closer to the moment of speech. This observation partially confirms the original hypothesis that later events are located on the right; however, in this gesture opposition the earlier events show no tendency to be located on the left. The addressee correctly interprets pointing to the right as a reference to a later event (the recent past), despite the fact that from his/her point of view the speaker points to the left. Thus, the listener adjusts to the speaker’s temporal axis rather than relying on his/her own.
Left-pointing gestures do not receive a definitive temporal interpretation when contrasted with symmetrical gestures. After the experiment, some participants noted that this opposition was not very convenient for them. At the same time, in this opposition, symmetrical gestures are seen by the addressee as indicating the distant past.
As two different oppositions were modelled, one on each robot, we can assume that the rightmost position in each opposition is more significant and can be considered a “strong” linguistic position, where the distinction in meaning is more profound. The leftmost position in each opposition is not as strong and has no direct link to the indication of time – it may be occupied, for example, by iconic gestures, gestures indicating topic change, and others. The two strong rightmost positions combined correspond to the idea that time for the storyteller is oriented from left to right, where past events are in the center and more recent events (the recent past, today) are relatively on the right – see Figure 3.
Figure 3. Time orientation based on statistically significant differences in respondents’ answers
According to the results, the addressee, looking at the narrator, correctly interprets the location of the events on the time axis, adapting his/her perspective to the temporal lateral axis of the narrator. This can be considered a rather complex task. It is quite evident that the listener easily adapts to the “future-in-front vs past-behind” system: when the speaker refers to the past as being behind his/her back, the listener adapts to this position, although the area “behind the speaker” is also in front of the listener. For the lateral representation, the task may be even more difficult, as the listener has to adjust to the speaker’s left-to-right orientation, which contradicts the listener’s own axis.
The results indicate that lateral temporal gestures may be considered not merely “adaptors” or “movements”, but communicative signs that accompany utterances, have their own meaning, and are able to add this meaning to the meaning of the utterance. This elevates the status of temporal deictic gestures in the semiotic system.
Although the discovered tendencies may apply to most languages with left-to-right writing systems, it is possible that in some language cultures the lateral time axis is not sufficiently grounded in dialogue culture to communicate time reference to the listener – e.g. if the “future-in-front vs past-behind” system prevails and suppresses lateral time expression in communication. So, while the findings may be universal for left-to-right languages, the existence of the lateral time expression system in each language culture should be tested separately.
Conclusion
This study partially confirmed the hypothesis that the orientation of temporal gestures in space can indicate the time of the denoted events. We found that the more recent event is located to the right (in the opposition between right-oriented and symmetric gestures), while a distant past event is more likely to be located at the center (in the opposition between left-oriented and symmetric gestures). The results correspond to the metaphoric representation of time from left to right and suggest that the listener correctly interprets the speaker’s perspective in lateral time representation. Temporal gestures may thus be considered as having their own semantics and as able to add this semantics to the utterance. However, the phenomenon requires further clarification.
The results of the study can be applied to enrich the gestural behavior of companion robots in situations where the robots explain compound events distributed in time. For example, in the educational domain a companion robot may apply temporal gestures when explaining Perfect group tenses, as well as when explaining the topic of tense shift in languages. Scenarios with temporal gestures can also be used by robot tutors that instruct people and have to train workers to perform a certain sequence of actions.
References
Barsalou, L. W. (1999). Perceptual Symbol Systems, Behavioral and Brain Sciences, 22, 577–660. https://doi.org/10.1017/S0140525X99002149 (In English)
Barsalou, L. W. (2005). Abstraction as dynamic interpretation in perceptual symbol systems, in Gershkoff-Stowe, L. and Rakison, D. (eds.), Building object categories, Carnegie Symposium Series, Erlbaum, 389–431. (In English)
Boroditsky, L. (2000). Metaphoric structuring: Understanding time through spatial metaphors, Cognition, 75 (1), 1–28. https://doi.org/10.1016/S0010-0277(99)00073-6 (In English)
Boroditsky, L., Fuhrman, O. and McCormick, K. (2011). Do English and Mandarin speakers think about time differently?, Cognition, 118 (1), 123–129. https://doi.org/10.1016/j.cognition.2010.09.010 (In English)
Casasanto, D. and Bottini, R. (2010). Can mirror-reading reverse the flow of time?, in Hölscher, C., Shipley, T. F., Olivetti Belardinelli, M., Bateman, J. A. and Newcombe, N. S. (eds.), Spatial Cognition VII. Spatial Cognition 2010. Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 6222, 335–345. https://doi.org/10.1007/978-3-642-14749-4_28 (In English)
Clark, H. H. (1973). Space, time, semantics, and the child, Cognitive development and acquisition of language, 27–63. https://doi.org/10.1016/B978-0-12-505850-6.50008-6 (In English)
Cooperrider, K. and Núñez, R. (2007). Doing time: speech, gesture, and the conceptualization of time, Center for Research in Language Technical Reports, 19, 3–19. (In English)
Fuhrman, O. and Boroditsky, L. (2007). Mental time-lines follow writing direction: Comparing English and Hebrew speakers, Proceedings of the Annual Meeting of the Cognitive Science Society, 29 (29), 1007–1011. (In English)
Fuhrman, O. and Boroditsky, L. (2010). Cross-cultural differences in mental representations of time: Evidence from an implicit nonlinguistic task, Cognitive Science, 34 (8), 1430–1451. https://doi.org/10.1111/j.1551-6709.2010.01105.x (In English)
Grishina, E. A. (2012). Ukazaniya rukoj kak sistema (po dannym Mul'timedijnogo russkogo korpusa) [Hand indications as a system (on the data of the Multimedia Russian Corpus)], Voprosy Yazykoznaniya, 3, 3–50. (In Russian)
Grishina, E. A. (2018). Russkaya zhestikulyaciya s lingvisticheskoj tochki zreniya. Korpusnye issledovaniya [Russian gesticulation from a linguistic point of view. Corpus studies], Yazyki slavyanskoj kul'tury, Moscow, Russia. (In Russian)
Grishina, E., Savchuk, S. and Sichinava, D. (2012). Multimodal Parallel Russian Corpus (MultiPARC): Main Tasks and General Structure, LREC 2012 Workshop on Best Practices for Speech Corpora in Linguistic Research, Istanbul, Turkey, 13–16. (In English)
Gu, Y., Zheng, Y. and Swerts, M. (2019). Which is in front of Chinese people, past or future? The effect of language and culture on temporal gestures and spatial conceptions of time, Cognitive Science, 43 (12), e12804. https://doi.org/10.1111/cogs.12804 (In English)
Haviland, J. B. (1993). Anchoring, iconicity, and orientation in Guugu Yimithirr pointing gestures, Journal of Linguistic Anthropology, 3 (1), 3–45. https://doi.org/10.1525/jlin.1993.3.1.3 (In English)
Kendon, A. (1980). Gesticulation and speech: Two aspects of the process of utterance, The Relationship of Verbal and Nonverbal Communication, 25, 207–227. (In English)
Kibrik, A. A. (2011). Reference in Discourse, Oxford University Press, UK. (In English)
Kopp, S., Krenn, B., Marsella, S., Marshall, A., Pelachaud, C., Pirker, H., Thórisson, K. and Vilhjálmsson, H. (2006). Towards a Common Framework for Multimodal Generation: The Behavior Markup Language, in Gratch, J., Young, M., Aylett, R., Ballin, D. and Olivier, P. (eds.), Intelligent Virtual Agents, 4133, Springer, Berlin, Heidelberg, 205–217. http://dx.doi.org/10.1007/11821830_17 (In English)
Kotov, A. A. and Zinina, A. A. (2015). Funkcional'nyj analiz neverbal'nogo kommunikativnogo povedeniya [Functional analysis of non-verbal communicative behavior], Proceedings of the International Conference “Computer Linguistics and Intellectual Technologies”, Moscow, Russia, 287–295. (In Russian)
Lakoff, G. and Johnson, M. (2003). Metaphors we live by, University of Chicago Press, Chicago, IL. (In English)
McNeill, D. (1992). Hand and mind: What gestures reveal about thought, University of Chicago Press, Chicago, IL. (In English)
Miles, L., Nind, L. and Macrae, C. (2010). Moving through time, Psychological Science, 21 (2), 222. (In English)
Nagai, Y., Hosoda, K., Morita, A. and Asada, M. (2003). A constructive model for the development of joint attention, Connection Science, 15 (4), 211–229. https://doi.org/10.1080/09540090310001655101 (In English)
Núñez, R. E. and Sweetser, E. (2006). With the future behind them: Convergent evidence from Aymara language and gesture in the crosslinguistic comparison of spatial construals of time, Cognitive Science, 30 (3), 401–450. https://doi.org/10.1207/s15516709cog0000_62 (In English)
Ouellet, M., Santiago, J., Israeli, Z. and Gabay, S. (2010). Is the future the right time?, Experimental Psychology, 57 (4). https://doi.org/10.1027/1618-3169/a000036 (In English)
Sidner, C. L., Kidd, C. D., Lee, C. and Lesh, N. (2004). Where to look: a study of human-robot engagement, Proceedings of the 9th International Conference on Intelligent User Interfaces, New York, NY, United States, 78–84. https://doi.org/10.1145/964442.964458 (In English)
Ulrich, R., Eikmeier, V., de la Vega, I., Ruiz Fernández, S., Alex-Ruf, S. and Maienborn, C. (2012). With the past behind and the future ahead: Back-to-front representation of past and future sentences, Memory & Cognition, 40, 483–495. https://doi.org/10.3758/s13421-011-0162-4 (In English)
Vilhjálmsson, H., Cantelmo, N., Cassell, J., Chafai, N. E., Kipp, M., Kopp, S., Mancini, M., Marsella, S., Marshall, A., Pelachaud, C., Ruttkay, Z., Thórisson, K., van Welbergen, H. and van der Werf, R. (2007). The Behavior Markup Language: Recent Developments and Challenges, Intelligent Virtual Agents, 99–111. http://dx.doi.org/10.1007/978-3-540-74997-4_10 (In English)
Xiao, C., Zhao, M. and Chen, L. (2018). Both Earlier Times and the Future Are “Front”: The Distinction Between Time- and Ego-Reference-Points in Mandarin Speakers’ Temporal Representation, Cognitive Science, 42 (3), 1026–1040. https://doi.org/10.1111/cogs.12552 (In English)
Zinina, A., Arinkin, N., Zaydelman, L. and Kotov, A. (2019). The role of oriented gestures during robot’s communication to a human, Proceedings of the International Conference “Computer Linguistics and Intellectual Technologies”, Moscow, Russia, 800–808. (In English)
Zinina, A., Kotov, A., Arinkin, N. and Zaidelman, L. (2022). Learning a foreign language vocabulary with a companion robot, Cognitive Systems Research, 77, 110–114. https://doi.org/10.1016/j.cogsys.2022.10.007 (In English)