Artificial vs Human Intelligence: A Case Study of Translating Jokes Based on Wordplay
Artificial intelligence (AI) technologies used in professional translation raise questions about the effectiveness of human-AI interaction. Deep learning can mimic human cognitive processes, which suggests that AI could reproduce the logic and mechanics of a source text in the target language. This calls for an objective assessment of the naturalness of neural machine translation (NMT), applying prompt engineering to optimize the translation process, save resources, and support the sustainable development of the world's super-central and central natural languages. The study draws on English rhyming and non-rhyming pun-based jokes and their Russian translations produced both by professional translators and by ChatGPT-4o, with identical prompts given to the human and AI translators. The results were processed using linguistic and translation analysis, followed by textometric and statistical analysis. To evaluate the humorous effect of the translated jokes and to identify signs of artificiality in them, 150 informants were surveyed. The study established the degree of humorous effect and the naturalness criteria for the translated jokes. Although the source texts contain no terminology, specialized vocabulary, or complex grammar, the AI-generated translations were perceived as complex because of literal renderings and calques. Human translators, by contrast, prefer a holistic translation technique and are more flexible in interpreting the imagery and syntactic structures of the jokes. This points to the greater creative freedom of human translators, who avoid stereotypes and generate novel interpretations. In conclusion, the study measures the effectiveness of AI as an auxiliary tool for translating and assessing pun-based jokes.
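The AI side of the setup described above can be illustrated with a minimal sketch: sending a pun-based joke to ChatGPT-4o with the same kind of brief a human translator would receive. The call uses the standard OpenAI Python SDK; the prompt wording, the example joke, and the temperature setting are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of the AI translation step (not the authors' actual pipeline).
# The brief, the joke, and the decoding settings below are assumptions for
# illustration only; the study's exact prompts are not reproduced here.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

BRIEF = (
    "Translate the following English pun-based joke into Russian. "
    "Preserve the wordplay and the humorous effect rather than the literal wording."
)

joke = "I used to be a banker, but I lost interest."  # illustrative joke, not from the corpus

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a professional literary translator."},
        {"role": "user", "content": f"{BRIEF}\n\n{joke}"},
    ],
    temperature=0.7,  # assumed value; the study does not report decoding parameters here
)

print(response.choices[0].message.content)
```

The same brief would be given verbatim to the human translators, so that any difference in naturalness can be attributed to the translator rather than to the instructions.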