Building the Tatar-Russian NMT system based on re-translation of multilingual data

dc.date.accessioned 2019-01-22T20:36:55Z
dc.date.available 2019-01-22T20:36:55Z
dc.date.issued 2018
dc.identifier.issn 0302-9743
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/147978
dc.description.abstract © Springer Nature Switzerland AG 2018. This paper assesses the possibility of combining rule-based and neural network approaches to build a machine translation system for the Tatar-Russian language pair. We propose a rule-based system that makes it possible to use parallel data between a group of six Turkic languages (Tatar, Kazakh, Kyrgyz, Crimean Tatar, Uzbek, Turkish) and Russian to overcome the shortage of Tatar-Russian data. We incorporate modern approaches to data augmentation and neural network training together with linguistically motivated rule-based methods. The main results of the work are the creation of the first neural Tatar-Russian translation system and an improvement in translation quality for this language pair, in terms of BLEU scores, from 12 to 39 and from 17 to 45 for the two translation directions (compared with the existing translation system). In addition, translation between any of the Tatar, Kazakh, Kyrgyz, Crimean Tatar, Uzbek, and Turkish languages becomes possible, which allows translation from all of these Turkic languages into Russian using Tatar as an intermediate language.
dc.relation.ispartofseries Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.subject Data augmentation
dc.subject Low-resourced language
dc.subject Neural machine translation
dc.subject Rule-based machine translation
dc.subject Turkic languages
dc.title Building the Tatar-Russian NMT system based on re-translation of multilingual data
dc.type Conference Paper
dc.relation.ispartofseries-volume 11107 LNAI
dc.collection Publications of KFU staff
dc.relation.startpage 163
dc.source.id SCOPUS03029743-2018-11107-SID85053921231


This item appears in the following collections

  • Publications of KFU staff: Scopus [24551]
    This collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, from 1970 onward.
