Methods and software tools of morphological disambiguation in the texts in Tatar

Gilmullin R.; Gataullin R.; Suleymanov D.

URI: https://dspace.kpfu.ru/xmlui/handle/net/137559

Date: 2015

Abstract:

© Research India Publications. This article reviews methods for resolving morphological ambiguity and analyzes their applicability to the Tatar language. Because the task was first posed in the 1950s and 1960s, a large number of solution methods have accumulated. They fall broadly into rule-based methods and statistical (probabilistic) methods. Most are language independent, each has its advantages and disadvantages, and their accuracy varies from one language to another. For English, which has poor morphology and fixed word order, accuracy reaches 94-96%. For Russian, with its free word order, such accuracy is difficult to achieve. To resolve morphological ambiguity in Tatar, given characteristics of the language such as its agglutinative nature and free word order, the authors propose combining these methods, which is expected to yield high-precision disambiguation. The research is still in progress: tools for developing contextual rules have been designed, and a subcorpus for training statistical and probabilistic models is being prepared. In addition to the methods, the article describes the current state of the electronic corpus of the Tatar language and discusses the problem of ambiguity (polysemy) in the corpus annotation and possible solutions to it.
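The combination of rule-based and statistical methods mentioned in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: the tags (`DET`, `NOUN`, `VERB`), the placeholder words (`w1`, `w2`, `w3`), the rule `no_verb_after_det`, and the frequency counts are all hypothetical and do not describe Tatar. The sketch assumes a morphological analyzer has already produced candidate tags for each token; contextual rules prune candidates first, and a simple tag-frequency model picks among whatever remains.

```python
from collections import Counter


def apply_rules(sentence, rules):
    """Rule-based pass: each rule may prune a token's candidates using context."""
    result = []
    for i, (word, cands) in enumerate(sentence):
        for rule in rules:
            kept = rule(sentence, i, cands)
            if kept:  # never let a rule delete every candidate
                cands = kept
        result.append((word, cands))
    return result


def pick_by_frequency(sentence, tag_counts):
    """Statistical pass: choose the most frequent remaining tag per token."""
    return [
        (word, max(cands, key=lambda tag: tag_counts[tag]))
        for word, cands in sentence
    ]


def no_verb_after_det(sentence, i, cands):
    # Hypothetical contextual rule (illustrative only, not a claim about
    # Tatar): after an unambiguous DET, drop VERB readings.
    if i > 0 and sentence[i - 1][1] == ["DET"]:
        return [c for c in cands if c != "VERB"]
    return cands


def disambiguate(sentence, rules, tag_counts):
    """Hybrid scheme: rules first, frequency-based choice as a fallback."""
    return pick_by_frequency(apply_rules(sentence, rules), tag_counts)


# Toy input: placeholder words with made-up candidate tags and counts.
sentence = [
    ("w1", ["DET"]),
    ("w2", ["NOUN", "VERB"]),
    ("w3", ["NOUN", "VERB"]),
]
tag_counts = Counter({"VERB": 5, "NOUN": 3, "DET": 2})

result = disambiguate(sentence, [no_verb_after_det], tag_counts)
print(result)
```

In this toy run the rule settles `w2`, while `w3` is outside the rule's context and falls through to the frequency model. A real system would replace the unigram counts with a model trained on a disambiguated subcorpus.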

Files in this item

Name: SCOPUS09734562-20 ...

Size: 44.28Kb

Format: PDF

This item appears in the following collections

Publications of KFU staff, Scopus [24551]
The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.