Context-Based Rules for Grammatical Disambiguation in the Tatar Language

Gilmullin R.; Khakimov B.; Gataullin R.; Suleymanov D.

URI: http://dspace.kpfu.ru/xmlui/handle/net/129652

Date: 2017

Abstract:

© 2017, Springer International Publishing AG. The paper addresses the problem of grammatical ambiguity in the Tatar National Corpus and describes the methodology and software used to automate the disambiguation process. Grammatical ambiguity is widespread in agglutinative languages such as the Turkic and Finno-Ugric languages. Disambiguation in the corpus is based on a context-oriented classification of ambiguity types, which has been carried out on Tatar corpus data for the first time. In this study the corpus serves both as the source for the research and as the destination where the results are applied. The grammatical ambiguity types are detected automatically using a finite-state morphological analyzer and then classified. To build a grammatically disambiguated subcorpus, a special software module was developed. It searches for ambiguous tokens in the corpus, collects statistical information, and allows formal context-based disambiguation rules to be created and applied for different ambiguity types.
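
The workflow outlined in the abstract (find ambiguous tokens, collect statistics on ambiguity types, apply context-based rules) can be illustrated with a minimal Python sketch. This is not the authors' module: the `Token` and `Rule` structures, the function names, the placeholder word forms `w1`..`w3`, the tags `N`, `V`, `POST`, and the rule itself are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    form: str             # surface form (placeholder strings here)
    analyses: List[str]   # candidate tags, as a morphological analyzer might return


@dataclass
class Rule:
    # All fields are illustrative assumptions, not the paper's rule format.
    ambiguity_type: frozenset  # set of competing tags the rule applies to
    offset: int                # context position: -1 = previous token, +1 = next
    context_tag: str           # tag the context token must carry unambiguously
    select: str                # analysis kept when the rule fires


def find_ambiguous(tokens: List[Token]) -> List[int]:
    """Return indices of tokens that have more than one analysis."""
    return [i for i, t in enumerate(tokens) if len(t.analyses) > 1]


def ambiguity_type_counts(tokens: List[Token]) -> Counter:
    """Count how often each ambiguity type (sorted tuple of tags) occurs."""
    return Counter(
        tuple(sorted(tokens[i].analyses)) for i in find_ambiguous(tokens)
    )


def apply_rules(tokens: List[Token], rules: List[Rule]) -> List[Token]:
    """Return a copy of tokens with the first matching rule applied to each
    ambiguous token; context is always read from the original input."""
    result = [Token(t.form, list(t.analyses)) for t in tokens]
    for i in find_ambiguous(tokens):
        ambiguity = frozenset(tokens[i].analyses)
        for rule in rules:
            j = i + rule.offset
            if rule.ambiguity_type != ambiguity or not (0 <= j < len(tokens)):
                continue
            if tokens[j].analyses == [rule.context_tag]:
                result[i].analyses = [rule.select]
                break
    return result


# Hypothetical data: two tokens share the N/V ambiguity type.
tokens = [
    Token("w1", ["N", "V"]),
    Token("w2", ["POST"]),
    Token("w3", ["N", "V"]),
]
# Hypothetical rule: an N/V token directly followed by POST is resolved to N.
rules = [Rule(frozenset({"N", "V"}), +1, "POST", "N")]

resolved = apply_rules(tokens, rules)
for tok in resolved:
    print(tok.form, tok.analyses)
print(ambiguity_type_counts(tokens))
```

In this toy input only `w1` is resolved, because `w3` has no following token for the rule to inspect; tokens left ambiguous in this way would need further rules or manual review.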

Files in this item

Name: SCOPUS03029743-20 ...

Size: 52.22Kb

Format: PDF

This item appears in the following collections

Publications of KFU staff, Scopus [24551]
The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.