Kazan Federal University Digital Repository

A comparative evaluation of statistical part-of-speech taggers for Russian

dc.contributor.author Gareev R.
dc.contributor.author Ivanov V.
dc.date.accessioned 2018-09-18T20:27:35Z
dc.date.available 2018-09-18T20:27:35Z
dc.date.issued 2015
dc.identifier.issn 1865-0929
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/140105
dc.description.abstract © Springer International Publishing Switzerland 2015. Part-of-speech (POS) tagging is an essential step in many text processing applications. Quite a few works focus on solving this task for Russian; their results are not directly comparable due to the lack of shared datasets and tools. We propose a POS tagging evaluation framework for Russian that comprises existing third-party resources available for researchers. We applied the framework to compare several implementations of statistical classifiers: HunPos, Stanford POS tagger, OpenNLP implementation of MaxEnt Markov Model, and our own reimplementation of Tiered Conditional Random Fields. The best tagger that was trained on a corpus with less than one million words achieved an accuracy above 93%. We expect that the evaluation framework will facilitate future studies and improvements on POS tagging for Russian.
dc.relation.ispartofseries Communications in Computer and Information Science
dc.title A comparative evaluation of statistical part-of-speech taggers for Russian
dc.type Conference Paper
dc.relation.ispartofseries-volume 505
dc.collection Publications of KFU staff
dc.relation.startpage 263
dc.source.id SCOPUS18650929-2015-505-SID84951851336


This item appears in the following Collection(s)

  • Publications of KFU staff, Scopus [24551]
    The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.
