dc.contributor.author |
Gareev R. |
|
dc.contributor.author |
Ivanov V. |
|
dc.date.accessioned |
2018-09-18T20:27:35Z |
|
dc.date.available |
2018-09-18T20:27:35Z |
|
dc.date.issued |
2015 |
|
dc.identifier.issn |
1865-0929 |
|
dc.identifier.uri |
https://dspace.kpfu.ru/xmlui/handle/net/140105 |
|
dc.description.abstract |
© Springer International Publishing Switzerland 2015. Part-of-speech (POS) tagging is an essential step in many text processing applications. Quite a few works focus on solving this task for Russian, but their results are not directly comparable due to the lack of shared datasets and tools. We propose a POS tagging evaluation framework for Russian that comprises existing third-party resources available to researchers. We applied the framework to compare several implementations of statistical classifiers: HunPos, the Stanford POS tagger, the OpenNLP implementation of the MaxEnt Markov Model, and our own reimplementation of Tiered Conditional Random Fields. The best tagger, trained on a corpus of fewer than one million words, achieved an accuracy above 93%. We expect that the evaluation framework will facilitate future studies and improvements in POS tagging for Russian. |
|
dc.relation.ispartofseries |
Communications in Computer and Information Science |
|
dc.title |
A comparative evaluation of statistical part-of-speech taggers for Russian |
|
dc.type |
Conference Paper |
|
dc.relation.ispartofseries-volume |
505 |
|
dc.collection |
KFU staff publications |
|
dc.relation.startpage |
263 |
|
dc.source.id |
SCOPUS18650929-2015-505-SID84951851336 |
|