Electronic archive

Generating training data for word sense disambiguation in Russian

dc.contributor.author Bolshina A.S.
dc.contributor.author Loukachevitch N.V.
dc.date.accessioned 2021-02-25T06:56:04Z
dc.date.available 2021-02-25T06:56:04Z
dc.date.issued 2020
dc.identifier.issn 2221-7932
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/161572
dc.description.abstract © 2020 ABBYY PRODUCTION LLC. All rights reserved. The best approaches in Word Sense Disambiguation (WSD) are supervised and rely on large amounts of hand-labelled data, which are costly to create and not always available. For Russian, there is no sense-tagged resource large enough to train supervised word sense disambiguation algorithms. In this work we describe an approach for creating an automatically labelled collection based on monosemous relatives (related unambiguous entries). The main contribution of our work is that we extract monosemous relatives that can be located at relatively long distances from a target ambiguous word and rank them by their similarity to the target sense. The selected candidates are then used to extract training samples from a news corpus. We evaluate word sense disambiguation models based on nearest neighbor classification over BERT and ELMo embeddings. Our work relies on the Russian wordnet RuWordNet.
dc.relation.ispartofseries Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
dc.subject Automatic Dataset Collection
dc.subject BERT
dc.subject ELMo
dc.subject Monosemous relatives
dc.subject Russian dataset
dc.subject Word sense disambiguation
dc.title Generating training data for word sense disambiguation in Russian
dc.type Conference Paper
dc.relation.ispartofseries-issue 19
dc.relation.ispartofseries-volume 2020-June
dc.collection KFU staff publications
dc.relation.startpage 119
dc.source.id SCOPUS22217932-2020-2020-19-SID85093820211
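
The abstract mentions evaluating models based on nearest neighbor classification over contextual embeddings. The following is a minimal sketch of that general idea, not the authors' implementation: the short toy vectors stand in for BERT or ELMo embeddings, and the names `cosine`, `nearest_sense`, `examples`, and the sense labels are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def nearest_sense(target_vec, labelled_examples, k=1):
    """Pick the majority sense among the k most similar labelled examples.

    labelled_examples is a list of (vector, sense_label) pairs; in the
    setting described by the abstract these would come from automatically
    labelled contexts, here they are toy data (an assumption).
    """
    ranked = sorted(
        labelled_examples,
        key=lambda ex: cosine(target_vec, ex[0]),
        reverse=True,
    )
    votes = {}
    for _, sense in ranked[:k]:
        votes[sense] = votes.get(sense, 0) + 1
    # Ties go to the sense whose example ranked highest (dict insertion order).
    return max(votes, key=votes.get)


# Toy "training" contexts for one ambiguous word with two hypothetical senses.
examples = [
    ([0.9, 0.1, 0.0], "finance"),
    ([0.8, 0.2, 0.1], "finance"),
    ([0.1, 0.9, 0.2], "river"),
    ([0.0, 0.8, 0.3], "river"),
]

print(nearest_sense([0.7, 0.3, 0.0], examples))
print(nearest_sense([0.1, 0.7, 0.3], examples, k=3))
```

In a real setup the vectors would be produced by an embedding model applied to the sentence containing the target word; that step is omitted here because the record does not describe it.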


This item appears in the following collections

  • KFU staff publications, Scopus [24551]
    The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.
