Calculation of a confidence interval of semantic distance estimates obtained using a large diachronic corpus

Shevlyakova A.V.; Bochkarev V.V.

URI: https://dspace.kpfu.ru/xmlui/handle/net/169555

Date: 2021

Abstract:

Several methods have been suggested for detecting changes in word semantics and the appearance of new word meanings. These methods use different techniques for estimating the semantic distance between words. They are based both on neural network vector models and on simpler vector representations that use the frequencies of n-grams containing the studied words. This paper proposes a method for calculating the confidence interval of semantic distance estimates obtained from the frequency data of n-grams extracted from a large diachronic corpus. The task is complicated because, despite a number of studies, the distribution law of frequency fluctuations of words and n-grams remains an open question. The confidence intervals are calculated by statistical modeling using random permutations of n-gram frequencies. To test the proposed method, the estimation of the semantic distance between two Russian synonyms is used as an example.
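
The abstract does not spell out the permutation procedure, so the following is only a minimal sketch of one way such an interval might be obtained, not the authors' method. It assumes that each word is described by a matrix of n-gram frequencies (rows are years within a time window, columns are context n-grams), that cosine distance serves as the semantic distance, and that surrogate frequency vectors are produced by permuting the years independently for each n-gram. The names `cosine_distance` and `permutation_ci` and all parameters are hypothetical.

```python
import numpy as np


def cosine_distance(u, v):
    """Cosine distance between two non-zero frequency vectors."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def permutation_ci(freq_a, freq_b, n_trials=2000, level=0.95, seed=0):
    """Sketch of a permutation-based interval for a semantic distance.

    freq_a, freq_b: arrays of shape (years, contexts) holding n-gram
    frequencies for two words (an assumed data layout).
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = np.random.default_rng(seed)
    freq_a = np.asarray(freq_a, dtype=float)
    freq_b = np.asarray(freq_b, dtype=float)

    # Point estimate from frequencies averaged over the window.
    point = cosine_distance(freq_a.mean(axis=0), freq_b.mean(axis=0))

    samples = np.empty(n_trials)
    for i in range(n_trials):
        # Assumption: shuffle the years independently in every column,
        # then read off the first row as one surrogate frequency vector.
        surrogate_a = rng.permuted(freq_a, axis=0)[0]
        surrogate_b = rng.permuted(freq_b, axis=0)[0]
        samples[i] = cosine_distance(surrogate_a, surrogate_b)

    # Empirical quantiles of the surrogate distances give the interval.
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(samples, [alpha, 1.0 - alpha])
    return point, float(lower), float(upper)
```

Because the interval is read from empirical quantiles of the surrogate distances, this sketch makes no assumption about the distribution law of the frequency fluctuations, which the abstract describes as an open question.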

Files in this item

Name: SCOPUS17426588-20 ...

Size: 46.13Kb

Format: PDF

This item appears in the following collections

KFU staff publications, Scopus [24551]
The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database since 1970.