dc.contributor.author |
Ivanov V. |
|
dc.contributor.author |
Solovyev V. |
|
dc.date.accessioned |
2021-02-25T20:37:42Z |
|
dc.date.available |
2021-02-25T20:37:42Z |
|
dc.date.issued |
2020 |
|
dc.identifier.issn |
1064-1246 |
|
dc.identifier.uri |
https://dspace.kpfu.ru/xmlui/handle/net/162083 |
|
dc.description.abstract |
© 2020 - IOS Press and the authors. All rights reserved. Creating dictionaries of abstract and concrete words is a well-known task. Such dictionaries are important in several applications of text analysis and computational linguistics. Usually, assembling concreteness scores for words begins with a lot of manual work. However, the process can be automated significantly using information from large corpora. In this paper we combine two datasets, a dictionary with concreteness scores for 40,000 English words and the Google Books Ngram dataset, in order to test the following hypothesis: in text, concrete words tend to co-occur with concrete words more often than with abstract words (and conversely, abstract words tend to co-occur with abstract words more often than with concrete words). Based on this hypothesis, we propose a method for automatically estimating the concreteness scores of words from a small amount of initial markup. |
|
dc.relation.ispartofseries |
Journal of Intelligent and Fuzzy Systems |
|
dc.subject |
bigrams |
|
dc.subject |
Concreteness of words |
|
dc.subject |
dictionary |
|
dc.title |
Ranking concrete and abstract words using Google Books Ngram data |
|
dc.type |
Article |
|
dc.relation.ispartofseries-issue |
2 |
|
dc.relation.ispartofseries-volume |
39 |
|
dc.collection |
Publications of KFU staff |
|
dc.relation.startpage |
2229 |
|
dc.source.id |
SCOPUS10641246-2020-39-2-SID85091090630 |
|