Thesaurus-Based Methods for Assessment of Text Complexity in Russian

dc.contributor.author Solovyev V.
dc.contributor.author Ivanov V.
dc.contributor.author Solnyshkina M.
dc.date.issued 2020
dc.identifier.issn 0302-9743
dc.description.abstract © 2020, Springer Nature Switzerland AG. The study explores the problem of assessing the complexity of Russian educational texts. In this paper, we focus on measuring conceptual complexity, which is rarely addressed as a research question, and propose to use a thesaurus (or a linguistic ontology) to this end. We also compiled an original corpus of school textbooks on Social Studies and History used in high school, and textbooks for elementary school, specifically for this set of text complexity experiments. At the first stage of the research, the RuThes-Lite thesaurus, a linguistic knowledge base with a total size of 100,000 concepts, was used to elicit concepts in the texts of schoolbooks and represent them as graphs. To the best of our knowledge, we are the first to propose a new method for text complexity assessment using RuThes-Lite graphs and to identify graph-based semantic characteristics of texts that impact complexity. The most significant findings of the research include the identification of statistically significant correlations between the selected features, such as node degree, and the complexity of educational texts.
dc.relation.ispartofseries Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.subject Russian language
dc.subject Text complexity
dc.subject Thesaurus
dc.title Thesaurus-Based Methods for Assessment of Text Complexity in Russian
dc.type Conference Paper
dc.relation.ispartofseries-volume 12469 LNAI
dc.relation.startpage 152
dc.source.id SCOPUS03029743-2020-12469-SID85092920549

