Computing syntactic parameters for automated text complexity assessment

Solovyev V.; Solnyshkina M.; Ivanov V.; Rygaev I.

URI: https://dspace.kpfu.ru/xmlui/handle/net/156062

Date: 2019

Abstract:

Copyright © 2019 for this paper by its authors. The article focuses on identifying, extracting and evaluating syntactic parameters that influence the complexity of Russian academic texts. Our ultimate goal is to select a set of text features that effectively measure text complexity and to build an automatic tool able to rank Russian academic texts according to grade levels. We build models based on the most promising features by using machine learning methods. The innovative algorithm for designing a predictive model of text complexity is based on a training text corpus and a set of previously proposed and new syntactic features (average sentence length, average number of syllables per word, the number of adjectives, average number of participial constructions, average number of coordinating chains, path number, i.e. average number of sub-trees). Our best model achieves an MSE of 1.15. Our experiments indicate that adding the syntactic features listed above, namely the average number of participial constructions, the average number of coordinating chains, and the average number of sub-trees, substantially improves the performance of the text complexity model.
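
To make two of the surface features and the reported metric concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the naive sentence splitter and the vowel-counting syllable heuristic are all assumptions made for illustration. The syntactic features named in the abstract (participial constructions, coordinating chains, sub-trees) would require a syntactic parser and are not shown.

```python
import re

# Russian vowel letters; each vowel forms one syllable.
RUSSIAN_VOWELS = set("аеёиоуыэюя")


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extract_words(sentence):
    """Return runs of Cyrillic letters; digits and punctuation are ignored."""
    return re.findall(r"[а-яёА-ЯЁ]+", sentence)


def count_syllables(word):
    """Approximate the syllable count of a Russian word by counting vowel letters."""
    return sum(1 for ch in word.lower() if ch in RUSSIAN_VOWELS)


def surface_features(text):
    """Compute average sentence length (in words) and average syllables per word."""
    sentences = split_sentences(text)
    tokens = [w for s in sentences for w in extract_words(s)]
    if not sentences or not tokens:
        raise ValueError("text contains no sentences or words")
    return {
        "avg_sentence_length": len(tokens) / len(sentences),
        "avg_syllables_per_word": sum(count_syllables(w) for w in tokens) / len(tokens),
    }


def mean_squared_error(true_grades, predicted_grades):
    """MSE: the mean of squared differences between true and predicted grade levels."""
    if not true_grades or len(true_grades) != len(predicted_grades):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(true_grades, predicted_grades)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    print(surface_features("Мама мыла раму. Кот спит."))
    print(mean_squared_error([2, 4], [3, 6]))
```

In a pipeline of the kind the abstract describes, feature values like these would be computed per text and passed to a regression model, and the MSE would be measured between predicted and actual grade levels on held-out texts.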

Files in this item

Name: SCOPUS16130073-20 ...

Size: 47.95Kb

Format: PDF

This item appears in the following Collection(s)

KFU staff publications Scopus [24551]
The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.