The DReaM corpus: A multilingual annotated corpus of grammars for the world's languages

Hammarström H.; Virk S.M.; Wichmann S.; Forsberg M.

Virk S.M.; Hammarström H.; Forsberg M.; Wichmann S.

URI: https://dspace.kpfu.ru/xmlui/handle/net/161498

Date: 2020

Abstract:

© European Language Resources Association (ELRA), licensed under CC-BY-NC.

There exist as many as 7000 natural languages in the world, and a huge number of documents describing those languages have been produced over the years. Most of those documents are in paper format. Any attempt to process them with modern computational techniques and tools requires that they first be digitized. In this paper, we report a multilingual digitized version of thousands of such documents, searchable through some well-established corpus infrastructures. The corpus is annotated with various meta-, word-, and text-level attributes to make searching and analysis easier and more useful.

Files in this item

Name: SCOPUS-2020-SID85 ...

Size: 46.85Kb

Format: PDF

This item appears in the following collections

KFU Staff Publications Scopus [24551]
The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.