Electronic archive

On biomedical named entity recognition: Experiments in interlingual transfer for clinical and social media texts


dc.contributor.author Miftahutdinov Z.
dc.contributor.author Alimova I.
dc.contributor.author Tutubalina E.
dc.date.accessioned 2021-02-25T06:50:57Z
dc.date.available 2021-02-25T06:50:57Z
dc.date.issued 2020
dc.identifier.issn 0302-9743
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/161071
dc.description.abstract © Springer Nature Switzerland AG 2020. Although deep neural networks yield state-of-the-art performance in biomedical named entity recognition (bioNER), much research shares one limitation: models are usually trained and evaluated on English texts from a single domain. In this work, we present a fine-grained evaluation intended to understand the efficiency of multilingual BERT-based models for bioNER of drug and disease mentions across two domains in two languages, namely clinical data and user-generated texts on drug therapy in English and Russian. We investigate the role of transfer learning (TL) strategies between four corpora to reduce the number of examples that have to be manually annotated. Evaluation results demonstrate that multi-BERT shows the best transfer capabilities in the zero-shot setting when training and test sets are either in the same language or in the same domain. TL reduces the amount of labeled data needed to achieve high performance on three out of four corpora: pretrained models reach 98–99% of the full dataset performance on both types of entities after training on 10–25% of sentences. We demonstrate that pretraining on data with one or both types of transfer can be effective.
dc.relation.ispartofseries Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.subject BERT
dc.subject Biomedical entity recognition
dc.subject Transfer learning
dc.title On biomedical named entity recognition: Experiments in interlingual transfer for clinical and social media texts
dc.type Conference Paper
dc.relation.ispartofseries-volume 12036 LNCS
dc.collection Publications of KFU staff
dc.relation.startpage 281
dc.source.id SCOPUS03029743-2020-12036-SID85084182140


This item appears in the following collections

  • Publications of KFU staff, Scopus [24551]
    The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.
