Аннотации:
© Springer Nature Switzerland AG 2018. Medical Concept Coding (MCD) is a crucial task in biomedical information extraction. Recent advances in neural network modeling have demonstrated its usefulness in the task of natural language processing. Modern framework of sequence-to-sequence learning that was initially used for recurrent neural networks has been shown to provide powerful solution to tasks such as Named Entity Recognition or Medical Concept Coding. We have addressed the identification of clinical concepts within the International Classification of Diseases version 10 (ICD-10) in two benchmark data sets of death certificates provided for the task 1 in the CLEF eHealth shared task 2017. A proposed architecture combines ideas from recurrent neural networks and traditional text retrieval term weighting schemes. We found that our models reach accuracy of 75% and 86% as evaluated by the F-measure on the CépiDc corpus of French texts and on the CDC corpus of English texts, respectfully. The proposed models can be employed for coding electronic medical records with ICD codes including diagnosis and procedure codes.