ICD10 Code Extraction from Death Certificates

Abstract

This paper describes the participation of the ECSTRA-INSERM team in CLEF eHealth 2016, task 2.C. The task involves extracting ICD10 codes from death certificates, which consist mainly of short plain-text descriptions. We cast the task as a machine learning problem: predicting the ICD10 code (a categorical variable) from the raw text, transformed into a bag-of-words matrix. We rely on probabilistic topic models, which we evaluate against classical classifiers such as SVM and Naive Bayes. We demonstrate the effectiveness of topic models for this task in terms of both prediction accuracy and interpretability of the results.
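To make the problem framing concrete, here is a minimal sketch of one of the classical baselines named above: a multinomial Naive Bayes classifier over bag-of-words counts, written with the Python standard library only. It is not the topic-model approach of the paper, and it is not the team's implementation. The class name `NaiveBayesCoder`, the whitespace tokenizer, and the six training lines are all assumptions made up for illustration; the ICD10 codes are used only as example labels.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesCoder:
    """Hypothetical helper: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, codes):
        self.class_counts = Counter(codes)       # certificate lines per code
        self.word_counts = defaultdict(Counter)  # word counts per code
        self.vocab = set()
        for text, code in zip(texts, codes):
            tokens = tokenize(text)
            self.word_counts[code].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_code, best_score = None, -math.inf
        for code, n in self.class_counts.items():
            total = sum(self.word_counts[code].values())
            score = math.log(n / n_docs)  # log prior
            for t in tokens:
                # Add-one (Laplace) smoothed log likelihood of the word.
                count = self.word_counts[code][t]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_code, best_score = code, score
        return best_code


# Toy training data, invented for this sketch (not from the CLEF corpus).
train_texts = [
    "acute myocardial infarction",
    "myocardial infarction with cardiac arrest",
    "pneumonia",
    "aspiration pneumonia with sepsis",
    "lung cancer",
    "metastatic lung cancer",
]
train_codes = ["I21.9", "I21.9", "J18.9", "J18.9", "C34.9", "C34.9"]

model = NaiveBayesCoder().fit(train_texts, train_codes)
print(model.predict("myocardial infarction"))
print(model.predict("pneumonia"))
```

Each certificate line is reduced to word counts, so word order is discarded. That is the bag-of-words assumption the abstract refers to, and it is shared by the SVM, Naive Bayes, and topic-model approaches alike.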

Publication
Journal of Source Themes, 1(1)

Namik Taright
Physician in charge of medical information

My work focuses on the use of hospital information to steer hospital activity and its funding.

On the same topic