
Given a medical diagnosis, this project identifies medical conditions in the text through named entity recognition and linking, and maps them to standardized medical encodings using BERT-based models. Task: https://temu.bsc.es/distemist/


Megha-Bose/Disease-NER


Disease-NER

Given a medical diagnosis, the goal is to identify the medical conditions mentioned in the text and map them to standardized medical encodings.

Data

The data directory contains:

  • entities.tsv, which stores the disease mentions from the text files.
  • The text directory, which holds the text files containing the medical textual data.

The data is taken from the English version of the multilingual resources of the DisTEMIST 2022 task: https://zenodo.org/record/6532684

Pre-processing

The pre-processing stage involves:

  • Splitting the medical text in each file into sentences.
  • Tokenizing the sentences into words/tokens.
  • Calculating IOB tags for the tokens for the named entity recognition (NER) task.

Code: Pre-processing.ipynb
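The three steps above can be sketched in a few lines of Python. This is a minimal illustration and not the code from Pre-processing.ipynb: the regex-based sentence splitter and tokenizer, the function names, the `DISEASE` label, and the assumption that entity mentions are given as character offsets with an exclusive end are all hypothetical choices made for the example.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]          # (start, end) character offsets, end exclusive
Token = Tuple[str, int, int]    # (text, start, end) with document-level offsets


def split_sentences(text: str) -> List[Span]:
    """Naive splitter: a sentence ends at '.', '!' or '?' followed by whitespace."""
    spans, start = [], 0
    for match in re.finditer(r"[.!?]+\s+", text):
        spans.append((start, match.end()))
        start = match.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def tokenize(sentence: str, offset: int = 0) -> List[Token]:
    """Split into word and punctuation tokens, keeping document-level offsets."""
    return [(m.group(), offset + m.start(), offset + m.end())
            for m in re.finditer(r"\w+|[^\w\s]", sentence)]


def iob_tags(tokens: List[Token], entities: List[Span],
             label: str = "DISEASE") -> List[str]:
    """Tag the first token overlapping an entity with B-, later ones with I-."""
    tags, previous = [], None
    for _, tok_start, tok_end in tokens:
        current = None
        for index, (ent_start, ent_end) in enumerate(entities):
            if tok_start < ent_end and tok_end > ent_start:
                current = index
                break
        if current is None:
            tags.append("O")
        else:
            tags.append(("I-" if current == previous else "B-") + label)
        previous = current
    return tags


def preprocess(text: str, entities: List[Span]) -> List[List[Tuple[str, str]]]:
    """Return one list of (token, tag) pairs per sentence."""
    result = []
    for start, end in split_sentences(text):
        tokens = tokenize(text[start:end], offset=start)
        tags = iob_tags(tokens, entities)
        result.append([(tok, tag) for (tok, _, _), tag in zip(tokens, tags)])
    return result


# Made-up example text; the entity offsets are computed from the mention itself.
text = "The patient has type 2 diabetes. No fever was reported."
mention = "type 2 diabetes"
start = text.index(mention)
for sentence in preprocess(text, [(start, start + len(mention))]):
    print(sentence)
```

Offsets are kept relative to the whole document so that mention spans can be compared with tokens directly after sentence splitting. For real clinical text, a dedicated sentence splitter and the tokenizer of the downstream BERT model would likely replace the regular expressions used here.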

NER Task

Entity Linking Task
