Commit
Update data pipeline for creating dictionary
woodthom2 committed Aug 7, 2024
1 parent eed3358 commit 3391ea6
Showing 8 changed files with 1 addition and 1 deletion.
@@ -195,7 +195,7 @@ def get_brand_names_nhs(description: str):

print("Finding all drugs that are also in the NLTK list of English words.")

-all_english_vocab = set(words.words())
+all_english_vocab = set([w.lower() for w in words.words()])

words_to_check_with_ai = set()
for word in list(drug_variant_to_canonical):
Binary file modified src/drug_named_entity_recognition/drug_ner_dictionary.pkl.bz2
Binary file not shown.
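
The one-line change lowercases every entry in the English word list before building the lookup set. The NLTK `words` corpus includes capitalized entries, so a case-sensitive membership test can miss drug names that are stored in lowercase. The sketch below illustrates the effect with small made-up lists. `build_vocab`, `sample_words`, and `sample_drug_names` are hypothetical stand-ins for `words.words()` and the keys of `drug_variant_to_canonical`, and it assumes the drug names are already lowercase, which the diff does not show.

```python
def build_vocab(word_list, lowercase=True):
    """Return a set of words for membership tests, optionally lowercased."""
    if lowercase:
        return {w.lower() for w in word_list}
    return set(word_list)


# Hypothetical stand-in for nltk.corpus.words.words(); note the capitalized entries.
sample_words = ["Aaron", "aspirin", "Mace", "tablet"]

# Hypothetical stand-in for the keys of drug_variant_to_canonical (assumed lowercase).
sample_drug_names = ["aaron", "aspirin", "mace", "ibuprofen"]

vocab_before = build_vocab(sample_words, lowercase=False)  # behaviour before the commit
vocab_after = build_vocab(sample_words, lowercase=True)    # behaviour after the commit

# Names that would be flagged as ordinary English words in each case.
matches_before = sorted(w for w in sample_drug_names if w in vocab_before)
matches_after = sorted(w for w in sample_drug_names if w in vocab_after)

print(matches_before)
print(matches_after)
```

With these sample lists, the case-sensitive set only catches `aspirin`, while the lowercased set also catches `aaron` and `mace`. A set comprehension would be an equivalent alternative to `set([...])` in the committed line.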