Brief History
In my experience, a lot of time goes into cleaning data, especially for NLP problems. So I decided to build a set of generalized data cleaning mechanisms: you can plug the methods in directly for preprocessing before building various models, or select only the methods you need, customize them, and use them in your own code.
Cleaning Mechanisms
The mechanisms included in cleaningmechanism.py:
- Word/Phrase Embeddings Cleaner
- Named Entity Recognition Cleaner
- Text Summarization Cleaner
- Text Generation Cleaner
Future Scope
Pending topics and methods to cover:
- Spelling Correction
- Join Split Words
- Split Joined Words
- Grammar Correction
- Normalize Slang Words
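
As a rough idea of what one of these pending methods might look like, here is a minimal sketch of slang normalization using a lookup table. Nothing here is part of cleaningmechanism.py yet: the function name `normalize_slang` and the `SLANG_MAP` entries are hypothetical and only meant for illustration.

```python
import re

# Hypothetical slang lookup table; a real one would be much larger
# and probably loaded from a file.
SLANG_MAP = {
    "u": "you",
    "gr8": "great",
    "thx": "thanks",
    "idk": "i do not know",
}


def normalize_slang(text, slang_map=SLANG_MAP):
    """Replace whole-word slang tokens with their expanded form."""

    def replace(match):
        word = match.group(0)
        # Look up the lowercased token; keep the original word if unknown.
        return slang_map.get(word.lower(), word)

    # Match runs of letters, digits and apostrophes so that punctuation
    # and whitespace around each token are left untouched.
    return re.sub(r"[A-Za-z0-9']+", replace, text)


print(normalize_slang("thx, u are gr8"))
```

Matching whole tokens rather than substrings keeps a short key like "u" from rewriting letters inside ordinary words such as "queue".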