This repository hosts two text mining projects:
- Text Recommender: Enhances user experience by using the Natural Language Toolkit (NLTK) to correct spelling mistakes and suggest accurate words, mimicking the 'Did you mean...?' feature on search platforms.
- Message Spam Detection: Employs several machine learning models to classify messages as spam or not spam, aiming to improve filtering accuracy and user experience.
Key features:
- Spelling Correction: Employs phonetic matching and distance metrics for spelling suggestions.
- Integration Capability: Designed for integration into various platforms to enhance functionality.
- Spam Detection Models: Uses Naive Bayes, Logistic Regression, and SVM for spam identification.
- Advanced Feature Engineering: Combines Count Vectorization and TF-IDF Vectorization with additional features such as document length, digit count, and non-word character count.
- Model Evaluation: Utilizes ROC-AUC scores for evaluating model performance.
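As a rough illustration of the distance-metric side of the spelling suggestions, here is a minimal sketch using NLTK's `edit_distance` and `jaccard_distance`. It is not the repository's actual code: the `VOCABULARY` list, the function names, and the same-first-letter candidate filter are assumptions made for the example, and phonetic matching is not covered.

```python
from nltk.metrics.distance import edit_distance, jaccard_distance
from nltk.util import ngrams

# Hypothetical vocabulary for illustration only; a real run would more
# likely use a full word list (for example, one loaded from an NLTK corpus).
VOCABULARY = ["correct", "corpus", "course", "spelling", "special", "suggest", "message"]


def char_ngrams(word, n=2):
    """Return the set of character n-grams of a word."""
    return set(ngrams(word, n))


def candidates_for(word, vocabulary):
    """Assumed heuristic: prefer words sharing the first letter, else use all words."""
    same_initial = [w for w in vocabulary if w[0] == word[0]]
    return same_initial or list(vocabulary)


def suggest_by_jaccard(word, vocabulary=VOCABULARY, n=2):
    """Suggest the candidate with the smallest Jaccard distance on character n-grams."""
    target = char_ngrams(word, n)
    return min(
        candidates_for(word, vocabulary),
        key=lambda w: jaccard_distance(target, char_ngrams(w, n)),
    )


def suggest_by_edit_distance(word, vocabulary=VOCABULARY):
    """Suggest the candidate with the smallest edit distance (transpositions allowed)."""
    return min(
        candidates_for(word, vocabulary),
        key=lambda w: edit_distance(word, w, transpositions=True),
    )


if __name__ == "__main__":
    for typo in ["speling", "corect"]:
        print(typo, "->", suggest_by_jaccard(typo), "/", suggest_by_edit_distance(typo))
```

Both functions assume non-empty input words of at least `n` characters. The two metrics can disagree on harder typos, so comparing them side by side is one way to decide which suits a given platform.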
Technologies used: NLTK, scikit-learn, pandas, NumPy, CountVectorizer, TfidfVectorizer
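To show how the spam detection pieces might fit together, the sketch below combines TF-IDF features with document length, digit count, and non-word character count, trains a Logistic Regression model, and reports ROC-AUC. The tiny dataset and the helper names (`extra_features`, `build_matrix`) are made up for illustration and are not taken from the repository; Naive Bayes or an SVM could be swapped in for the classifier.

```python
import re

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Made-up messages for illustration only (1 = spam, 0 = not spam).
train_texts = [
    "WIN a FREE prize now!!! Call 08001234567",
    "URGENT: claim your 1000 cash reward today!!!",
    "Free entry to win tickets, text WIN to 80082",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the notes from class",
    "I'll be home late, see you tonight",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = [
    "Claim your FREE reward now, call 09061234567!!!",
    "See you at lunch tomorrow",
]
test_labels = [1, 0]


def extra_features(texts):
    """Per message: document length, digit count, non-word character count."""
    return np.array(
        [
            [len(t), sum(ch.isdigit() for ch in t), len(re.findall(r"\W", t))]
            for t in texts
        ],
        dtype=float,
    )


def build_matrix(vectorizer, texts, fit=False):
    """Stack the TF-IDF matrix with the three hand-crafted feature columns."""
    tfidf = vectorizer.fit_transform(texts) if fit else vectorizer.transform(texts)
    return hstack([tfidf, csr_matrix(extra_features(texts))]).tocsr()


vectorizer = TfidfVectorizer()
X_train = build_matrix(vectorizer, train_texts, fit=True)
X_test = build_matrix(vectorizer, test_texts)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, train_labels)

# ROC-AUC is computed from predicted spam probabilities, not hard labels.
scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(test_labels, scores)
print(f"ROC-AUC: {auc:.3f}")
```

With a dataset this small the score is not meaningful; the point is only the shape of the pipeline. The vectorizer is fitted on the training messages alone, so vocabulary from the test messages does not leak into training.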
Thank you for reviewing this repository. I appreciate your feedback!