Stutter Classification

This project develops an approach to identifying and classifying speech dysfluencies. The approach combines MFCC audio features with Decision Tree and K-Nearest Neighbors classifiers. After trying several approaches, it can be concluded that the SEP-28k dataset can be used to classify speech dysfluencies separately into Word Repetition, Sound Repetition and Prolongation. Not every row of the SEP-28k dataset is useful for every dysfluency, so the rows needed for each dysfluency were identified and the resulting data was uploaded to Kaggle : SEP-28k MFCC.

Results :

After using various methods to train the models, the following accuracies were achieved :

Dysfluency	Accuracy	Model
SoundRep	89.53	Decision Tree
WordRep	86.11	Decision Tree
Prolongation	66.83	K-Nearest Neighbors

Approach Used :

After trying several approaches, including sampling techniques, the following steps should be followed when training stutter recognition and detection systems on the SEP-28k dataset :

Remove empty audio files and their corresponding entries from the dataset
Remove rows flagged for poor audio quality or background music, and rows that are in general difficult to understand
Remove the remaining rows in which even one clinician labelled the clip as some other dysfluency
Treat labels 1, 2 and 3 as 1, so that the multi-class classification becomes a binary classification
Split each input file into short clips of 3 seconds, run the model on each clip, and analyse the percentage of clips with a particular dysfluency
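
The row-filtering and label-merging steps above can be sketched in plain Python as follows. This is only an illustration, not the notebook's code. The column names are assumed to match the SEP-28k label file, the function name prepare_rows is hypothetical, a quality flag is assumed to disqualify a clip as soon as one clinician sets it, and "some other dysfluency" is read as any dysfluency column other than the one being modelled.

```python
# Column names are assumptions based on the SEP-28k label file; adjust as needed.
QUALITY_FLAGS = ("PoorAudioQuality", "Music", "DifficultToUnderstand")
DYSFLUENCIES = ("Prolongation", "Block", "SoundRep", "WordRep", "Interjection")


def prepare_rows(rows, target):
    """Filter annotation rows and binarize the label for one dysfluency.

    rows   -- iterable of dicts mapping a column name to a clinician count (0-3)
    target -- the dysfluency being modelled, e.g. "SoundRep"
    """
    prepared = []
    for row in rows:
        # Drop clips that any clinician flagged for a quality problem
        # (the threshold of one clinician is an assumption).
        if any(int(row[flag]) > 0 for flag in QUALITY_FLAGS):
            continue
        # Drop clips that any clinician labelled with a dysfluency other
        # than the target (one possible reading of the step above).
        if any(int(row[d]) > 0 for d in DYSFLUENCIES if d != target):
            continue
        # Counts of 1, 2 and 3 all become the positive class.
        label = 1 if int(row[target]) >= 1 else 0
        prepared.append({**row, "label": label})
    return prepared
```

Calling prepare_rows once per dysfluency gives a separate binary dataset for SoundRep, WordRep and Prolongation, which matches the one-model-per-dysfluency setup described in the results.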

Features and Benefits :

Detection and classification of speech dysfluencies like Word Repetitions, Sound Repetitions and Prolongations
The preprocessed dataset, along with the corresponding MFCC features, is uploaded to Kaggle : SEP-28k MFCC. Hence, there is no need to preprocess the dataset before using it.
Severity of stuttering is judged by the number of clips with stuttering rather than by the number of clinicians who labelled a clip as a particular dysfluency
Anyone can use the trained models to classify speech dysfluencies.
The models use binary classification, so it is easy for anyone to see whether or not the input has a particular dysfluency.
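
The clip-based severity measure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, predict stands in for whatever wraps feature extraction and a trained binary model, and a trailing remainder shorter than one clip is simply dropped.

```python
def split_into_clips(samples, sample_rate, clip_seconds=3.0):
    """Cut a sequence of audio samples into consecutive fixed-length clips.

    A trailing remainder shorter than one clip is dropped (an assumption).
    """
    clip_len = int(sample_rate * clip_seconds)
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def dysfluent_percentage(clips, predict):
    """Percentage of clips for which the binary classifier returns 1.

    predict -- hypothetical callable taking one clip and returning 0 or 1
    """
    if not clips:
        return 0.0
    positives = sum(1 for clip in clips if predict(clip) == 1)
    return 100.0 * positives / len(clips)
```

For a recording loaded as a sample array at a known sample rate, split_into_clips yields the 3-second clips and dysfluent_percentage reports the share of clips the model marks as dysfluent.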

mitul-garg/stutter-classification