Performed frame-level classification of speech to identify 40 types of phonemes from their utterances.
Data can be found here: https://www.kaggle.com/competitions/11-785-s22-hw1p2
The data provided consists of mel spectrograms of utterances, with a phoneme label for each 13-dimensional vector (frame) in a mel spectrogram. Predicted the label of each 13-dimensional vector in an utterance.
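
A minimal sketch of how one utterance could be shaped for frame-level prediction. The 13 features per frame and the 40 phoneme classes come from the description above; everything else is an assumption for illustration. The helper `add_context`, the context size of 2, and the random stand-in data are hypothetical, and stacking neighbouring frames is a common choice for this kind of task rather than something this project is stated to use.

```python
import numpy as np

NUM_FEATURES = 13   # features per frame, as described above
NUM_PHONEMES = 40   # phoneme classes, as described above


def add_context(utterance, context=2):
    """Stack each frame with `context` neighbours on both sides.

    utterance: array of shape (num_frames, num_features).
    Returns an array of shape (num_frames, (2 * context + 1) * num_features).
    Frames beyond the utterance boundaries are zero-padded.
    """
    num_frames = utterance.shape[0]
    # Pad only along the time axis, with zeros.
    padded = np.pad(utterance, ((context, context), (0, 0)))
    # Slice i holds, for every frame t, the frame at offset (i - context).
    windows = [padded[i:i + num_frames] for i in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


# Random stand-in data (hypothetical, not the Kaggle data).
rng = np.random.default_rng(0)
utterance = rng.normal(size=(100, NUM_FEATURES))
labels = rng.integers(0, NUM_PHONEMES, size=100)  # one label per frame

inputs = add_context(utterance, context=2)
print(inputs.shape, labels.shape)
```

Each row of `inputs` pairs with one entry of `labels`, so any per-frame classifier with 40 output classes can be trained on these pairs.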