Fastaudio

FastAudio is a learnable audio front-end that team Magnum designed for the ASVspoof 2021 challenge. It was developed using the SpeechBrain framework. The solution was produced by Quchen Fu and Zhongwei Teng, researchers in the Magnum Research Group at Vanderbilt University. The Magnum Research Group is part of the Institute for Software Integrated Systems.

The ASVspoof 2021 competition challenges teams to develop countermeasures capable of discriminating between bona fide and spoofed or deepfake speech. The model achieved a min t-DCF score of 0.2531 in the LA track on the open leaderboard.

Requirements

speechbrain==0.5.7
pandas
wandb
torch==1.8.0+cu111
torchaudio==0.8.0
nnAudio==0.2.6

How it works

Environment

Create a virtual environment with Python 3.8 (for example, with virtualenv).
git clone --recursive https://github.com/QuchenFu/Fastaudio
Run pip install -r requirements.txt to install the requirements.
Run cd leaf-audio-pytorch/ and then pip install -e .
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
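
After installation, it can help to confirm that the installed packages match the pins listed under Requirements. The following is a minimal sketch, not part of the repository: the names PINNED, installed_version and find_mismatches are hypothetical, and it only uses the standard-library importlib.metadata module, so it assumes the packages were installed with pip into the active environment.

```python
from importlib import metadata

# Pins copied from the Requirements section of this README.
PINNED = {
    "speechbrain": "0.5.7",
    "torch": "1.8.0+cu111",
    "torchaudio": "0.8.0",
    "nnAudio": "0.2.6",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a tuple (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is present at the expected version. The unpinned packages (pandas, wandb) are deliberately left out of the check.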

Data pre-processing

.
├── data                       
│   │
│   ├── PA                  
│   │   └── ...
│   └── LA           
│       ├── ASVspoof2019_LA_asv_protocols
│       ├── ASVspoof2019_LA_asv_scores
│       ├── ASVspoof2019_LA_cm_protocols
│       ├── ASVspoof2019_LA_train
│       ├── ASVspoof2019_LA_dev
│       └── ASVspoof2021_LA_eval
│
└── Fastaudio

Download the data here
Unzip the data and save it to a folder named data in the same directory as Fastaudio, as shown in the tree above.
python3.8 preprocess.py
In preprocess.py, change args['data_type'] = ['labeled','unlabeled'][1] to args['data_type'] = ['labeled','unlabeled'][0], then run the script again:
python3.8 preprocess.py
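
Before running preprocess.py, the folder layout shown in the tree above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper missing_la_dirs and the constant EXPECTED_LA_DIRS are hypothetical names, only the LA folders listed in the tree are checked, and the script is assumed to be run from inside the Fastaudio directory so that data sits one level up.

```python
from pathlib import Path

# Folder names copied from the directory tree above (LA track only).
EXPECTED_LA_DIRS = [
    "ASVspoof2019_LA_asv_protocols",
    "ASVspoof2019_LA_asv_scores",
    "ASVspoof2019_LA_cm_protocols",
    "ASVspoof2019_LA_train",
    "ASVspoof2019_LA_dev",
    "ASVspoof2021_LA_eval",
]


def missing_la_dirs(data_root):
    """Return the expected LA sub-folders that do not exist under data_root/LA."""
    la_root = Path(data_root) / "LA"
    return [name for name in EXPECTED_LA_DIRS if not (la_root / name).is_dir()]


if __name__ == "__main__":
    # Assumption: run from inside Fastaudio, so the data folder is a sibling.
    missing = missing_la_dirs(Path("..") / "data")
    if missing:
        print("Missing folders under data/LA:", ", ".join(missing))
    else:
        print("data/LA layout matches the tree above.")
```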

Train

python3.8 train_spoofspeech.py yaml/SpoofSpeechClassifier.yaml --data_parallel_backend --data_parallel_count=2

Inference

Set TRAIN to False in train_spoofspeech.py, then run the same command as for training:
python3.8 train_spoofspeech.py yaml/SpoofSpeechClassifier.yaml --data_parallel_backend --data_parallel_count=2

Evaluate

python3.8 eval.py

Metrics

Accuracy metric

$$\text{min t-DCF} = \min_{s} \left\{ \beta P_{\text{miss}}^{\text{cm}}(s) + P_{\text{fa}}^{\text{cm}}(s) \right\}$$
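
To make the metric concrete, the sketch below evaluates the expression above by sweeping the countermeasure threshold s over the observed scores. It is illustrative only and is not the repository's eval.py or the official ASVspoof scoring code: the function name min_tdcf is hypothetical, higher scores are assumed to mean "more likely bona fide", and beta is passed in as a plain number, whereas in the challenge it is derived from the cost parameters and the error rates of the ASV system.

```python
def min_tdcf(bonafide_scores, spoof_scores, beta):
    """Minimum over thresholds s of beta * P_miss(s) + P_fa(s).

    Assumes higher scores mean "more likely bona fide".
    """
    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is included.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best = float("inf")
    for s in thresholds:
        # Miss: a bona fide trial is rejected (score below the threshold).
        p_miss = sum(x < s for x in bonafide_scores) / len(bonafide_scores)
        # False alarm: a spoofed trial is accepted (score at or above it).
        p_fa = sum(x >= s for x in spoof_scores) / len(spoof_scores)
        best = min(best, beta * p_miss + p_fa)
    return best
```

With perfectly separated scores the minimum is 0, because some threshold rejects every spoofed trial without rejecting any bona fide one. For reported results, the official challenge scoring tools should be used.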

Reference

If you use this repository, please consider citing:

@inproceedings{Fu2021FastAudioAL,
  title={FastAudio: A Learnable Audio Front-End for Spoof Speech Detection},
  author={Quchen Fu and Zhongwei Teng and Jules White and M. Powell and Douglas C. Schmidt},
  booktitle={2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{Teng2021ComplementingHF,
  title={Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model},
  author={Zhongwei Teng and Quchen Fu and Jules White and M. Powell and Douglas C. Schmidt},
  year={2021}
}
