Aggressor: Ultra-minimal autoregressive diffusion model for image and speech generation

Samples: CIFAR, MNIST, and AUDIO (wav_aggressor.mp4)

Key Features

Simple Architecture: A tiny transformer for autoregression and an MLP for diffusion.
Minimal Dependencies: Built from scratch using only basic MLX operations.
Single-File Implementation: The entire model lives in one Python file, aggressor.py.
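
To make the split between the two components concrete, here is a rough sketch of how an autoregressive context can condition a per-token diffusion sampler. It is not the code in aggressor.py: it uses plain Python rather than MLX so it runs anywhere, and it assumes a standard DDPM-style noise-prediction update with a linear beta schedule, which may differ from what the repository does. All names (`make_schedule`, `denoise_token`, `generate`, `toy_context`, `make_oracle_denoiser`) are hypothetical. The transformer is replaced by a toy function that encodes only the token position, and the MLP is replaced by an oracle that already knows the clean token equals the context.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus the running product of alpha_t = 1 - beta_t."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def denoise_token(predict_noise, context, dim, betas, alpha_bars, rng):
    """Reverse diffusion for a single token, conditioned on `context`."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, t, context)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        scale = 1.0 / math.sqrt(1.0 - betas[t])
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0  # no noise on the final step
        x = [scale * (xi - coef * ei) + sigma * rng.gauss(0.0, 1.0)
             for xi, ei in zip(x, eps_hat)]
    return x


def generate(context_fn, predict_noise, num_tokens, dim, num_steps=50, seed=0):
    """Autoregressive loop: each new token is sampled given the previous ones."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    tokens = []
    for _ in range(num_tokens):
        context = context_fn(tokens)  # stand-in for the tiny transformer
        tokens.append(
            denoise_token(predict_noise, context, dim, betas, alpha_bars, rng))
    return tokens


def toy_context(previous_tokens, dim=2):
    """Hypothetical stand-in for the transformer: encodes only the position."""
    return [float(len(previous_tokens))] * dim


def make_oracle_denoiser(num_steps):
    """Hypothetical stand-in for the MLP: assumes clean token == context."""
    _, alpha_bars = make_schedule(num_steps)

    def predict_noise(x_t, t, context):
        a = alpha_bars[t]
        return [(xi - math.sqrt(a) * ci) / math.sqrt(1.0 - a)
                for xi, ci in zip(x_t, context)]

    return predict_noise


tokens = generate(toy_context, make_oracle_denoiser(50),
                  num_tokens=3, dim=2, num_steps=50)
print([[round(v, 3) for v in tok] for tok in tokens])
```

Because the oracle denoiser is exact, each sampled token ends up at its context value (position 0, 1, 2), which is a convenient sanity check on the update rule. In a trained model, `toy_context` and the oracle would be replaced by the learned transformer and MLP.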

python aggressor.py

(Training on 60,000 images for 20 epochs takes approximately 7 to 8 minutes on an 8GB M2 MacBook.)
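
For a rough sense of what that timing implies, the throughput can be worked out from the stated numbers. This assumes every image is seen once per epoch and that the 7 to 8 minutes covers the whole run; the helper name `images_per_second` is only illustrative.

```python
images_per_epoch = 60_000
epochs = 20
total_images = images_per_epoch * epochs  # 1,200,000 images processed overall


def images_per_second(minutes):
    """Average throughput if the whole run takes `minutes` of wall-clock time."""
    return total_images / (minutes * 60)


low = images_per_second(8)   # slower end of the quoted range
high = images_per_second(7)  # faster end of the quoted range
print(f"{low:.0f} to {high:.0f} images/s")  # prints "2500 to 2857 images/s"
```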

Thanks to lucidrains' fantastic code that inspired this project. The official implementation is available here.
