Releases: geekylink/PicoGPT

A very basic set of scripts to train a GPT model on input.txt text and generate text based on a prompt for the trained model.

These scripts grew out of an attempt to train and run medium-sized GPTs on older hardware. Most existing tools require very recent and expensive GPUs; I wanted something that runs on lower-end hardware, so a larger audience can experiment with training LLMs.

Train

The first thing you need to do is train your model, which takes two steps: first tokenize the input data, then train for however many epochs you need.

Prepare - Tokenize your data

Run train.py with --prepare and --input input.txt to tokenize data and prepare for training epochs.

Creates a directory: out/output.model/

python train.py --prepare --input input.txt [out/output.model]

Optional note: you can pass --model to use any other model provided by https://huggingface.co/models, for example: --model gpt2-xl
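
Conceptually, the prepare step just tokenizes the corpus and saves the token IDs alongside the tokenizer. The sketch below shows that idea; it assumes a Hugging Face transformers tokenizer and an illustrative tokens.pt file name, and is not necessarily what train.py does internally:

```python
# Hypothetical sketch of the --prepare step: tokenize input.txt and save
# the token IDs plus the tokenizer into the output model directory.
# File names and paths are illustrative, not train.py's actual layout.
import os
import torch
from transformers import AutoTokenizer

def prepare(input_path="input.txt", out_dir="out/output.model", model_name="gpt2"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    with open(input_path, encoding="utf-8") as f:
        text = f.read()

    # Encode the whole corpus into a flat 1-D tensor of token IDs.
    ids = tokenizer(text, return_tensors="pt").input_ids[0]

    os.makedirs(out_dir, exist_ok=True)
    torch.save(ids, os.path.join(out_dir, "tokens.pt"))  # hypothetical file name
    tokenizer.save_pretrained(out_dir)
    print(f"Tokenized {len(ids)} tokens into {out_dir}")

if __name__ == "__main__":
    prepare()
```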

Train - Run, run, run...

Train for X epochs using the input model and save to the output model, then train again for more epochs until the output is coherent.
To resume and continue training, pass the same path (out/output.model) as both the input model and the output model.

Note: adjust --batch-size for smaller/larger GPUs; the default is 4.

python train.py --model [out/output.model] --epochs X [out/output.model]
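
For reference, a training run of this kind roughly corresponds to the loop below. This is a hedged sketch rather than the actual train.py: it assumes the token IDs from the prepare step live in tokens.pt, resumes from the output directory when weights already exist there, and otherwise starts from the base gpt2 checkpoint.

```python
# Hypothetical sketch of one training run: fine-tune a causal LM on the
# prepared token IDs for a few epochs, then save so the same directory can
# be reused as both input and output on the next run.
import os
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForCausalLM

def train(model_dir="out/output.model", epochs=1, batch_size=4, block_size=512):
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Resume from the output directory if it already holds model weights,
    # otherwise start from the base GPT-2 checkpoint.
    src = model_dir if os.path.exists(os.path.join(model_dir, "config.json")) else "gpt2"
    model = AutoModelForCausalLM.from_pretrained(src).to(device)

    # Split the corpus into fixed-length blocks; for causal language
    # modeling the labels are the inputs themselves.
    ids = torch.load(os.path.join(model_dir, "tokens.pt"))  # hypothetical file name
    n_blocks = len(ids) // block_size
    blocks = ids[: n_blocks * block_size].view(n_blocks, block_size)
    loader = DataLoader(TensorDataset(blocks), batch_size=batch_size, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
    model.train()
    for epoch in range(epochs):
        for (batch,) in loader:
            batch = batch.to(device)
            loss = model(input_ids=batch, labels=batch).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        print(f"epoch {epoch + 1}: loss {loss.item():.4f}")

    model.save_pretrained(model_dir)

if __name__ == "__main__":
    train()
```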

Generate text with the model

python run.py [out/output.model] <prompt_text>
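
A minimal sketch of what run.py's job looks like, assuming the model directory was saved with save_pretrained as in the sketches above; the sampling parameters are illustrative, not the script's defaults:

```python
# Hypothetical sketch of generation: load the fine-tuned model and tokenizer
# from the output directory and generate a continuation of the prompt.
import sys
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(model_dir, prompt, max_new_tokens=100):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # sample instead of greedy decoding
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate(sys.argv[1], sys.argv[2]))
```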