Datasets

If a pre-trained model for the task you want to perform is not available, you can train Luminoth with an existing open dataset, or your own.

The first step in training Luminoth is converting your dataset to TensorFlow's .tfrecords format. This ensures that no matter what image or annotation formats the original dataset uses, it will be transformed to something that Luminoth can understand and process efficiently, either while training locally or in the cloud.

For this purpose, Luminoth provides the lumi dataset transform command, which includes support for some of the most well-known datasets for object detection and classification tasks.

Supported datasets

$ lumi dataset transform --type pascalvoc --data-dir ~/dataset/pascalvoc/ --output-dir ~/dataset/pascalvoc/tf/
$ lumi dataset transform --type imagenet --data-dir ~/dataset/imagenet/ --output-dir ~/dataset/imagenet/tf/

Limiting the dataset

During development, it is often useful to verify that the model can actually overfit a small dataset.

You can use the --limit-examples and --limit-classes options for this.
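As a rough sketch, a reduced Pascal VOC conversion might look like the following. The output directory name and the numeric values (16 examples, 2 classes) are assumptions chosen for illustration, and the snippet assumes both options take an integer argument; confirm the exact semantics with the help output before relying on it.

```shell
#!/bin/sh
# Hypothetical paths for illustration; point these at your own dataset.
DATA_DIR="$HOME/dataset/pascalvoc"
OUTPUT_DIR="$HOME/dataset/pascalvoc/tf-small"

# Assumed values: cap the converted dataset at 16 examples and 2 classes.
LIMIT_EXAMPLES=16
LIMIT_CLASSES=2

# Only run the conversion if the lumi CLI is actually installed.
if command -v lumi >/dev/null 2>&1; then
  lumi dataset transform \
    --type pascalvoc \
    --data-dir "$DATA_DIR" \
    --output-dir "$OUTPUT_DIR" \
    --limit-examples "$LIMIT_EXAMPLES" \
    --limit-classes "$LIMIT_CLASSES"
else
  echo "lumi not found on PATH; install Luminoth first" >&2
fi
```

Writing the reduced dataset to a separate output directory keeps it from overwriting a full conversion of the same data.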

For more information, try lumi dataset transform --help.

Supporting your own dataset

TODO guidelines on how to write your own conversion tool