To start training a model you need two things:
A dataset in KITTI format
The TLT-Trainer container (built below)
If you have not already prepared your dataset, please follow the instructions in the Dataset Preparation section.
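As a quick reference, each KITTI label file contains one object per line with 15 whitespace-separated fields: class, truncation, occlusion, alpha, the 2D bounding box (left, top, right, bottom), 3D dimensions, 3D location, and rotation. A minimal sketch of picking fields out of one such line (the sample values below are illustrative only):

```shell
# One illustrative KITTI label line (15 fields).
line="Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"

# Split the line into positional parameters $1..$15.
set -- $line

# Field 1 is the object class; fields 5-8 are the 2D bbox.
echo "class=$1 bbox=$5,$6,$7,$8"
```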
This section is focused on building and running the TLT-Trainer container.
Clone the TLT-Trainer repository locally
git clone https://www.smartcow.dev/SmartCow/TLT-Trainer.git
cd TLT-Trainer
Run the build command
docker build -t tlt_trainer .
Run the container
Make sure you have already prepared the dataset. To run the container you need to pass the following parameters:
Volume mounts for the dataset path and the project path
A port number to access the Trainer API
nvidia-docker run --rm --gpus all -it \
    -v /home/user/dataset:/dataset \
    -v /home/user/project:/project \
    -p 1004:5000 \
    --name tlt-ssd-resnet18 --hostname tlt \
    tlt_trainer python run.py
Optionally, you can also pass a TRAINER_CONFIG environment variable containing the training configuration in JSON format, and replace run.py with entrypoint.py in the nvidia-docker run command, so that training starts automatically when the container starts. We will discuss this in a later section.
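A sketch of what that invocation could look like. Note that the configuration keys shown here (epochs, batch_size) are purely illustrative placeholders; the actual keys accepted in TRAINER_CONFIG depend on the TLT-Trainer configuration schema covered later.

```shell
# Hypothetical example: pass a JSON training config via -e and use
# entrypoint.py so training starts as soon as the container is up.
# The JSON keys below are illustrative, not the real schema.
nvidia-docker run --rm --gpus all -it \
    -v /home/user/dataset:/dataset \
    -v /home/user/project:/project \
    -p 1004:5000 \
    -e TRAINER_CONFIG='{"epochs": 80, "batch_size": 16}' \
    --name tlt-ssd-resnet18 --hostname tlt \
    tlt_trainer python entrypoint.py
```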
The next section shows an example of running the training and retrieving live training stats through the API.