Training Data

TLT-Trainer automatically generates statistics and insights about your training data.

Send a GET request to /api/v1/dataset-stats to receive a JSON string containing all the information related to the training data.

curl http://localhost:1004/api/v1/dataset-stats

Expected response:

  • Status code 202 with {"message": "analyzing dataset"} while the dataset is still being analyzed.

  • Status code 200 with a JSON string in the structure below.

{
"null_examples": 8,
"image_ratio_distribution": {
"ydata": [
0.1,
0.2
],
"marker_size": [
20,
20
],
"xdata": [
2,
2
],
"ylabel": "Ratio",
"colors": [
[
[
128,
0,
255
],
[
118,
16,
255
]
]
],
"xlabel": "Number of Images",
"title": "Image Ratio Distribution",
"annotate": [
{
"y": 0.1,
"text": "2",
"x": 2
},
{
"y": 0.2,
"text": "2",
"x": 2
}
]
},
"square_images": 609,
"annotation_per_class": {
"slice": {
"face": 159424,
"person": 180252
},
"percent": {
"face": 0.46934137236660817,
"person": 0.5306586276333918
},
"colors": {
"face": [
128,
0,
255
],
"person": [
255,
0,
0
]
},
"title": "Annotation per class"
},
"annotation_heatmap": {
"patches": [
{
"y2": 313,
"label": "face",
"x2": 96,
"y1": 288,
"x1": 74
},
{
"y2": 290,
"label": "face",
"x2": 498,
"y1": 273,
"x1": 485
}
],
"colors": {
"face": [
128,
0,
255
],
"person": [
255,
0,
0
]
}
},
"missing_annotations": 0,
"wider_images": 17942,
"taller_images": 3321,
"image_per_class": {
"slice": {
"face": 12880,
"person": 8992
},
"percent": {
"face": 0.5888807607900512,
"person": 0.4111192392099488
},
"colors": {
"face": [
128,
0,
255
],
"person": [
255,
0,
0
]
},
"title": "Image per class"
}
}
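
Because the endpoint can answer 202 before the statistics are ready, a client usually has to retry until it gets a 200. The following is a minimal sketch of such a polling loop using only the Python standard library. The function names (http_get, wait_for_dataset_stats), the 5-second interval, and the retry limit are assumptions made for illustration and are not part of TLT-Trainer; only the URL and the two status codes come from the description above.

```python
import json
import time
import urllib.request

# URL taken from the curl example above.
STATS_URL = "http://localhost:1004/api/v1/dataset-stats"


def http_get(url):
    """Return (status_code, body_text) for a plain GET request."""
    with urllib.request.urlopen(url) as resp:
        return resp.status, resp.read().decode("utf-8")


def wait_for_dataset_stats(url=STATS_URL, get=http_get, interval=5.0,
                           max_attempts=60, sleep=time.sleep):
    """Poll the endpoint until it returns 200, then return the parsed JSON.

    interval and max_attempts are illustrative defaults (assumptions),
    not values documented by TLT-Trainer.
    """
    for _ in range(max_attempts):
        status, body = get(url)
        if status == 200:
            # Analysis finished: the body is the statistics JSON.
            return json.loads(body)
        if status != 202:
            raise RuntimeError(f"unexpected status code {status}")
        # 202 means the dataset is still being analyzed; wait and retry.
        sleep(interval)
    raise TimeoutError("dataset statistics were not ready in time")


if __name__ == "__main__":
    stats = wait_for_dataset_stats()
    print("null examples:", stats["null_examples"])
    print("missing annotations:", stats["missing_annotations"])
```

The get and sleep parameters exist only so the loop can be exercised without a running server.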

Plotting this data gives a quick visual summary of the training set. The plots include:

  • class distribution
  • ratio distribution
  • annotation heatmap
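
As an illustration of how one of these plots could be drawn, the sketch below turns the "annotation_per_class" (or "image_per_class") section of the response into a pie chart. It assumes matplotlib is installed and that the "colors" values are 0-255 RGB triples, which is how they appear in the sample response. The helper names pie_inputs and plot_class_distribution are hypothetical, and this is not necessarily how TLT-Trainer renders its own charts.

```python
def pie_inputs(section):
    """Extract labels, counts, and matplotlib-ready colors from a
    per-class section such as "annotation_per_class"."""
    labels = sorted(section["slice"])
    sizes = [section["slice"][name] for name in labels]
    # The response stores colors as 0-255 RGB; matplotlib expects 0-1 floats.
    colors = [tuple(c / 255 for c in section["colors"][name]) for name in labels]
    return labels, sizes, colors


def plot_class_distribution(section, out_path="class_distribution.png"):
    """Draw the per-class counts as a pie chart and save it to out_path."""
    # Imported here so pie_inputs stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels, sizes, colors = pie_inputs(section)
    fig, ax = plt.subplots()
    ax.pie(sizes, labels=labels, colors=colors, autopct="%1.1f%%")
    ax.set_title(section.get("title", "Class distribution"))
    fig.savefig(out_path)
    plt.close(fig)


if __name__ == "__main__":
    # Values copied from the sample response above.
    sample = {
        "slice": {"face": 159424, "person": 180252},
        "colors": {"face": [128, 0, 255], "person": [255, 0, 0]},
        "title": "Annotation per class",
    }
    plot_class_distribution(sample)
```

The "percent" values in the response are each class count divided by the total, so the chart can be built from "slice" alone.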

In the next section, you will learn how to stop training gracefully.