Labelbox recommends splitting your labeled dataset into three sets: training, validation, and test. Evaluating on data the model was not trained on helps you detect and limit overfitting.

See Curate data splits to learn how to assign data rows to splits inside a model run.
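As a rough illustration of the three-way split recommended above, here is a minimal Python sketch that shuffles a list of data row IDs and partitions them. The function name `split_data_rows`, the 80/10/10 default fractions, the split keys, and the `row-N` IDs are all assumptions made for this example; they are not part of Labelbox. The sketch only computes the partition and does not assign anything to a model run.

```python
import random
from typing import Dict, List, Sequence


def split_data_rows(
    data_row_ids: Sequence[str],
    train_frac: float = 0.8,
    val_frac: float = 0.1,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle IDs and partition them into train/validation/test lists."""
    if train_frac < 0 or val_frac < 0 or train_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    ids = list(data_row_ids)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # everything that remains
    }


# Hypothetical IDs, for illustration only.
splits = split_data_rows([f"row-{i}" for i in range(100)])
print({name: len(rows) for name, rows in splits.items()})
```

Fixing the seed means the same data rows land in the same split each time, which keeps evaluation results comparable between model runs.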

Filter data by split

By default, when opening a model run, you see all the data it contains under All training data.

To display only the data in a specific split, click Splits and then select the split you want: Train, Validation, or Test.

Data rows that are in the model run but not assigned to any split appear only under All training data.

Visualize the distribution of your data by split

You can visualize the distribution of your data in each data split in the projector view. This helps you assess whether the data splits share a similar distribution.

Selected data rows (here, the training set) show up in orange

You can also color data rows by class to see how separable the classes are. Click the projector view icon, then select the data split you want to view. To color by a class in that split, click the color palette icon and select the class name.
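Outside the projector view, a quick numeric check can complement the visual comparison. The sketch below compares class proportions between two splits; it works on class labels rather than embeddings, so it is not what the projector view computes. The helper names `class_proportions` and `max_proportion_gap`, and the `cat`/`dog` labels, are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def class_proportions(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of labels belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


def max_proportion_gap(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Largest absolute difference in class proportion between two splits."""
    classes = set(a) | set(b)
    return max((abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in classes), default=0.0)


# Made-up labels, for illustration only.
train_labels = ["cat"] * 60 + ["dog"] * 40
test_labels = ["cat"] * 5 + ["dog"] * 5

gap = max_proportion_gap(
    class_proportions(train_labels),
    class_proportions(test_labels),
)
print(f"largest class-proportion gap: {gap:.2f}")
```

A large gap suggests that one split over-represents a class relative to the other, which is worth investigating before trusting metrics computed on that split.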
