Open data in Catalog

After uploading model predictions to the Model product and analyzing the predictions and model metrics, you may want to act on what you found to improve your neural network.

For instance, you may have identified difficult or rare data. To improve the neural network, you can look for similar data to include in your training sets.

How to find similar data from a model run

Step 1: Select data rows in the model run

Select high-value data rows in the model run. For example, these data rows might be difficult for the neural network, or they might correspond to a rare scenario. Your goal is to mine all of your existing data, labeled and unlabeled, for similar data to include in your training sets.

In this example, we select images of green bananas, a rare scenario that the neural network struggles with.
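
Outside the UI, the idea of picking out high-value rows can be sketched in a few lines. This is only an illustration: the records, the field names (data_row_id, iou), and the 0.5 threshold are hypothetical and do not reflect a Labelbox export format. It keeps the rows on which the model scores poorly and lists the worst ones first.

```python
# Hypothetical per-row metrics from a model run. The field names
# ("data_row_id", "iou") are illustrative assumptions, not a real schema.
rows = [
    {"data_row_id": "a", "iou": 0.91},
    {"data_row_id": "b", "iou": 0.32},
    {"data_row_id": "c", "iou": 0.78},
    {"data_row_id": "d", "iou": 0.15},
]


def hardest_rows(rows, metric="iou", threshold=0.5):
    """Return IDs of rows whose metric is below the threshold, worst first."""
    struggling = [r for r in rows if r[metric] < threshold]
    struggling.sort(key=lambda r: r[metric])
    return [r["data_row_id"] for r in struggling]


selected = hardest_rows(rows)
print(selected)
```

The threshold is a judgment call; in practice you would pick it by looking at the metric distribution in the model run.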

Step 2: Open the data rows in Catalog

Click Manage selection, then View in catalog, to open these high-value data rows in Catalog.

This opens Catalog and automatically creates the filter Showing results from model run XX.

Step 3: Find similar data in Catalog

Once in Catalog, select the high-value data rows and click Similar to selection to find similar data rows.

Then, remove the filter Showing results from model run XX so that the similarity search operates on your entire data catalog.

This surfaces images in Catalog that are most similar to your high-value images.
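
To build intuition for what a similarity search does, here is a minimal sketch that ranks items by cosine similarity between embedding vectors. It is not a description of how Catalog is implemented: the embeddings, the row names, and the helper functions (cosine_similarity, most_similar) are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, catalog, top_k=3):
    """Return the IDs of the top_k catalog items most similar to the query."""
    scored = [(cosine_similarity(query, emb), row_id)
              for row_id, emb in catalog.items()]
    scored.sort(reverse=True)
    return [row_id for _, row_id in scored[:top_k]]


# Toy 3-dimensional embeddings (hypothetical values).
catalog = {
    "green_banana_1": [0.9, 0.1, 0.0],
    "green_banana_2": [0.8, 0.2, 0.1],
    "yellow_banana": [0.1, 0.9, 0.0],
    "apple": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # embedding of the selected high-value image

print(most_similar(query, catalog, top_k=2))
```

Removing the model run filter corresponds to searching over the whole catalog dictionary rather than a subset of it.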

We have surfaced many examples of green bananas to help improve our neural network.

Step 4: Optionally, filter to keep only labeled or unlabeled data

You may want to keep only unlabeled data, so that you can prioritize it for labeling and then include it in your training data. To do so, add the filter Annotation > is None.

We surfaced unlabeled images of green bananas to prioritize for labeling and boost our neural network.

You may want to keep only labeled data, to include it directly in your training set. To do so, add an Annotation filter, a Project filter, or both, set to the annotations or labeling projects you are looking for.

We surfaced labeled images of green bananas to include in the training data and boost our neural network.
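
The two filters in this step amount to splitting the search results by whether a data row already has annotations. The sketch below shows that split on made-up records; the field names (data_row_id, annotations) and the helper split_by_label_status are hypothetical and are not part of any Labelbox API.

```python
# Hypothetical similarity search results. An empty "annotations" list
# stands in for a data row that matches the filter Annotation > is None.
results = [
    {"data_row_id": "r1", "annotations": ["green_banana"]},
    {"data_row_id": "r2", "annotations": []},
    {"data_row_id": "r3", "annotations": ["green_banana", "bunch"]},
    {"data_row_id": "r4", "annotations": []},
]


def split_by_label_status(results):
    """Split result IDs into (labeled, unlabeled), preserving input order."""
    labeled = [r["data_row_id"] for r in results if r["annotations"]]
    unlabeled = [r["data_row_id"] for r in results if not r["annotations"]]
    return labeled, unlabeled


labeled, unlabeled = split_by_label_status(results)
print("add to training set:", labeled)
print("send to labeling first:", unlabeled)
```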

Step 5: Optionally, refine the similarity search

You can refine the similarity search. Select the images you find most relevant, then click Add selection to anchors. These images are added as anchors in the similarity search. Learn more about how to add anchors to your similarity search.
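
One way to picture the effect of adding anchors is to score each item by its average similarity to all anchors, so that items close to every anchor rise to the top. The sketch below assumes that averaging scheme purely for illustration; how Catalog actually combines anchors is not stated here, and the embeddings and function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_anchors(anchors, catalog):
    """Rank catalog IDs by average cosine similarity to all anchors.

    Averaging is an assumption made for this sketch, not the documented
    behavior of the product.
    """
    def score(emb):
        return sum(cosine_similarity(a, emb) for a in anchors) / len(anchors)

    return sorted(catalog, key=lambda row_id: score(catalog[row_id]),
                  reverse=True)


# Toy 2-dimensional embeddings (hypothetical values).
catalog = {
    "x": [1.0, 0.0],
    "y": [0.7, 0.7],
    "z": [0.0, 1.0],
}

one_anchor = rank_by_anchors([[1.0, 0.0]], catalog)
two_anchors = rank_by_anchors([[1.0, 0.0], [0.0, 1.0]], catalog)
print(one_anchor)
print(two_anchors[0])
```

With a single anchor the item identical to it ranks first; after a second, different anchor is added, the item lying between the two anchors moves to the top.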