How to find similar data from a model run
Step 1: Select data rows in the model run
Select high-value data rows in the model run. For example, these data rows might be difficult for the neural network, or they might correspond to a rare scenario. Your goal is to mine all of your existing data, both labeled and unlabeled, for similar data rows that you can add to your training sets.
In this example, we select images of green bananas, a rare scenario that the neural network struggles with.
Step 2: Open the data rows in Catalog
Click Manage selection, then View in catalog, to open these high-value data rows in Catalog.

Step 3: Find similar data in Catalog
Once in Catalog, select the high-value data rows and click on Similar to selection to find similar data rows.


We have successfully surfaced many examples of green bananas that we can use to improve our neural network.
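To build intuition for what a similarity search does, here is a minimal sketch. It assumes each data row has an embedding vector and that similarity is measured with cosine similarity; this page does not describe how Catalog computes similarity, so treat the scoring rule, the function names, and the toy vectors as illustrative assumptions rather than the product's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_to_selection(selected, catalog, top_k=3):
    """Rank catalog rows by their best cosine similarity to any selected row.

    `selected` and `catalog` map a (hypothetical) data row id to its embedding.
    Rows that are already selected are excluded from the results.
    """
    scores = {
        row_id: max(cosine(vec, s) for s in selected.values())
        for row_id, vec in catalog.items()
        if row_id not in selected
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy embeddings, made up for illustration only.
selected = {"green_banana_1": [0.9, 0.1, 0.0]}
catalog = {
    "green_banana_1": [0.9, 0.1, 0.0],
    "green_banana_2": [0.8, 0.2, 0.1],
    "yellow_banana": [0.1, 0.9, 0.1],
    "apple": [0.0, 0.1, 0.9],
}
result = similar_to_selection(selected, catalog, top_k=2)
print(result)
```

In this toy data, the second green banana ranks first because its vector points in nearly the same direction as the selected row.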
Step 4: Optionally, filter to keep only labeled or unlabeled data
You may want to keep only unlabeled data so that you can prioritize labeling it and then include it in your training data. To do so, add a filter Annotation > is None.
We surfaced unlabeled images of green bananas to label first in order to boost our neural network.

We surfaced labeled images of green bananas to include in the training data in order to boost our neural network.
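Conceptually, this filter splits the similar data rows into two groups based on whether an annotation exists. The sketch below shows that logic on plain dictionaries; the record shape (`id` and `annotation` keys) is a hypothetical stand-in for illustration, not an export format or API of the product.

```python
def split_by_annotation(rows):
    """Split rows into (unlabeled, labeled) based on a missing annotation.

    Each row is a dict with an "annotation" key; None means "no annotation yet".
    This record shape is assumed for the example.
    """
    unlabeled = [r for r in rows if r["annotation"] is None]
    labeled = [r for r in rows if r["annotation"] is not None]
    return unlabeled, labeled

# Made-up similar rows returned by a similarity search.
similar_rows = [
    {"id": "img_001", "annotation": None},
    {"id": "img_002", "annotation": "green_banana"},
    {"id": "img_003", "annotation": None},
]

unlabeled, labeled = split_by_annotation(similar_rows)
to_label_first = [r["id"] for r in unlabeled]    # send these to labeling
ready_for_training = [r["id"] for r in labeled]  # add these to training data
print(to_label_first, ready_for_training)
```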
Step 5: Optionally, refine the similarity search
You can refine the similarity search. Select the images you find most relevant, and then click on Add selection to anchors. These images will be added as anchor images in the similarity search. Learn more about how to add anchors to your similarity search.
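The effect of adding anchors can be illustrated with a small sketch. It assumes candidates are scored by their mean cosine similarity to all anchors; how Catalog actually combines anchors is not stated on this page, so the scoring rule, names, and vectors here are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_anchors(anchors, candidates):
    """Order candidate ids by mean cosine similarity to all anchors."""
    def score(vec):
        return sum(cosine(vec, a) for a in anchors) / len(anchors)
    return sorted(candidates, key=lambda rid: score(candidates[rid]), reverse=True)

# Toy embeddings, made up for illustration only.
candidates = {
    "unripe_bunch": [0.7, 0.1, 0.6],
    "green_apple": [0.9, 0.0, 0.1],
}

anchors = [[1.0, 0.0, 0.0]]        # initial selection
before = rank_by_anchors(anchors, candidates)

anchors.append([0.5, 0.1, 0.8])    # a relevant result added as an anchor
after = rank_by_anchors(anchors, candidates)

print(before, after)
```

With a single anchor, the green apple ranks first; once a second, more banana-like anchor is added, the unripe bunch moves to the top. This is the sense in which extra anchors steer the search toward the examples you consider most relevant.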