In Catalog, Cluster view helps you understand your data. Use it to:
- Explore relationships between data rows
- Identify edge cases and outliers
- Select data rows for pre-labeling or human review
- Quickly classify large datasets in bulk
This page describes features currently in preview. Some improvements may not yet be documented and some behavior may change ahead of general availability.
Cluster view is a projection view of a dataset that groups assets by common characteristics. As in any other view, you can select data rows and use the Selection menu to perform tasks.
At this time, Cluster view is supported for image, text, and document datasets with more than 100 data rows. By default, Cluster view is limited to 500,000 data rows. (Contact Support if you're interested in larger datasets.)
To display cluster view:
- Use Catalog to select a dataset.
- In the View control panel, select the Cluster view (beta) button.
- If prompted, select the Generate cluster view button.
Depending on the size of the dataset and its assets, the initial cluster view can take several minutes to generate.
You can follow progress in the Notification Center.
You can do several things to manage a cluster view.
To change the cluster view zoom level, select the Zoom button and then use your system's zoom gestures.
To select an asset, simply click it. When you do this, a preview appears.
To select multiple assets in Cluster view:
- Select the Multi-select button.
- Hold the left mouse button and drag a selection rectangle around the assets you want to select.
When you do this, selected assets appear as blue dots and a preview window appears.
The arrow buttons on the preview window cycle between selected assets. The preview window's Close button also clears the selection.
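Conceptually, a rectangle selection picks every plotted point whose coordinates fall within the dragged bounds. The following is a minimal sketch of that idea only, not the product's implementation; `Point`, `asset_id`, and `select_in_rectangle` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Point(NamedTuple):
    asset_id: str  # hypothetical identifier for the asset behind the dot
    x: float
    y: float


def select_in_rectangle(points, corner_a, corner_b):
    """Return asset IDs whose points lie inside the dragged rectangle (edges included)."""
    # The drag may start at any corner, so normalize to min/max bounds first.
    x_min, x_max = sorted((corner_a[0], corner_b[0]))
    y_min, y_max = sorted((corner_a[1], corner_b[1]))
    return [
        p.asset_id
        for p in points
        if x_min <= p.x <= x_max and y_min <= p.y <= y_max
    ]


points = [Point("a", 0.1, 0.2), Point("b", 0.5, 0.5), Point("c", 0.9, 0.8)]
# Dragging from the upper right toward the origin selects the same points
# as dragging in the opposite direction.
selected = select_in_rectangle(points, (0.6, 0.6), (0.0, 0.0))
print(selected)
```

Normalizing the two corners first means the drag direction does not affect which points are selected.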
When one or more assets are selected, you can use the Catalog Selection menu to manage the selected assets.
Use the Recompute button to update the cluster view.
Cluster view settings control how clusters are rendered.
The cluster view panel includes the following settings:
Point size controls the size of the asset points displayed by cluster view.
You can choose between 1.0x, 4.0x, 8.0x and 20x.
Reduction algorithm controls how the cluster is calculated and includes the following settings:
Use the Cluster view settings button to hide or show the settings panel.
Cluster view is currently available for image, text, and document datasets.
Cluster view currently supports datasets with a minimum of 100 data rows and a maximum of 500,000 data rows. (Contact Support for help with larger datasets.)
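If you want to check a dataset against these limits before opening Cluster view, a small pre-check along the following lines may help. This is a sketch under stated assumptions: the constants mirror the limits quoted above, and `cluster_view_issues` is a hypothetical helper, not part of any SDK.

```python
# Limits quoted on this page; all names here are hypothetical helpers.
SUPPORTED_MEDIA_TYPES = {"image", "text", "document"}
MIN_DATA_ROWS = 100
DEFAULT_MAX_DATA_ROWS = 500_000


def cluster_view_issues(media_type: str, data_row_count: int) -> list[str]:
    """Return the reasons a dataset falls outside the documented limits (empty if none)."""
    issues = []
    if media_type.lower() not in SUPPORTED_MEDIA_TYPES:
        issues.append(f"unsupported media type: {media_type}")
    if data_row_count < MIN_DATA_ROWS:
        issues.append(f"fewer than {MIN_DATA_ROWS} data rows")
    if data_row_count > DEFAULT_MAX_DATA_ROWS:
        issues.append(f"more than {DEFAULT_MAX_DATA_ROWS} data rows (contact Support)")
    return issues


print(cluster_view_issues("image", 25_000))  # within the documented limits
print(cluster_view_issues("video", 50))      # two separate problems reported
```

Returning a list of reasons rather than a single boolean makes it easier to report every problem at once.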