MS-COCO is a popular JSON-based format for segmentation, object detection (bbox), and instance segmentation tasks. Frameworks that use this format include Detectron2. To learn how to train your own Detectron2 model with Labelbox data, see the guides for panoptic segmentation and object detection.

Labelbox currently supports converting data into both the COCO object detection and panoptic formats. Both formats support instance detection. The primary difference is that the object detection format encodes objects as polygons, which allows instances to overlap. The panoptic format, however, encodes each class at the pixel level. This lets users encode well-defined objects as instances and amorphous background classes as individual pixels. When choosing a format, consider whether you need to model amorphous background classes. If not, use the object detection format. It is also the original COCO format, so more open-source tooling is compatible with it.
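To make the difference concrete, here is a minimal sketch of how one instance might look in each format. It follows the public COCO conventions (flat polygon lists, [x, y, width, height] boxes, and panoptic segment ids packed into PNG colors). The helper names polygon_to_coco and panoptic_segment_id, the ids, and the file name are hypothetical and are not part of the Labelbox SDK.

```python
from typing import List, Tuple


def polygon_to_coco(points: List[Tuple[float, float]]) -> dict:
    """Build the polygon fields of a COCO object detection annotation."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # COCO stores a polygon as one flat list: [x1, y1, x2, y2, ...]
    flat = [coord for point in points for coord in point]
    # COCO boxes are [x_min, y_min, width, height]
    bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]
    return {"segmentation": [flat], "bbox": bbox, "iscrowd": 0}


def panoptic_segment_id(r: int, g: int, b: int) -> int:
    """Decode one panoptic PNG pixel into its segment id."""
    return r + 256 * g + 256 * 256 * b


# Object detection format: one entry per instance, so polygons may overlap.
object_annotation = {
    "id": 1,            # hypothetical ids for illustration
    "image_id": 1,
    "category_id": 3,
    **polygon_to_coco([(10, 10), (50, 10), (50, 40), (10, 40)]),
}

# Panoptic format: one entry per image. Every pixel in the mask PNG belongs
# to exactly one segment, so segments cannot overlap.
panoptic_annotation = {
    "image_id": 1,
    "file_name": "000001.png",  # hypothetical mask file name
    "segments_info": [
        {"id": panoptic_segment_id(20, 1, 0), "category_id": 3, "iscrowd": 0},
    ],
}
```

Because the panoptic mask assigns each pixel to a single segment, amorphous background classes such as sky or road fit naturally, while overlapping objects can only be represented in the polygon-based format.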

Caveats:

  • name field must be present for the annotations
  • nested classifications are not supported

Convert object labels to COCO format

from labelbox import Client
from labelbox.data.annotation_types import Label, LabelList, ImageData, Point, ObjectAnnotation, Rectangle, Polygon
from labelbox.data.serialization import COCOConverter

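client = Client(api_key="<YOUR_API_KEY>")

# Fetch the project and stream its labels
project = client.get_project("<PROJECT_ID>")
labels = project.label_generator()

# Directories the converter saves masks and images to
mask_path = "./masks/"
image_path = "./images/"

# Note: serialize_panoptic writes the panoptic format. For the object
# detection format, use COCOConverter.serialize_instances, which takes
# no mask_root argument.
# ignore_existing_data=True lets the converter proceed when the output
# directories already contain files.
coco_labels = COCOConverter.serialize_panoptic(
    labels,
    image_root=image_path,
    mask_root=mask_path,
    ignore_existing_data=True
)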
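Once conversion finishes, most COCO tooling expects the annotations as a JSON file on disk. The sketch below assumes that coco_labels is a JSON-serializable dictionary with the usual COCO top-level keys; check this against the SDK version you have installed. The helpers save_coco and summarize_coco and the stand-in example payload are hypothetical.

```python
import json
from pathlib import Path


def save_coco(coco: dict, path: str) -> Path:
    """Write a COCO payload to a JSON file, creating parent folders."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w") as f:
        json.dump(coco, f)
    return out


def summarize_coco(coco: dict) -> dict:
    """Count entries under the standard COCO top-level keys."""
    keys = ("images", "annotations", "categories")
    return {key: len(coco.get(key, [])) for key in keys}


# Stand-in payload with the same top-level shape as a COCO dataset.
# In practice, pass the converter's return value (coco_labels) instead.
example = {
    "images": [{"id": 1, "file_name": "1.jpg", "width": 64, "height": 48}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 3}],
    "categories": [{"id": 3, "name": "car"}],
}
summary = summarize_coco(example)
```

A call such as save_coco(coco_labels, "./annotations/coco.json") would then produce a file that frameworks like Detectron2 can register as a dataset alongside the image directory.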