# Project

> Developer guide for creating and modifying projects using the Python SDK.

<CardGroup>
  <Card title="Open in Colab" icon="infinity" iconType="solid" horizontal href="https://colab.research.google.com/github/Labelbox/labelbox-notebooks/blob/main/basics/projects.ipynb" />

  <Card title="GitHub" icon="github" iconType="solid" horizontal href="https://github.com/Labelbox/labelbox-notebooks/blob/main/basics/projects.ipynb" />
</CardGroup>

## Client

<CodeGroup>
  ```python Python theme={null}
  import labelbox as lb
  client = lb.Client(api_key="<YOUR_API_KEY>")
  ```
</CodeGroup>

## Create a project

When creating a project, specify a `media_type` using one of the following values:

* `lb.MediaType.Audio`
* `lb.MediaType.Conversational`
* `lb.MediaType.Document`
* `lb.MediaType.Geospatial_Tile`
* `lb.MediaType.Html`
* `lb.MediaType.Image`
* `lb.MediaType.Simple_Tile`
* `lb.MediaType.Text`
* `lb.MediaType.Video`

<CodeGroup>
  ```python Create a project theme={null}
  # Create a new project
  project = client.create_project(
      name="<project_name>",
      description="<project_description>",    # optional
      media_type=lb.MediaType.Image           # specify the media type
  )
  ```
</CodeGroup>

## Get a project

<CodeGroup>
  ```python Python theme={null}
  project = client.get_project("<project_id>")

  # alternatively, you can get a project by name
  project = client.get_projects(where=lb.Project.name == "<project_name>").get_one()
  ```
</CodeGroup>

## Methods

### Create a batch

When creating a batch to send to a project, one of either `global_keys` or `data_rows` must be supplied as an argument. If using the `data_rows` argument, you can supply either a list of data row IDs or a list of `DataRow` class objects.

Optionally, you can supply a `priority` from 1 (highest) to 5 (lowest). The priority determines the order in which the batch's data rows appear in the labeling queue relative to other batches. If no value is provided, the batch is assigned the lowest priority.

For more details, see [Batch](/reference/batch).

<CodeGroup>
  ```python Python theme={null}
  project.create_batch(
    name="<unique_batch_name>",
    global_keys=["key1", "key2", "key3"],
    priority=5,
  )

  # if the project uses consensus, you can optionally supply a dictionary with consensus settings
  # if provided, the batch will use consensus with the specified coverage and votes
  project.create_batch(
    name="<unique_batch_name>",
    data_rows=["<data_row_id>", "<data_row_id>"],
    priority=1,
    consensus_settings={"number_of_labels": 3, "coverage_percentage": 0.1}
  )
  ```
</CodeGroup>
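
The argument rules above can be checked locally before calling the SDK. The following is a minimal sketch, not part of the SDK: `build_batch_kwargs` is a hypothetical helper, and treating "both arguments supplied" as an error is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Dict, Optional, Sequence


def build_batch_kwargs(
    name: str,
    global_keys: Optional[Sequence[str]] = None,
    data_rows: Optional[Sequence[Any]] = None,
    priority: int = 5,
) -> Dict[str, Any]:
    """Validate batch arguments locally and return kwargs for project.create_batch().

    Hypothetical helper: the SDK performs its own validation as well.
    """
    # Exactly one of the two identifier arguments must be supplied
    # (rejecting "both" is an assumption of this sketch).
    if (global_keys is None) == (data_rows is None):
        raise ValueError("Supply exactly one of global_keys or data_rows")
    # Priority ranges from 1 (highest) to 5 (lowest).
    if not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 and 5")

    kwargs: Dict[str, Any] = {"name": name, "priority": priority}
    if global_keys is not None:
        kwargs["global_keys"] = list(global_keys)
    else:
        kwargs["data_rows"] = list(data_rows)
    return kwargs


kwargs = build_batch_kwargs("<unique_batch_name>", global_keys=["key1", "key2"], priority=1)
print(kwargs)
# With a real project object: project.create_batch(**kwargs)
```

Failing early on the client side keeps a malformed request from reaching the API, at the cost of duplicating checks the SDK may already perform.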

#### Create multiple batches

The `project.create_batches()` method accepts up to 1 million data rows. Batches are chunked into groups of 100k data rows (if necessary), which is the maximum batch size.

This method accepts either a list of data row IDs or `DataRow` objects in the `data_rows` argument, or a list of global keys in the `global_keys` argument; the two arguments cannot be combined in the same call. Batches are created with the specified `name_prefix` argument and a unique suffix to ensure unique batch names. The suffix is a 4-digit number starting at `0000`.

For example, if the name prefix is `demo-create-batches-` and three batches are created, the names will be `demo-create-batches-0000`, `demo-create-batches-0001`, and `demo-create-batches-0002`. This method will throw an error if a batch with the same name already exists.

<CodeGroup>
  ```python Python theme={null}
  task = project.create_batches(
    name_prefix="demo-create-batches-",
    global_keys=global_keys,
    priority=5
  )

  print("Errors: ", task.errors())
  print("Result: ", task.result())
  ```
</CodeGroup>
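
To anticipate name collisions before calling `project.create_batches()`, you can compute the names the method is expected to generate. This is a sketch under stated assumptions: `expected_batch_names` is a hypothetical helper, and it assumes data rows are split into full chunks of 100k with one final partial chunk.

```python
import math
from typing import List

MAX_BATCH_SIZE = 100_000    # maximum data rows per batch, per the text above
MAX_TOTAL_ROWS = 1_000_000  # maximum data rows per create_batches() call


def expected_batch_names(name_prefix: str, num_data_rows: int) -> List[str]:
    """Return the batch names expected for a given prefix and data row count."""
    if not 0 < num_data_rows <= MAX_TOTAL_ROWS:
        raise ValueError(f"num_data_rows must be between 1 and {MAX_TOTAL_ROWS}")
    # Assumption: rows are chunked greedily, so the batch count is a ceiling division.
    num_batches = math.ceil(num_data_rows / MAX_BATCH_SIZE)
    # The suffix is a 4-digit, zero-padded counter starting at 0000.
    return [f"{name_prefix}{index:04d}" for index in range(num_batches)]


names = expected_batch_names("demo-create-batches-", 250_000)
print(names)
```

Comparing these names with the names of the batches already in the project shows whether the call would fail on a duplicate name.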

#### Create batches from a dataset

If you wish to create batches in a project using all the data rows of a dataset, instead of gathering global keys or IDs and iterating over subsets of data rows, you can use the `project.create_batches_from_dataset()` method.

This method takes in a dataset ID and creates a batch (or batches if there are more than 100k data rows) comprised of all data rows not already in the project. The same logic applies to the `name_prefix` argument and the naming of batches as described in the section immediately above.

<CodeGroup>
  ```python Python theme={null}
  dataset = client.get_dataset("<dataset_id>")

  task = project.create_batches_from_dataset(
      name_prefix="demo-dataset-",
      dataset_id=dataset.uid,
      priority=5
  )

  print("Errors: ", task.errors())
  print("Result: ", task.result())
  ```
</CodeGroup>

### Get the batches

<CodeGroup>
  ```python Python theme={null}
  # get the batches (objects of the Batch class)
  batches = project.batches()

  # inspect one batch
  next(batches)

  # inspect all batches
  for batch in batches:
    print(batch)

  # for ease of use, you can convert the paginated collection to a list
  list(batches)
  ```
</CodeGroup>

### Get a batch

You can retrieve the batch of a particular project with `client.get_batch()`.

<CodeGroup>
  ```python Python theme={null}
  project_id = "<project_id>"
  batch_id = "<batch_id>"

  # returns a Batch object
  batch = client.get_batch(project_id, batch_id)
  ```
</CodeGroup>

### Connect an ontology

<CodeGroup>
  ```python Python theme={null}
  # the argument must be an object of the Ontology class
  project.connect_ontology(ontology)
  ```
</CodeGroup>

### Get the members and their roles

The scope of a member is provided by the attribute `access_from` from the class `ProjectMember`.

It can have one of the following values:

* ORGANIZATION: project membership is derived from the organization role
* PROJECT\_MEMBERSHIP: access is given specifically to the project
* USER\_GROUP: access is given via a group

<CodeGroup>
  ```python Python theme={null}
  # get the members (objects of the ProjectMember class with relationships to a User and Role)
  members = project.members()

  # inspect one member
  member = next(members)
  print(member.user(), member.role(), member.access_from)

  # Display member info:
  print(member.user().uid, member.user().email, member.role().name, member.access_from, sep="\t")

  # inspect all members
  for member in members:
    print(member.user(), member.role(), member.access_from)

  # for ease of use, you can convert the paginated collection to a list
  list(members)
  ```
</CodeGroup>
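
A common follow-up is to group members by how they obtained access. The sketch below uses only the `access_from` attribute and the `user().email` relationship shown above; `emails_by_access` is a hypothetical helper, and the sample members are stand-in objects rather than real `ProjectMember` instances.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, Iterable, List


def emails_by_access(members: Iterable) -> Dict[str, List[str]]:
    """Group member emails by the value of their access_from attribute."""
    grouped = defaultdict(list)
    for member in members:
        grouped[member.access_from].append(member.user().email)
    return dict(grouped)


def _fake_member(email: str, access_from: str) -> SimpleNamespace:
    # Stand-in that mimics the member.user() relationship and access_from attribute.
    return SimpleNamespace(user=lambda: SimpleNamespace(email=email), access_from=access_from)


sample_members = [
    _fake_member("a@example.com", "ORGANIZATION"),
    _fake_member("b@example.com", "PROJECT_MEMBERSHIP"),
    _fake_member("c@example.com", "ORGANIZATION"),
]
grouped = emails_by_access(sample_members)
print(grouped)
# With a real project: emails_by_access(project.members())
```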

### Upload labeling instructions

Note that if the ontology connected to your project is connected to other projects, calling this method will attach the instructions to those projects as well.

<CodeGroup>
  ```python Python theme={null}
  # must be a PDF or HTML file
  project.upsert_instructions("<local_file_path>")
  ```
</CodeGroup>

### Get the workflow tasks

<CodeGroup>
  ```python Python theme={null}
  # get the task queues (relationship to TaskQueue objects)
  task_queues = project.task_queues()

  # inspect all task queues
  for task_queue in task_queues:
    print(task_queue)
  ```
</CodeGroup>

### Move data rows to a workflow task

Note that data rows need labels attached before being moved to a different workflow task. They cannot be moved from "Initial Labeling."

<CodeGroup>
  ```python Python theme={null}
  project.move_data_rows_to_task_queue(
    data_row_ids=lb.GlobalKeys(["<global_key>", "<global_key>"]), # Use "lb.UniqueIds" for "<data_row_ids>"
    # Optional: If not included, defaults to "None", which moves data rows to the "Done" bucket
    task_queue_id="<task_queue_id>"
  )
  ```
</CodeGroup>
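
The `task_queue_id` value can be looked up from the queues returned by `project.task_queues()`. The following sketch assumes each queue exposes `uid` and `queue_type`, as used elsewhere on this page; `find_queue_id` is a hypothetical helper, and the sample queues (including the `OTHER_QUEUE_TYPE` value) are placeholders.

```python
from types import SimpleNamespace
from typing import Iterable


def find_queue_id(task_queues: Iterable, queue_type: str) -> str:
    """Return the uid of the first task queue with the given queue_type."""
    for queue in task_queues:
        if queue.queue_type == queue_type:
            return queue.uid
    raise LookupError(f"No task queue of type {queue_type!r}")


# Stand-in objects; with a real project, pass project.task_queues() instead.
sample_queues = [
    SimpleNamespace(uid="queue-1", queue_type="OTHER_QUEUE_TYPE"),
    SimpleNamespace(uid="queue-2", queue_type="MANUAL_REVIEW_QUEUE"),
]
queue_id = find_queue_id(sample_queues, "MANUAL_REVIEW_QUEUE")
print(queue_id)
```

Raising `LookupError` instead of returning `None` matters here, because `None` is itself a meaningful value for `task_queue_id` (it moves data rows to "Done").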

### Modify data row priority

Once a batch has been added to a project, you can set [the priority](/docs/labeling-queue) of its data rows. To do so, define a list of label parameter overrides (LPOs), which are [tuples](https://docs.python.org/3/library/ast.html?highlight=tuple#ast.Tuple) that set the priority for individual data rows.

Each override has two values: a `DataRow` object or a `DataRowIdentifier` object (such as `lb.UniqueId` or `lb.GlobalKey`), and the new priority.

The priority must be an integer between `-2,147,483,648` and `2,147,483,647`. The lowest value has the highest priority.

Override lists are limited to 1,000 items; larger lists trigger an error.

Once the override list is defined, pass it to `project.set_labeling_parameter_overrides` to change the priority of the corresponding data rows. Use `project.labeling_parameter_overrides` to get a list of [data row priorities](#get-data-row-priority) and `project.update_data_row_labeling_priority` to update existing data row priority.
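
Because of the 1,000-item limit, longer override lists need to be split before they are submitted. Here is a minimal sketch: `chunk_overrides` is a hypothetical helper, plain strings stand in for the identifier objects, and whether successive calls to `set_labeling_parameter_overrides` accumulate as expected is worth verifying in your own project.

```python
from typing import Any, List, Sequence, Tuple

MAX_OVERRIDES = 1_000  # override lists are limited to 1,000 items
MIN_PRIORITY, MAX_PRIORITY = -2_147_483_648, 2_147_483_647

Override = Tuple[Any, int]


def chunk_overrides(lpos: Sequence[Override], size: int = MAX_OVERRIDES) -> List[List[Override]]:
    """Validate priorities, then split the overrides into lists of at most `size` items."""
    for _identifier, priority in lpos:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(priority, bool) or not isinstance(priority, int):
            raise TypeError(f"priority must be an integer, got {priority!r}")
        if not MIN_PRIORITY <= priority <= MAX_PRIORITY:
            raise ValueError(f"priority {priority} is out of range")
    return [list(lpos[start:start + size]) for start in range(0, len(lpos), size)]


# Plain strings stand in for lb.GlobalKey / lb.UniqueId / DataRow objects.
lpos = [(f"global-key-{i}", i + 1) for i in range(2_500)]
chunks = chunk_overrides(lpos)
print([len(chunk) for chunk in chunks])

# With a real project:
# for chunk in chunks:
#     project.set_labeling_parameter_overrides(chunk)
```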

#### Set data row priority

<CodeGroup>
  ```python Python (add LPOs) theme={null}
  # Extract the global keys
  export_params = {
      "data_row_details": True
  }

  export_task = project.export(params=export_params)

  # Wait until the export task is complete
  export_task.wait_till_done()

  # Stream the export using a callback function
  def json_stream_handler(output: lb.BufferedJsonConverterOutput):
    print(output.json)

  export_task.get_buffered_stream(stream_type=lb.StreamType.RESULT).start(stream_handler=json_stream_handler)

  # Collect all exported data into a list
  export_json = [data_row.json for data_row in export_task.get_buffered_stream()]

  global_keys = [item["data_row"]["global_key"] for item in export_json]

  # Add LPOs
  lpos = []
  priority=1

  # With global keys
  for global_key in global_keys:
    lpos.append((lb.GlobalKey(global_key), priority))
    priority += 1

  # With data row ids
  # data_row_ids = ["clw7jlmav35yn0768xrpawwrc", "clw7jlmav35yo0768a5amfztu"]
  # for dr_id in data_row_ids:
  #   lpos.append((lb.UniqueId(dr_id), priority))
  #   priority+=1

  # With data row objects
  # data_rows = [data_row_1, data_row_2]

  # for data_row in data_rows:
  #   lpos.append((data_row, priority))
  #   priority+=1

  # Set data row priorities
  project.set_labeling_parameter_overrides(lpos)

  # Check results
  project_lpos = list(project.labeling_parameter_overrides())
  for lpo in project_lpos:
    print(lpo)
  ```
</CodeGroup>

#### Update data row priority

<CodeGroup>
  ```python Python (update LPOs) theme={null}
  # Update LPOs

  # With global keys
  global_keys = ["global_key1", "global_key2"]
  project.update_data_row_labeling_priority(data_rows=lb.GlobalKeys(global_keys), priority=1)

  # With data row ids
  # data_row_ids = ["clw7jlmav35yn0768xrpawwrc", "clw7jlmav35yo0768a5amfztu"]
  # project.update_data_row_labeling_priority(data_rows=lb.UniqueIds(data_row_ids), priority=1)

  # With data row objects
  # data_rows = [data_row_1, data_row_2]
  # project.update_data_row_labeling_priority(data_rows=data_rows, priority=1)

  # Check results
  project_lpos = list(project.labeling_parameter_overrides())

  for lpo in project_lpos:
    print(lpo)
  ```
</CodeGroup>

### Add project tags

<CodeGroup>
  ```python Python theme={null}
  tags = project.update_project_resource_tags(["<project_tag_id>", "<project_tag_id>"])
  ```
</CodeGroup>

### Get project tags

<CodeGroup>
  ```python Python theme={null}
  tags = project.get_resource_tags()
  ```
</CodeGroup>

The `tags` variable is a list in which each element is a `ResourceTag` object with the attributes `uid`, `color` (for example, "008856"), and `text`.

### Get the project overview

With `project.get_overview(details)`, you can obtain some of the data from the Project Overview tab.

#### Output

The boolean parameter `details` will change the output to display the distribution of data rows between the queues.

When `details` is set to false:

| Attribute         | Description                                    | Name in the Overview tab |
| ----------------- | ---------------------------------------------- | ------------------------ |
| `to_label`        | Number of data rows that are yet to be labeled | To Label                 |
| `in_review`       | Number of data rows to be reviewed             | In Review                |
| `in_rework`       | Number of data rows to be reworked             | In Rework                |
| `skipped`         | Number of skipped data rows                    | Skipped                  |
| `done`            | Number of data rows marked as Done             | Done                     |
| `issues`          | Number of data rows with associated issues     | Issues                   |
| `labeled`         | Number of data rows with one or more labels    | -                        |
| `total_data_rows` | Total number of data rows in the project       | -                        |

When `details` is set to true, the output will be the same as before, except for the following:

| Attribute   | Description                                                                   |
| ----------- | ----------------------------------------------------------------------------- |
| `in_review` | `data`: List of task queues in review with the associated number of data rows |
|             | `total`: Number of data rows to be reviewed                                   |
| `in_rework` | `data`: List of task queues in rework with the associated number of data rows |
|             | `total`: Number of data rows to be reworked                                   |

#### Equivalences

The following are equal:

| Attribute                  | Sum of attributes                                                             |
| -------------------------- | ----------------------------------------------------------------------------- |
| `overview.labeled`         | `overview.in_review + overview.in_rework + overview.done`                     |
| `overview.total_data_rows` | `overview.to_label + overview.in_review + overview.in_rework + overview.done` |
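
These equivalences can be used as a sanity check on an overview retrieved without details, where `in_review` and `in_rework` are plain counts. The sketch below is illustrative only: `overview_is_consistent` is a hypothetical helper, and the sample object uses made-up numbers in place of a real `project.get_overview()` result.

```python
from types import SimpleNamespace


def overview_is_consistent(overview) -> bool:
    """Check the two equivalences from the table above on a non-detailed overview."""
    labeled = overview.in_review + overview.in_rework + overview.done
    total = overview.to_label + labeled
    return overview.labeled == labeled and overview.total_data_rows == total


# Stand-in with made-up counts; with a real project, use project.get_overview().
sample_overview = SimpleNamespace(
    to_label=20, in_review=5, in_rework=3, done=72, labeled=80, total_data_rows=100
)
consistent = overview_is_consistent(sample_overview)
print(consistent)
```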

#### Example project overview

<CodeGroup>
  ```python Python theme={null}
  # Example without details
  overview = project.get_overview()

  # Selection of some attributes
  print(f"""
  To label:\t{overview.to_label / overview.total_data_rows:.2%}
  Labeled:\t{overview.labeled / overview.total_data_rows:.2%}
  """)

  # To label: 18.37%
  # Labeled: 81.63%

  # Example with details
  detailed_overview = project.get_overview(details=True)

  # Task queues in review
  print(f"""
  Number of data rows {detailed_overview.total_data_rows},
  In review:,
  \tQueues {detailed_overview.in_review["data"]},
  \tNumber of data rows {detailed_overview.in_review["total"]},
  In rework:
  \tQueues {detailed_overview.in_rework["data"]},
  \tNumber of data rows {detailed_overview.in_rework["total"]}
  """,
  sep="\n")

  # Number of data rows 23220,
  # In review:,
  # Queues [{'Initial review task': 7}],
  # Number of data rows 7,
  # In rework:
  # Queues [{'Rework (all rejected)': 1830}],
  # Number of data rows 1830
  ```
</CodeGroup>

### Manage bulk imports

Use the following methods to manage bulk imports:

* `project.get_mal_prediction_imports()` to retrieve the list of MAL import jobs.
* `project.get_label_imports()` to retrieve the list of ground-truth import jobs.
* `MALPredictionImport.delete()` to delete a MAL import.

<Warning>
  ### Delete label imports

  The `MALPredictionImport.delete()` method can only delete MAL imports. To delete a ground-truth label, use the [Data Rows tab](/docs/data-rows-activity) on the web platform. Deleting an import is permanent and can't be undone.
</Warning>

### Export a project

For complete details, see [Export overview](/reference/export-overview#export-data-rows-from-a-project).

<CodeGroup>
  ```python Export theme={null}
    # The return type of this method is an `ExportTask`, which is a wrapper of a `Task`.
    # Most `Task` features are also present in `ExportTask`.
    export_params= {
      "attachments": True,
      "metadata_fields": True,
      "data_row_details": True,
      "project_details": True,
      "label_details": True,
      "performance_details": True,
      "interpolated_frames": True
    }

    # Note: Filters follow AND logic, so typically using one filter is sufficient.
    filters= {
      "last_activity_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
      "label_created_at": ["2000-01-01 00:00:00", "2050-01-01 00:00:00"],
      "workflow_status": "InReview",
      "batch_ids": ["batch_id_1", "batch_id_2"],
      "data_row_ids": ["data_row_id_1", "data_row_id_2"],
      "global_keys": ["global_key_1", "global_key_2"]
    }

    export_task = project.export(params=export_params, filters=filters)
    export_task.wait_till_done()

    # Stream the export using a callback function
    def json_stream_handler(output: lb.BufferedJsonConverterOutput):
      print(output.json)

    export_task.get_buffered_stream(stream_type=lb.StreamType.RESULT).start(stream_handler=json_stream_handler)

    # Collect all exported data into a list
    export_json = [data_row.json for data_row in export_task.get_buffered_stream()]

    print("file size: ", export_task.get_total_file_size(stream_type=lb.StreamType.RESULT))
    print("line count: ", export_task.get_total_lines(stream_type=lb.StreamType.RESULT))
  ```
</CodeGroup>
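
The date filters in the example above take a two-element list of timestamp strings. A small sketch for building such a range from `datetime` objects follows; `date_range_filter` is a hypothetical helper, and the timestamp format is inferred from the strings in the example rather than taken from a specification.

```python
from datetime import datetime
from typing import Dict, List

# Format inferred from the filter strings in the example above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def date_range_filter(field: str, start: datetime, end: datetime) -> Dict[str, List[str]]:
    """Build a single date-range export filter such as last_activity_at."""
    if start > end:
        raise ValueError("start must not be later than end")
    return {field: [start.strftime(TIMESTAMP_FORMAT), end.strftime(TIMESTAMP_FORMAT)]}


filters = date_range_filter(
    "last_activity_at",
    datetime(2024, 1, 1),
    datetime(2024, 6, 30, 23, 59, 59),
)
print(filters)
# With a real project: project.export(params=export_params, filters=filters)
```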

### Export issues and comments

<CodeGroup>
  ```python Python theme={null}
  import requests

  url = project.export_issues()
  issues = requests.get(url).json()

  # optionally, you can export only the open or resolved issues
  open_issues_url = project.export_issues(status="Open")
  resolved_issues_url = project.export_issues(status="Resolved")
  ```
</CodeGroup>

### Duplicate a project

See the section [Duplicate a project](/docs/create-a-project#duplicate-a-project) for the scope of this method.

<CodeGroup>
  ```python Python theme={null}
  cloned_project = project.clone()
  ```
</CodeGroup>

### Update a project

<CodeGroup>
  ```python Python theme={null}
  project.update(name="<new_project_name>")
  ```
</CodeGroup>

### Delete a project

<Danger>
  **Deleting a project cannot be undone**

  This method deletes the project along with all labels made in the project. This action cannot be reverted.
</Danger>

<CodeGroup>
  ```python Python theme={null}
  project.delete()
  ```
</CodeGroup>

## Attributes

### Get the basics

<CodeGroup>
  ```python Python theme={null}
  # name (str)
  project.name

  # description (str)
  project.description

  # updated at (datetime)
  project.updated_at

  # created at (datetime)
  project.created_at

  # last activity time (datetime)
  project.last_activity_time

  # number of required labels per consensus data row (int)
  project.auto_audit_number_of_labels

  # default percentage of consensus data rows per batch (float)
  project.auto_audit_percentage

  # created by (relationship to User object)
  user = project.created_by()

  # organization (relationship to Organization object)
  organization = project.organization()
  ```
</CodeGroup>

### Get the ontology

<CodeGroup>
  ```python Python theme={null}
  # get the ontology connected to the project (relationship to Ontology object)
  ontology = project.ontology()
  ```
</CodeGroup>

### Get the benchmarks

<CodeGroup>
  ```python Python theme={null}
  # get the benchmarks (relationship to Benchmark objects)
  benchmarks = project.benchmarks()

  # inspect one benchmark
  next(benchmarks)

  # inspect all benchmarks
  for benchmark in benchmarks:
    print(benchmark)

  # for ease of use, you can convert the paginated collection to a list
  list(benchmarks)
  ```
</CodeGroup>

### Get the webhooks

<CodeGroup>
  ```python Python theme={null}
  # get the webhooks connected to the project (relationship to Webhook objects)
  webhooks = project.webhooks()

  # inspect one webhook
  next(webhooks)

  # inspect all webhooks
  for webhook in webhooks:
    print(webhook)

  # for ease of use, you can convert the paginated collection to a list
  list(webhooks)
  ```
</CodeGroup>

### Get data row priority

Use `project.labeling_parameter_overrides` to get a list of labeling parameter overrides (LPOs), which define the priority of each data row in the override list. Use `set_labeling_parameter_overrides` and `update_data_row_labeling_priority` to [modify data row priority](#set-data-row-priority).

<CodeGroup>
  ```python Python theme={null}
  # gets the LPOs created in the project (relationship to LabelingParameterOverride objects)
  lpos = project.labeling_parameter_overrides()

  # inspect one LPO
  next(lpos)

  # inspect all LPOs
  for lpo in lpos:
    print(lpo)

  # for ease of use, you can convert the paginated collection to a list
  list(lpos)

  # Get the data row id
  for lpo in lpos:
    print(lpo)
    print("Data row:", lpo.data_row().uid)
  ```
</CodeGroup>

### Get the number of labels

Use `project.get_label_count()` to return the sum of labels in the different task queues of a project.

<CodeGroup>
  ```python Python theme={null}
  # Return the number of labels in the project
  project.get_label_count()
  ```
</CodeGroup>

## Copy data rows and labels

To copy data rows and labels from a source project to a different project, use the `client.send_to_annotate_from_catalog` method.

Send to Annotate does not currently support consensus projects.

### Parameters

When you send data rows with labels to the destination project, you can include or exclude the following parameters in a Python dictionary. At a minimum, `source_project_id` must be provided:

* `source_project_id`
  * The ID of the source project that contains the data rows and labels.
* `annotations_ontology_mapping`
  * A dictionary that maps the source project's ontology feature schema IDs to the destination project's ontology feature schema IDs. If left empty, only the data rows, without their labels, are sent to the destination project.
  * `{"<source_feature_schema_id>" : "<destination_feature_schema_id>"}`
* `exclude_data_rows_in_project`
  * Excludes data rows that are already in the project.
* `override_existing_annotations_rule`
  * The strategy defines how to handle conflicts in classifications between the data rows that already exist in the project and incoming labels from the source project.
    * Defaults to `ConflictResolutionStrategy.KeepExisting`
    * Options include:
      * `ConflictResolutionStrategy.KeepExisting`
      * `ConflictResolutionStrategy.OverrideWithPredictions`
      * `ConflictResolutionStrategy.OverrideWithAnnotations`
* `batch_priority`
  * The priority of the batch.

<CodeGroup>
  ```python Python theme={null}
  from labelbox.schema.conflict_resolution_strategy import ConflictResolutionStrategy

  send_to_annotate_params = {
    "source_project_id": project.uid,
    "annotations_ontology_mapping": annotation_ontology_mapping, # to be defined
    "exclude_data_rows_in_project": False,
    "override_existing_annotations_rule": ConflictResolutionStrategy.OverrideWithPredictions,
    "batch_priority": 5,
  }

  # Get the ID of the workflow task queue to send the data rows to. If they are sent to the initial labeling queue, the labels will be pre-labels.
  queue_id = [queue.uid for queue in destination_project.task_queues()
    if queue.queue_type == "MANUAL_REVIEW_QUEUE" ][0]

  task = client.send_to_annotate_from_catalog(
    destination_project_id=destination_project.uid,
    task_queue_id=queue_id, # ID of workflow task, set ID to None if you want to send data rows with labels to the Done queue.
    batch_name="Prediction Import Demo Batch",
    data_rows=lb.GlobalKeys(
      global_keys # Provide a list of global keys from source project
    ),
    params=send_to_annotate_params
  )

  task.wait_till_done()

  print(f"Errors: {task.errors}")
  ```
</CodeGroup>
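
Since only `source_project_id` is required, the parameter dictionary can be assembled so that unset options are omitted and fall back to their defaults. This is a sketch, not SDK code: `build_send_to_annotate_params` is a hypothetical helper, and the key names are copied from the example above.

```python
from typing import Any, Dict, Optional


def build_send_to_annotate_params(
    source_project_id: str,
    annotations_ontology_mapping: Optional[Dict[str, str]] = None,
    exclude_data_rows_in_project: Optional[bool] = None,
    override_existing_annotations_rule: Optional[Any] = None,
    batch_priority: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the params dictionary, leaving out every option that was not set."""
    if not source_project_id:
        raise ValueError("source_project_id is required")
    optional = {
        "annotations_ontology_mapping": annotations_ontology_mapping,
        "exclude_data_rows_in_project": exclude_data_rows_in_project,
        "override_existing_annotations_rule": override_existing_annotations_rule,
        "batch_priority": batch_priority,
    }
    params: Dict[str, Any] = {"source_project_id": source_project_id}
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


params = build_send_to_annotate_params(
    "<source_project_id>",
    annotations_ontology_mapping={"<source_feature_schema_id>": "<destination_feature_schema_id>"},
    batch_priority=5,
)
print(params)
```

The resulting dictionary can be passed as the `params` argument of `client.send_to_annotate_from_catalog`.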
