Code and grammar assistance

Learn how to use in-editor tools like AI critic and code runner to identify grammar errors and validate code in multimodal chat evaluation and prompt and response generation projects.

Labelbox offers in-editor assistance tools to help identify and fix grammar and code issues for AI model evaluation projects, including multimodal chat evaluation and prompt and response generation. Currently, the following tools are available:

  • AI critic: Detects code and grammar issues and suggests improvements for your input.
  • Code runner: Allows you to run and test code in both model and labeler responses.

AI critic

AI critic provides in-editor AI assistance that helps identify grammar issues and code inconsistencies. It can assist in areas such as the following:

  • Refining code style and consistency: For tasks requiring style adjustments, AI critic identifies areas to enhance consistency and clarity, ensuring the code adheres to specified style guidelines without changing its logic.
  • Ensuring dependency clarity: In projects where an LLM generates code that depends on non-default packages or libraries, AI critic reviews your response to verify that the code can compile and run without missing dependencies or setup issues.
  • Evaluating documentation quality: When reviewing code with docstrings (comments describing a function's purpose), AI critic assesses whether the docstring is clear and accurately represents the expected outcomes shown in unit tests, as illustrated in the sketch after this list.
  • Checking grammar: For natural language tasks that don’t involve code, AI critic helps identify and correct grammar issues.
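For instance, when evaluating documentation quality, AI critic can compare a docstring against the unit tests that accompany it. The sketch below is a hypothetical labeler response, not output from the product; the function and test names are illustrative. A docstring that claimed the result has the same length as the input would conflict with the test and could be flagged.

```python
# Hypothetical labeler response that AI critic might review for documentation quality.
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points.

    The result has len(values) - window + 1 entries, which matches the
    expectation asserted in the unit test below.
    """
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def test_moving_average():
    # Three input-window pairs produce three averaged values.
    assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
```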

Both the multimodal chat editor and the prompt and response generation editor automatically surface AI critic suggestions when you add or edit prompt and response fields. AI critic is available for all workflow tasks, including labeling, reviewing, and reworking.

Use AI critic

To use AI critic for your multimodal chat or prompt and response generation projects:

  1. In the project Overview > Workflow tasks, click Start labeling, Start reviewing, or Start reworking.
  2. Depending on your project type, add a prompt and response. In the Markdown editor, AI critic automatically generates suggestions for your input.
  3. Click each AI suggestion to review the associated comments explaining it. Then, click PREVIEW, APPLY, or DISCARD to manage the suggested edit.

When you make changes in the Markdown editor, AI critic runs automatically and provides real-time suggestions. You can also click the refresh button to re-generate AI critic comments.

Example AI critic comments and suggestions

Code runner

Code runner lets you run and test code in model-generated and labeler-written responses for live and offline multimodal chat projects. It detects the programming language, sends the code to the appropriate runtime environment, and returns results in a structured format that includes standard output (stdout), standard error (stderr), execution time, and runtime warnings or errors. The environment and related resources are cleared after execution to ensure security and performance.
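As an illustration, consider the hypothetical response snippet below (not taken from any project). Executing it writes the printed result to stdout and the warning to stderr, which is how the Outputs field separates normal results from warnings and errors.

```python
# Hypothetical response code that a labeler might execute with code runner.
import warnings

def scale(values, factor=2):
    """Multiply each value in `values` by `factor`."""
    warnings.warn("factor defaults to 2; pass it explicitly", UserWarning)
    return [v * factor for v in values]

# stdout receives the printed list [2, 4, 6]; stderr receives the UserWarning.
print(scale([1, 2, 3]))
```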

Currently, the in-editor code runner supports the following programming languages:

  • Python
  • JavaScript

📘

Beta feature

Code runner is a beta feature.

Use code runner

To use code runner for your code-related multimodal chat projects:

  1. In the project Overview > Workflow tasks, click Start labeling.
  2. Add a prompt and a response.
  3. In both the Model response field and Write your own response Markdown editor, code runner automatically detects the runtime based on your code language. You can select a different runtime from the dropdown if needed. Click RUN CODE to execute model-generated code.
  4. View the Outputs field at the bottom of the response, which displays code execution results or errors.
  5. Identify model errors and troubleshoot your code using the results from the Outputs field, as shown in the example after this list.
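For example, suppose a model response contains the hypothetical snippet below. Running it with code runner raises a ZeroDivisionError, which surfaces in the Outputs field and points to the unguarded division that needs to be fixed or penalized in your evaluation.

```python
# Hypothetical model response with a bug that code runner would expose.
def average(values):
    """Return the arithmetic mean of `values`."""
    return sum(values) / len(values)  # fails for an empty list

# Executing this line raises ZeroDivisionError; the error appears in the Outputs field.
print(average([]))
```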

Example code runner outputs