Feat: Add base64 image support for results. by d42me · Pull Request #885 · PrimeIntellect-ai/verifiers

d42me · 2026-02-10T05:28:06Z

Description

Add base64 image store support for results

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Note

Medium Risk
Touches the core output-serialization and evaluation plumbing (including worker RPC types), so regressions could affect saved datasets or server-mode evaluations; changes are guarded by explicit modes and covered by new tests.

Overview
Adds an opt-in path to persist images in saved eval results by extracting data:image/*;base64,... payloads into a per-message images field while still rendering [image] placeholders in content.
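As a rough illustration of the extraction described above (not the PR's actual code), the sketch below pulls base64 payloads out of OpenAI-style content parts into an `images` list and leaves `[image]` placeholders in `content`. The names `split_images` and `DATA_URI_RE`, the `"\n\n"` join, and the choice to store only the payload rather than the full data URI are assumptions made for this example.

```python
import re

# Hypothetical pattern: data:image/<subtype>;base64,<payload>
DATA_URI_RE = re.compile(r"^data:image/[\w.+-]+;base64,(?P<payload>.+)$", re.DOTALL)


def split_images(message: dict) -> dict:
    """Return a copy of `message` whose image parts are replaced by "[image]"
    in `content`, with any base64 payloads collected under `images`."""
    content = message.get("content")
    if not isinstance(content, list):
        return dict(message)  # plain-string content has nothing to extract

    chunks: list[str] = []
    images: list[str] = []
    for part in content:
        if part.get("type") == "text":
            chunks.append(part.get("text", ""))
        elif part.get("type") == "image_url":
            url = part.get("image_url", {}).get("url", "")
            match = DATA_URI_RE.match(url)
            if match:
                images.append(match.group("payload"))
            # The placeholder is rendered whether or not a payload was found.
            chunks.append("[image]")

    # Keep every other message field (role, name, ...) untouched.
    out = {key: value for key, value in message.items() if key != "content"}
    out["content"] = "\n\n".join(chunks)
    if images:
        out["images"] = images
    return out
```

With this shape, readers of the saved results still see `[image]` in the text, while the image data remains recoverable from the per-message `images` field.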

Plumbs save_image_mode/image_mode and max_image_base64_chars through eval CLI/config, Environment/EnvGroup generation, and worker client/server request types; metadata now records save_image_mode, and saving defaults to placeholder when save_results is off.

Hardens message serialization by validating data-URI base64 and size limits, and updates sanitize_tool_calls to preserve already-serialized tool-call strings and other message fields; new tests cover base64 extraction, limit enforcement, CLI saving, and sanitization behavior.
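The validation step could look roughly like the following sketch. It is an assumption-laden stand-in, not the PR's `_extract_data_uri_base64`: the function name `extract_base64_payload`, its signature, and the error messages are hypothetical, and `max_chars` only mirrors the idea behind `max_image_base64_chars`.

```python
from __future__ import annotations

import base64
import binascii


def extract_base64_payload(url: str, max_chars: int | None = None) -> str:
    """Return the base64 payload of a data:image/*;base64,... URI.

    Raises ValueError for non-data URIs, oversized payloads, or payloads
    that are not valid base64.
    """
    header, sep, payload = url.partition(",")
    if not sep or not header.startswith("data:image/") or not header.endswith(";base64"):
        raise ValueError("expected a data:image/*;base64,... URI")
    if not payload:
        raise ValueError("empty base64 payload")
    if max_chars is not None and len(payload) > max_chars:
        raise ValueError(f"payload has {len(payload)} chars, limit is {max_chars}")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid base64 payload") from exc
    return payload
```

Checking the length before decoding keeps an oversized payload from being decoded at all.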

Written by Cursor Bugbot for commit a36c44d. This will update automatically on new commits.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-02-10T05:39:44Z

verifiers/envs/environment.py

        results_path: Path | None = None,
        state_columns: list[str] | None = None,
        save_results: bool = False,
+        image_mode: str = ImageMode.BASE64.value,


Default image mode breaks existing non-data-URI callers

Medium Severity

The public API methods generate, evaluate, evaluate_sync, run_rollout, and run_group all default image_mode to ImageMode.BASE64, while the lower-level utilities state_to_output and states_to_outputs default to ImageMode.PLACEHOLDER. With the BASE64 default, existing callers that omit image_mode (for example verifiers/gepa/adapter.py, which calls generate() without it) will now fail with a ValueError if any prompt contains an image_url with an HTTPS URL rather than a data URI, since _extract_data_uri_base64 requires data: URIs. Before this change, all images were silently replaced with [image]. The eval CLI path correctly overrides the mode to PLACEHOLDER when save_results is false, but the public Python API does not.

Additional Locations (2)

verifiers/envs/environment.py#L848-L849

verifiers/utils/save_utils.py#L143-L144
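One possible way to avoid the break, sketched under assumptions: apply the eval CLI's rule in the public API as well, so base64 is only used when the caller both saves results and asks for it. The helper `resolve_image_mode` is hypothetical, and the enum's string values (`"base64"`, `"placeholder"`) are guesses; only the member names `ImageMode.BASE64` and `ImageMode.PLACEHOLDER` appear in this PR.

```python
from __future__ import annotations

from enum import Enum


class ImageMode(str, Enum):
    # String values are assumed for this sketch.
    BASE64 = "base64"
    PLACEHOLDER = "placeholder"


def resolve_image_mode(save_results: bool, image_mode: str | None = None) -> ImageMode:
    """Pick the effective image mode for a generate/evaluate call.

    Placeholder is used unless results are being saved AND the caller
    explicitly requested another mode.
    """
    if not save_results or image_mode is None:
        return ImageMode.PLACEHOLDER
    return ImageMode(image_mode)  # raises ValueError for unknown modes
```

Under this rule a caller that never passes image_mode keeps the old `[image]` behavior, including for HTTPS image URLs, while base64 persistence stays opt-in.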

@willccbb What do you think here? Can we introduce this breaking change for better DX, or should we keep PLACEHOLDER as the default?

Add base64 image support for results.

a36c44d

d42me requested a review from hallerite February 10, 2026 05:28

cursor bot reviewed Feb 10, 2026

d42me wants to merge 1 commit into PrimeIntellect-ai:main from d42me:feature/results-base64-image-support
