
Conversation

@constantinius
Contributor

Description

Convert messages to the common gen_ai.request.messages structure

Issues

Closes https://linear.app/getsentry/issue/TET-1633/redact-images-openai-openai-agents

@linear

linear bot commented Dec 17, 2025

@constantinius constantinius changed the title from "test(integrations): add test for message conversion" to "fix(openai): convert input message format" Dec 17, 2025
@constantinius constantinius marked this pull request as ready for review January 8, 2026 08:34
@constantinius constantinius requested a review from a team as a code owner January 8, 2026 08:34
@constantinius constantinius changed the title from "fix(openai): convert input message format" to "fix(integrations): openai/openai-agents: convert input message format" Jan 8, 2026
Base automatically changed from constantinius/fix/redact-message-parts-type-blob to master January 13, 2026 09:56
@github-actions
Contributor

github-actions bot commented Jan 13, 2026

Semver Impact of This PR

🟢 Patch (bug fixes)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


New Features ✨

  • feat(ai): add parse_data_uri function to parse a data URI by constantinius in #5311
  • feat(asyncio): Add on-demand way to enable AsyncioIntegration by sentrivana in #5288

Bug Fixes 🐛

  • fix(ai): redact message parts content of type blob by constantinius in #5243
  • fix(clickhouse): Guard against module shadowing by alexander-alderman-webb in #5250
  • fix(gql): Revert signature change of patched gql.Client.execute by alexander-alderman-webb in #5289
  • fix(grpc): Derive interception state from channel fields by alexander-alderman-webb in #5302
  • fix(integrations): openai/openai-agents: convert input message format by constantinius in #5248
  • fix(litellm): Guard against module shadowing by alexander-alderman-webb in #5249
  • fix(pure-eval): Guard against module shadowing by alexander-alderman-webb in #5252
  • fix(ray): Guard against module shadowing by alexander-alderman-webb in #5254
  • fix(threading): Handle channels shadowing by sentrivana in #5299
  • fix(typer): Guard against module shadowing by alexander-alderman-webb in #5253
  • fix: Send client reports for span recorder overflow by sentrivana in #5310

Documentation 📚

  • docs(metrics): Remove experimental notice by alexander-alderman-webb in #5304
  • docs: Update Python versions banner in README by sentrivana in #5287

Internal Changes 🔧

Release

  • ci(release): Bump Craft version to fix issues by BYK in #5305
  • ci(release): Switch from action-prepare-release to Craft by BYK in #5290

Other

  • chore(gen_ai): add auto-enablement for google genai by shellmayr in #5295
  • chore: Add type for metric units by sentrivana in #5312
  • ci: Update tox and handle generic classifiers by sentrivana in #5306

🤖 This preview updates automatically when you update the PR.

Contributor

@sentrivana sentrivana left a comment


Looks ok to me, two things:

  • Can we add a check to the tests that we're not modifying the user's messages? Either as a new test or just adding an assert to the tests added in this PR
  • I assume there is no way to dedupe some of the trimming logic between OpenAI agents and OpenAI because the format is different?
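The first request above (checking that the user's messages are left untouched) could be sketched roughly as follows. This is only an illustration: `convert_messages` is a hypothetical stand-in for the conversion under test, not the SDK's actual function, and `check_input_not_modified` is an assumed helper name.

```python
import copy


def convert_messages(messages):
    """Hypothetical stand-in for the conversion under test (not the SDK's function).

    It works on a deep copy so the caller's list and dicts are left alone.
    """
    converted = copy.deepcopy(messages)
    for message in converted:
        content = message.get("content")
        if isinstance(content, str):
            # Wrap plain strings into a list of text parts.
            message["content"] = [{"type": "text", "text": content}]
    return converted


def check_input_not_modified(convert, messages):
    """Return True if calling `convert` leaves `messages` unchanged."""
    snapshot = copy.deepcopy(messages)
    convert(messages)
    return messages == snapshot


user_messages = [{"role": "user", "content": "hi"}]
assert check_input_not_modified(convert_messages, user_messages)
```

In an existing test, the same idea reduces to taking a `copy.deepcopy` of the input before the call and asserting equality afterwards.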

Comment on lines +226 to +228
if item.get("type") == "image_url":
    image_url = item.get("image_url") or {}
    url = image_url.get("url", "")

Bug: The code will raise an AttributeError if item.get("image_url") returns a string, because the or {} fallback is not triggered and .get() is called on a string.
Severity: HIGH

Suggested Fix

Add a check to ensure image_url is a dictionary before calling .get() on it. A similar pattern is used elsewhere in the codebase: url = image_url.get("url", "") if isinstance(image_url, dict) else str(image_url). This will handle both dictionary and string formats gracefully.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: sentry_sdk/integrations/openai.py#L226-L228

Potential issue: In the `_convert_message_parts` function, the code processes message
parts to extract an `image_url`. The line `image_url = item.get("image_url") or {}` does
not correctly handle cases where the value of `image_url` is a string instead of a
dictionary. If a string is provided (e.g., `{"type": "image_url", "image_url":
"https://..."}`), the subsequent call to `image_url.get("url", "")` will raise an
`AttributeError`, as strings do not have a `.get()` method. This causes an unhandled
exception within the Sentry integration, preventing the span from being processed
correctly.
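A minimal sketch of the suggested guard, assuming message parts are plain dicts. The helper name `extract_image_url` is hypothetical and only serves to isolate the pattern quoted in the suggested fix; it is not a function from the SDK.

```python
def extract_image_url(item):
    """Hypothetical helper: read the URL from an image_url message part.

    Accepts both the dict form {"image_url": {"url": "..."}} and the
    string form {"image_url": "..."} without raising AttributeError.
    """
    image_url = item.get("image_url") or {}
    if isinstance(image_url, dict):
        return image_url.get("url", "")
    # Any non-dict value (for example a bare string) is used as the URL.
    return str(image_url)


dict_part = {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}}
string_part = {"type": "image_url", "image_url": "https://example.com/b.png"}

assert extract_image_url(dict_part) == "https://example.com/a.png"
assert extract_image_url(string_part) == "https://example.com/b.png"
```

The `or {}` fallback still covers a missing or `None` value, while the `isinstance` check covers the string case that the original lines did not.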


@constantinius
Copy link
Contributor Author

> Can we add a check to the tests that we're not modifying the user's messages? Either as a new test or just adding an assert to the tests added in this PR

Done

> I assume there is no way to dedupe some of the trimming logic between OpenAI agents and OpenAI because the format is different?

Looking into that. Cursor says no. But I'm not sure tbh

content = _transform_openai_agents_message_content(original_input)
if not isinstance(content, list):
    content = [{"text": str(content), "type": "text"}]
messages.append({"content": content, "role": "user"})

Non-dict content items produce invalid message structure

Low Severity

When original_input is a list containing non-dict items (like strings or numbers), _transform_openai_agents_message_content returns them unchanged. The calling code only wraps the result in text format when it's NOT a list, so lists with non-dict items like ["hello", "world"] become invalid content structures instead of proper [{"text": "hello", "type": "text"}, ...] format. The old code used safe_serialize() to handle any input type safely, producing valid message content for all cases.
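One way to address this, sketched under the assumption that every non-dict item should become a text part. `normalize_content` is a hypothetical name, and this is not necessarily how the PR resolved the issue.

```python
def normalize_content(content):
    """Hypothetical helper: always return a list of dict message parts."""
    if not isinstance(content, list):
        # Scalars (strings, numbers, ...) become a single text part.
        return [{"type": "text", "text": str(content)}]

    normalized = []
    for item in content:
        if isinstance(item, dict):
            # Dict parts are passed through untouched.
            normalized.append(item)
        else:
            # Non-dict list items are wrapped individually.
            normalized.append({"type": "text", "text": str(item)})
    return normalized


assert normalize_content(["hello", "world"]) == [
    {"type": "text", "text": "hello"},
    {"type": "text", "text": "world"},
]
```

Because a new list is built, the caller's original list is not modified, which also fits the earlier request not to mutate user messages.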

Additional Locations (1)

