FEAT normalize messages before sending#1613

Open
hannahwestra25 wants to merge 17 commits into microsoft:main from hannahwestra25:hawestra/normalize_send_prompt

Conversation

@hannahwestra25
Contributor

Description

Utilize the Normalization Pipeline in the Target Send Path

PR 4 of the TargetConfiguration roadmap

Problem

The TargetConfiguration.normalize_async pipeline (system-squash, history-squash, etc.) was fully built in this PR but never called. Every target independently fetched conversation history, appended the current message, and sent it to the API — some with ad-hoc normalization (AzureMLChatTarget), most with none at all. This meant the centralized normalization pipeline was dead code, and normalization behavior was inconsistent across targets.

Solution

Wire the normalization pipeline into the send path so that every prompt passes through configuration.normalize_async() before reaching the target's API call. This is done by making send_prompt_async a concrete template method on PromptTarget that validates, fetches conversation from memory, runs the normalization pipeline, and delegates to a new _send_prompt_target_async abstract method for wire-format-specific logic.
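As a rough sketch of the template-method shape described above (the type names, the validation, and the memory helper below are simplified stand-ins, not PyRIT's actual signatures):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


# Minimal stand-ins; the real Message carries message_pieces,
# conversation ids, labels, etc.
@dataclass
class Message:
    role: str
    content: str


class TargetConfiguration:
    async def normalize_async(self, *, messages: List[Message]) -> List[Message]:
        # Placeholder for the real pipeline (system-squash, history-squash, ...).
        return messages


class PromptTarget(ABC):
    def __init__(self, configuration: TargetConfiguration) -> None:
        self.configuration = configuration

    async def send_prompt_async(self, *, message: Message) -> Message:
        # Template method: validate, fetch history from memory, run the
        # normalization pipeline, then delegate to the wire-format hook.
        if not message.content:
            raise ValueError("Message must not be empty.")
        conversation = self._fetch_conversation_from_memory() + [message]
        normalized = await self.configuration.normalize_async(messages=conversation)
        return await self._send_prompt_target_async(conversation=normalized)

    def _fetch_conversation_from_memory(self) -> List[Message]:
        return []  # the real implementation reads prior turns from memory

    @abstractmethod
    async def _send_prompt_target_async(self, *, conversation: List[Message]) -> Message:
        """Wire-format-specific logic; subclasses override this, not send_prompt_async."""
```

Target authors then only implement `_send_prompt_target_async` and always receive a conversation that has already been through normalization.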

Changes

  • PromptTarget.send_prompt_async: Now a concrete method that calls self.configuration.normalize_async(messages=...) and passes the result to _send_prompt_target_async
  • All 20 target subclasses: Renamed send_prompt_async to _send_prompt_target_async, removed duplicated validation/memory-fetch boilerplate; they now receive the pre-normalized conversation directly
  • AzureMLChatTarget: message_normalizer parameter deprecated with auto-translation to TargetConfiguration(policy={SYSTEM_PROMPT: ADAPT}); will be removed in v0.14.0
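The AzureMLChatTarget deprecation path could be sketched roughly as below. All names here (PolicyKey, PolicyAction, the constructor signatures) are illustrative stand-ins, not PyRIT's actual API:

```python
import warnings
from enum import Enum


# Hypothetical minimal stand-ins for the real policy types.
class PolicyKey(Enum):
    SYSTEM_PROMPT = "system_prompt"


class PolicyAction(Enum):
    ADAPT = "adapt"


class TargetConfiguration:
    def __init__(self, *, policy=None):
        self.policy = policy or {}


class AzureMLChatTarget:
    def __init__(self, *, message_normalizer=None, configuration=None):
        if message_normalizer is not None:
            # Deprecated path: auto-translate the old parameter into the
            # new configuration policy and warn the caller.
            warnings.warn(
                "message_normalizer is deprecated and will be removed in "
                "v0.14.0; use TargetConfiguration(policy=...) instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            configuration = TargetConfiguration(
                policy={PolicyKey.SYSTEM_PROMPT: PolicyAction.ADAPT}
            )
        self.configuration = configuration or TargetConfiguration()
```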

Breaking Changes

  • Target authors must override _send_prompt_target_async instead of send_prompt_async

Tests and Documentation

  • Tests: Updated all mocks/stubs to new signature; added test_normalize_async_integration.py (395 lines) covering normalize-is-called, normalized-conversation-is-used, memory-not-mutated, and legacy deprecation paths

wip: running integration tests

Comment thread pyrit/prompt_target/azure_ml_chat_target.py
Contributor

@romanlutz romanlutz left a comment


Don't we need tests for this?

Nvm didn't render first time I looked!

Comment thread pyrit/prompt_target/azure_blob_storage_target.py Outdated
Comment thread pyrit/prompt_target/http_target/http_target.py Outdated
Comment thread pyrit/prompt_target/http_target/http_target.py Outdated
Comment thread pyrit/prompt_target/openai/openai_response_target.py
Comment thread pyrit/prompt_target/common/prompt_target.py
Comment thread pyrit/prompt_target/openai/openai_realtime_target.py Outdated
Comment thread pyrit/prompt_target/openai/openai_response_target.py Outdated
@hannahwestra25 hannahwestra25 marked this pull request as ready for review April 16, 2026 15:15
Comment thread pyrit/prompt_target/common/prompt_target.py Outdated
"""
if not message.message_pieces:
raise ValueError("Message must contain at least one message piece. Received: 0 pieces.")
normalized_conversation = await self._get_normalized_conversation_async(message=message)
Contributor

@rlundeen2 rlundeen2 Apr 16, 2026


I think there is a bug here. A nasty one if I understand it correctly.

_get_normalized_conversation can create new messages. As an example, HistorySquashNormalizer creates a new message with a new conversation_id, dropped labels, a new attack_identifier, etc. Targets then use this to construct the response, so the response inherits the garbage metadata. The implication is that the response is not added to memory as part of the conversation, and we lose other metadata as well.

One fix might be in _get_normalized_conversation_async if we re-stamp all the original message metadata.

     if normalized:
         self._stamp_lineage(source=message, target_message=normalized[-1])

Where stamp_lineage copies all the metadata onto every piece in the target message.
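A minimal sketch of that re-stamping, assuming a simplified MessagePiece with only the lineage fields mentioned above (the real type carries more):

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical minimal shapes for illustration only.
@dataclass
class MessagePiece:
    content: str
    conversation_id: str = ""
    labels: Dict[str, str] = field(default_factory=dict)
    attack_identifier: str = ""


@dataclass
class Message:
    message_pieces: List[MessagePiece]


def stamp_lineage(*, source: Message, target_message: Message) -> None:
    """Copy lineage metadata from the original message onto every piece
    of the normalizer-produced message, in place."""
    src = source.message_pieces[0]
    for piece in target_message.message_pieces:
        piece.conversation_id = src.conversation_id
        piece.labels = dict(src.labels)
        piece.attack_identifier = src.attack_identifier
```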

Either way, can we have a test with a conversation, run a normalizer that changes it (like squash_messages) and verify that the conversation_id and metadata of the response is accurate? It might be good to have this test first to verify the bug exists. And then run the test again to verify the fix

Contributor Author


Ahh yes, so I added a few tests to test_prompt_target for this scenario (which initially repro'd the issue) and added a propagate_lineage function. One caveat: right now all the normalizers update the last message, so I'm only updating that one with the lineage data. That's technically not an invariant of normalizers (to only produce one message), so if a user (or we) created a normalizer that produced multiple messages, the garbage metadata would still exist on all but the last message. I think to fix that we'd need some way of distinguishing the new normalized messages from the rest of the conversation. I'm not sure how much of an issue this could be and figured we could just propagate onto the last message for now, but curious if you have thoughts.

Contributor


I like it! Also good job seeing the edge case.

But I could also see a message_normalizer that splits things into multiple messages... We may want a defense in depth that logger.warns if the message_normalizer result has more messages than the input. And maybe we still stamp the conversation_id so it's at least associated.
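A minimal sketch of that defense-in-depth warning (function name and shapes hypothetical):

```python
import logging
from typing import List

logger = logging.getLogger(__name__)


def warn_if_expanded(original: List[object], normalized: List[object]) -> None:
    """Hypothetical check: warn when a normalizer produced more messages
    than it received, since lineage is only re-stamped onto the last one."""
    if len(normalized) > len(original):
        logger.warning(
            "Normalizer expanded the conversation from %d to %d messages; "
            "lineage metadata is only re-stamped on the last message.",
            len(original),
            len(normalized),
        )
```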

WDYT?

Comment thread pyrit/prompt_target/openai/openai_realtime_target.py Outdated
Contributor

@rlundeen2 rlundeen2 left a comment


Approved, but I have one suggestion for an insidious bug; not a blocker but worth thinking through

@rlundeen2 rlundeen2 self-assigned this Apr 16, 2026