fix: use UTF-8 encoding for subprocess output on Windows #90

Open
taipoweredpm wants to merge 2 commits into microsoft:main from taipoweredpm:fix/windows-unicode-encoding
Conversation

@taipoweredpm
Contributor

Summary

Fixes #89 — Unicode characters (emoji, em-dash, etc.) are garbled on Windows when passed between agents or posted via tools.

Root Cause

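On Windows, Python decodes subprocess pipes in text mode (text=True) with the locale's preferred encoding unless an encoding is given, and that is cp1252 (the "charmap" codec) on many systems. Non-ASCII characters that the child wrote as UTF-8 are then mangled in script output and in the results of other subprocess calls.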
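
For illustration only, the mismatch described above can be reproduced without spawning a process at all. The sketch below is not taken from the PR; it simply encodes an em-dash as UTF-8 (what a UTF-8 child would write) and decodes it as cp1252 (what a parent using the Windows default would do).

```python
# The em-dash (U+2014) as the UTF-8 bytes a child process might write.
raw = "\u2014".encode("utf-8")  # three bytes: E2 80 94

# Decoding those bytes as cp1252 turns one character into three unrelated ones.
garbled = raw.decode("cp1252")

# Decoding with the encoding the bytes were written in restores the text.
restored = raw.decode("utf-8")

print(len(garbled), len(restored))
```

The three-character result is the familiar mojibake that the linked issue describes for emoji and em-dashes.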

Changes

  • executor/script.py: Set PYTHONUTF8=1 in subprocess env so child Python processes use UTF-8
  • cli/update.py: Added encoding="utf-8" to subprocess.run()
  • mcp_auth.py: Added encoding="utf-8" to subprocess.run()
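
As a rough sketch of the parent-side change listed for cli/update.py and mcp_auth.py, the example below passes encoding="utf-8" to subprocess.run(). The child command is a stand-in chosen for this illustration (a Python one-liner that writes raw UTF-8 bytes), not the command those files actually run.

```python
import subprocess
import sys

# Stand-in child: writes raw UTF-8 bytes, bypassing its own text-layer encoding.
child_code = r"import sys; sys.stdout.buffer.write('caf\u00e9 \u2014 ok'.encode('utf-8'))"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    encoding="utf-8",  # parent side: decode the pipe as UTF-8, not the locale default
    check=True,
)
output = result.stdout
```

Passing encoding= implies text mode, so text=True is not needed alongside it. This only controls how the parent decodes; it does not change what the child writes.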

Testing

All 75 existing tests pass. Lint clean.

On Windows, Python defaults to cp1252 (charmap) encoding for subprocess
pipes and text=True mode. This causes Unicode characters (emoji, em-dash,
etc.) to be garbled when passed between agents or posted via tools.

Changes:
- script.py: Set PYTHONUTF8=1 in subprocess environment so child Python
  processes use UTF-8 instead of system default encoding
- update.py: Add encoding='utf-8' to subprocess.run call
- mcp_auth.py: Add encoding='utf-8' to subprocess.run call

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@taipoweredpm
Contributor Author

@microsoft-github-policy-service agree company="Microsoft"

Collaborator

@jrob5756 jrob5756 left a comment

One small comment.

Comment on lines +95 to +99
# Always set PYTHONUTF8=1 so child Python processes use UTF-8 encoding
# instead of the system default (cp1252 on Windows), preventing garbled
# Unicode characters in script output.
base_env = {**os.environ, "PYTHONUTF8": "1"}
env = {**base_env, **agent.env} if agent.env else base_env
Collaborator

script.py fixes the child side (env var), while update.py and mcp_auth.py only fix the parent side (encoding= param). For Python-based child CLIs, both sides should be addressed.

Contributor Author

Good catch - addressed in e17ac2f. Both update.py and mcp_auth.py now set PYTHONUTF8=1 in the subprocess env alongside the existing encoding="utf-8" param, matching the approach in script.py. All three call sites are now consistent with both parent-side and child-side encoding fixes.
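
A minimal sketch of what "both sides" could look like at a single call site follows. The helper name run_utf8 and its extra_env parameter are hypothetical, introduced only for this example; the actual code in e17ac2f may be structured differently.

```python
import os
import subprocess
import sys


def run_utf8(cmd, extra_env=None):
    """Hypothetical helper: use UTF-8 on both ends of the subprocess pipe."""
    # Child side: PYTHONUTF8=1 makes a Python child encode its stdout as UTF-8.
    env = {**os.environ, "PYTHONUTF8": "1"}
    if extra_env:
        # Caller-supplied variables win, mirroring the agent.env merge in script.py.
        env.update(extra_env)
    # Parent side: decode the captured pipes as UTF-8.
    return subprocess.run(
        cmd, env=env, capture_output=True, encoding="utf-8", check=True
    )


# Stand-in child that reports its stdout encoding, then prints an em-dash.
child_code = r"import sys; print(sys.stdout.encoding); print('\u2014')"
result = run_utf8([sys.executable, "-c", child_code])
lines = result.stdout.splitlines()
```

PYTHONUTF8 only affects children that are Python interpreters (3.7 or later); for other CLIs the encoding= argument is the part that matters.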

Address review feedback: update.py and mcp_auth.py now set PYTHONUTF8=1
in the child process environment, matching the approach in script.py.
This ensures both parent-side (encoding='utf-8') and child-side
(PYTHONUTF8=1) encoding fixes are applied consistently across all
subprocess call sites.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Windows: Unicode characters in agent output are garbled (charmap encoding)

3 participants