Add RL unit tests: normalize and match_approximately by hengtaoguo · Pull Request #3379 · AI-Hypercomputer/maxtext

hengtaoguo · 2026-03-11T18:39:31Z

Description

Add two unit tests for RL utils:

normalize_final_answer: Various numeric/string formats ("1,000", "$100", "\boxed{100}", etc.)
match_format_approximately: Test reward/penalty logic with 0-4 matching tokens
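To make the two targets concrete, here is a minimal, self-contained sketch of the kind of cases described above. The helpers `normalize_answer_sketch` and `format_score_sketch`, the four tag strings, and the ±0.5 weights are hypothetical stand-ins chosen for illustration. They are not the MaxText functions or config values, whose exact behavior is defined by the code under test.

```python
import re

# Hypothetical tag strings; the real tokens come from the RL config.
TOKENS = ("<reasoning>", "</reasoning>", "<answer>", "</answer>")


def normalize_answer_sketch(text: str) -> str:
    """Illustrative stand-in, NOT MaxText's normalize_final_answer."""
    text = text.strip()
    # Unwrap a single, non-nested \boxed{...} wrapper if one is present.
    boxed = re.search(r"\\boxed\{([^{}]*)\}", text)
    if boxed:
        text = boxed.group(1)
    # Drop currency signs and thousands separators.
    return text.replace("$", "").replace(",", "").strip()


def format_score_sketch(completion: str, reward: float = 0.5,
                        penalty: float = -0.5) -> float:
    """Illustrative stand-in, NOT MaxText's match_format_approximately.

    Each tag that appears exactly once earns `reward`; a tag that is
    missing or repeated costs `penalty`. Both weights are assumptions.
    """
    return sum(reward if completion.count(tok) == 1 else penalty
               for tok in TOKENS)
```

Under these assumptions, a completion with all four tags scores 2.0, one with none scores -2.0, and intermediate counts fall between, which is the 0-4 token range the second test exercises.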

Tests

CI

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-03-11T18:43:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


github-actions · 2026-03-11T18:58:38Z

🤖 Hi @hengtaoguo, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions

## 📋 Review Summary

This PR successfully adds comprehensive unit tests for normalize_final_answer and match_format_approximately in the RL utilities. The added tests correctly evaluate formatting rewards and penalties as well as string normalization edge cases, contributing to better test coverage for RL post-training functionality.

🔍 General Feedback

Positive Highlight: The test coverage for normalize_final_answer correctly anticipates and verifies several tricky LaTeX edge cases such as wrapper extraction and shorthand expansions.
Testing Pattern: Consider breaking down monolithic test methods with multiple assertions (as seen in test_partial_format_scores and test_normalize_final_answer) into smaller, isolated test cases to improve diagnosability.
The mixing of unittest.TestCase with @pytest.mark.cpu_only remains consistent with the rest of the testing suite in this file, so no immediate action is needed on that front.
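One possible way to act on the testing-pattern suggestion without multiplying test methods is `unittest`'s `subTest` context manager, which reports each failing case separately instead of stopping at the first failed assertion. The sketch below is only an illustration: `strip_commas`, `StripCommasTest`, and `run_suite` are hypothetical names, not code from `tests/unit/rl_utils_test.py`.

```python
import unittest


def strip_commas(text: str) -> str:
    """Hypothetical helper that gives the test something to check."""
    return text.replace(",", "")


class StripCommasTest(unittest.TestCase):
    # Each (input, expected) pair becomes its own reported sub-test.
    CASES = [("1,000", "1000"), ("12", "12"), ("1,000,000", "1000000")]

    def test_cases(self):
        for raw, expected in self.CASES:
            # A failure here is reported with raw=... and the loop continues.
            with self.subTest(raw=raw):
                self.assertEqual(strip_commas(raw), expected)


def run_suite() -> unittest.TestResult:
    """Load and run the test case programmatically."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StripCommasTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Splitting into separate methods, as the review suggests, gives the same isolation with more descriptive test names; `subTest` trades that for less repetition.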

tests/unit/rl_utils_test.py

richjames0

lgtm

hengtaoguo force-pushed the hengtaoguo-rl branch from da0bda8 to 1e9ab7e Compare March 11, 2026 18:50

hengtaoguo marked this pull request as ready for review March 11, 2026 18:51

hengtaoguo requested review from A9isha, NicoGrande, NuojCheng, RissyRan, SurbhiJainUSC, aireenmei, bvandermoon, dipannita08, gagika, gobbleturk, igorts-git, jesselu-google, jiangjy1982, khatwanimohit, richjames0, shralex, suexu1025 and vipannalla as code owners March 11, 2026 18:51

hengtaoguo added the gemini-review label Mar 11, 2026

github-actions bot reviewed Mar 11, 2026

tests/unit/rl_utils_test.py (review thread resolved)

hengtaoguo force-pushed the hengtaoguo-rl branch from f8ebd41 to 3daa122 Compare March 11, 2026 19:07

Add RL unit tests: normalize and match_approximately

241383e

hengtaoguo force-pushed the hengtaoguo-rl branch from ab47c30 to 241383e Compare March 11, 2026 19:13

richjames0 approved these changes Mar 11, 2026


xuefgu approved these changes Mar 11, 2026


hengtaoguo added the pull ready label Mar 11, 2026

copybara-service bot merged commit 8e0aaf5 into main Mar 11, 2026
54 of 56 checks passed

copybara-service bot deleted the hengtaoguo-rl branch March 11, 2026 21:33

copybara-service[bot] merged 1 commit into main from hengtaoguo-rl
