refactor: make EnvAdapter.reflect a shared default (fixes dropped reflect kwargs) #44
Open
imshunsuke wants to merge 1 commit into
…lect kwargs)

All six adapters duplicated an identical reflect() that delegates to run_minibatch_reflect. The copies had drifted: OfficeQA/DocVQA silently dropped meta_skill_context and ALFWorld dropped update_mode, so those analysts ran without inputs every other benchmark receives (active under the default use_meta_skill: true).

Move the delegation into EnvAdapter.reflect as one default that forwards all kwargs uniformly, and delete the six overrides. reflect is no longer abstract; adapters inherit it and override only for custom logic.

Net -225 lines. Behavior change: OfficeQA/DocVQA/ALFWorld reflect now receive the kwargs they previously dropped; the three already-correct benchmarks are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@imshunsuke please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.
Contributor License Agreement
Contribution License Agreement
This Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
All six adapters duplicated an identical `reflect()` that delegates to `run_minibatch_reflect`. The copies had drifted: OfficeQA/DocVQA silently dropped `meta_skill_context` and ALFWorld dropped `update_mode`, so those analysts ran without inputs every other benchmark receives (active under the default `use_meta_skill: true`).

This moves the delegation into `EnvAdapter.reflect` as one default that forwards all kwargs uniformly, and deletes the six overrides. `reflect` is no longer abstract; adapters inherit it and override only for custom logic.
Net −225 lines. Behavior change: OfficeQA/DocVQA/ALFWorld reflect now receive
the kwargs they previously dropped; the three already-correct benchmarks are
unaffected.
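
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the pattern described above. It is not the code in this PR: the `run_minibatch_reflect` stub, its signature, the `trajectories` argument, the `name` method, the adapter class names, and the `"append"` value are all assumptions made for illustration. It contrasts a drifted per-adapter override that drops a kwarg with a shared base-class default that forwards everything.

```python
from abc import ABC, abstractmethod


def run_minibatch_reflect(adapter, trajectories, **kwargs):
    """Hypothetical stand-in for the real helper: reports which kwargs reached it."""
    return {
        "adapter": type(adapter).__name__,
        "n_trajectories": len(trajectories),
        "kwargs": dict(kwargs),
    }


class EnvAdapter(ABC):
    @abstractmethod
    def name(self) -> str:
        """Hypothetical abstract method; stands in for whatever adapters must still define."""

    def reflect(self, trajectories, **kwargs):
        # Shared default: forward every kwarg unchanged, so no adapter can
        # silently drop one. Not abstract; subclasses simply inherit it.
        return run_minibatch_reflect(self, trajectories, **kwargs)


class DriftedOfficeQAAdapter(EnvAdapter):
    """Illustrates the old, drifted style: a hand-copied override."""

    def name(self) -> str:
        return "officeqa"

    def reflect(self, trajectories, update_mode=None, meta_skill_context=None, **kwargs):
        # Drift: meta_skill_context is accepted but never passed on.
        return run_minibatch_reflect(self, trajectories, update_mode=update_mode, **kwargs)


class OfficeQAAdapter(EnvAdapter):
    """Illustrates the new style: no override, inherits the shared default."""

    def name(self) -> str:
        return "officeqa"


if __name__ == "__main__":
    call = {"update_mode": "append", "meta_skill_context": "ctx"}
    print(DriftedOfficeQAAdapter().reflect(["t1", "t2"], **call)["kwargs"])
    print(OfficeQAAdapter().reflect(["t1", "t2"], **call)["kwargs"])
```

An adapter that needs custom reflection logic can still override `reflect`; the point of the default is that the common case has a single forwarding path to keep in sync.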
🤖 Generated with Claude Code