Can we make a similar change for SFT?
f3cf026 to e176cf9
Codecov Report: ✅ All modified and coverable lines are covered by tests.
a9659e3 to ffa2895
Yes, done now.
src/maxtext/configs/pyconfig.py (Outdated)
    new_value = ""

    if key == "tokenizer_path" and new_value is None:
      new_value = HF_IDS.get(raw_keys["model_name"])
What if MODEL_NAME is not present in HF_IDS?
    "llama2-13b",
    "llama2-70b",
    "llama3-8b",
    "llama3.1-8b-Instruct",
Instead of explicitly adding the Instruct model name to this list, can we somehow derive it from the base model? In the future, we might need an Instruct model for gemma, and then we would have to update this list again.
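One possible shape for that derivation, as a rough sketch only: it assumes Instruct variants are always named `<base>-Instruct`, as in the entry in this diff. `BASE_MODELS`, `base_model_name`, and `is_supported_model` are hypothetical names for illustration, not existing MaxText code, and the set below is an illustrative subset rather than the real list.

```python
# Hypothetical sketch: accept "<base>-Instruct" whenever "<base>" is a known model.
INSTRUCT_SUFFIX = "-Instruct"  # naming convention assumed from the entry in this diff

# Illustrative subset only; the real model list lives in the MaxText config code.
BASE_MODELS = {"llama2-13b", "llama2-70b", "llama3-8b", "llama3.1-8b"}


def base_model_name(model_name: str) -> str:
  """Return the base model name, dropping a trailing "-Instruct" if present."""
  if model_name.endswith(INSTRUCT_SUFFIX):
    return model_name[: -len(INSTRUCT_SUFFIX)]
  return model_name


def is_supported_model(model_name: str) -> bool:
  """Treat a model as supported if its base name is in BASE_MODELS."""
  return base_model_name(model_name) in BASE_MODELS
```

With this approach, a future gemma Instruct variant would be accepted as soon as its base model is in the list, with no separate entry needed.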
You have added qwen3-omni-30b-a3b-Instruct to HF_IDS, but it is not present in the ModelName list.
Yes, I think the idea for this PR is to allow on-the-fly tokenizer_path calculation for the popular models used in the tutorials, to make things easier for first-time users. We still have the option of passing tokenizer_path. So, for models that are not supported by this on-the-fly calculation, we will fall back to the previous default, maxtext/assets/tokenizers/tokenizer.llama2, and users who intend to use something else can pass tokenizer_path.
Great point about qwen3-omni-30b-a3b-Instruct, I will remove that change.
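To make the fallback order described above concrete, here is a minimal sketch. It is an assumption about the intended behavior, not the code in this PR: `resolve_tokenizer_path` and `EXAMPLE_HF_IDS` are hypothetical names, the mapping entry is a placeholder (the real HF_IDS contents are not shown in this thread), and treating an empty string as "not set" is also an assumption.

```python
from typing import Mapping, Optional

# Previous default named in the comment above.
DEFAULT_TOKENIZER_PATH = "maxtext/assets/tokenizers/tokenizer.llama2"

# Placeholder mapping for illustration; not the real HF_IDS contents.
EXAMPLE_HF_IDS = {"example-model": "example-org/example-model"}


def resolve_tokenizer_path(
    model_name: str,
    tokenizer_path: Optional[str],
    hf_ids: Mapping[str, str] = EXAMPLE_HF_IDS,
) -> str:
  """Pick a tokenizer path: explicit value, then HF id lookup, then the default."""
  if tokenizer_path:  # an explicitly passed tokenizer_path always wins
    return tokenizer_path
  # dict.get with a default avoids returning None for models missing from hf_ids.
  return hf_ids.get(model_name, DEFAULT_TOKENIZER_PATH)
```

This also covers the earlier question about a model name missing from HF_IDS: the lookup returns the default path instead of None.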
4c0f0ec to ba2ad97
ba2ad97 to 2759471
Description
Simplify the parameters for running on MaxText by removing tokenizer_path as a required argument.
FIXES: b/490520651
Notice 1: Once all tests pass, the "pull ready" label will automatically be assigned.
This label is used for administrative purposes. Please do not add it manually.
Notice 2: For external contributions, our settings currently require an approval from a MaxText maintainer to trigger CI tests.
Tests
Ran the following commands locally on a v5p-8:
Checklist
Before submitting this PR, please make sure (put X in square brackets):
gemini-review label.