[WIP] Ray Data + Ray Train tutorial #3763
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3763
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
# a high loss (~10–11).

###############################################################################
# Checkpointing
Debating whether I should slim this section down, given that I'm not checkpointing in the tutorial.
Let's remove this section and point to the checkpointing user guide in the Ray docs. That way we can direct people to new features and avoid showing outdated APIs (e.g., the TorchTrainer.restore() API is deprecated).
Makes sense, @justinvyu. I rewrote it without code but kept the section, because I wanted to discuss some of the nice checkpointing features. Let me know what you think.
ray.train.report(
    metrics=metrics,
    checkpoint=None,  # If we were checkpointing, we'd pass a Checkpoint here
)
We can just print if we're not checkpointing, to simplify the script. These metrics don't get saved anywhere automatically; the user should report them to wandb or a similar tracker if they want to keep them.
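For illustration, a minimal sketch of the print-only approach might look like the following. The helper names `format_metrics` and `log_metrics` are hypothetical and not part of the tutorial or of Ray. The sketch takes the worker rank as a plain argument so it runs without Ray installed; inside a Ray Train worker, the rank could presumably come from `ray.train.get_context().get_world_rank()`.

```python
import json


def format_metrics(metrics: dict, epoch: int) -> str:
    """Render one epoch's metrics as a single, stable log line."""
    # Round floats so the line stays readable; leave other values untouched.
    rounded = {
        key: round(value, 4) if isinstance(value, float) else value
        for key, value in metrics.items()
    }
    return f"epoch={epoch} " + json.dumps(rounded, sort_keys=True)


def log_metrics(metrics: dict, epoch: int, world_rank: int = 0) -> bool:
    """Print metrics from rank 0 only. Returns True if a line was printed.

    Hypothetical helper: nothing is persisted here, so anything worth
    keeping still has to be sent to an experiment tracker separately.
    """
    if world_rank != 0:
        return False
    print(format_metrics(metrics, epoch))
    return True
```

Restricting output to rank 0 is a design choice to avoid one duplicate line per worker; printing from every worker would also work if per-worker metrics differ.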
Co-authored-by: Justin Yu <justinvyu@anyscale.com>
Adds a new tutorial showcasing streaming data loading using Ray Data with Ray Train. Will move out of draft once I get reviews from the Data and Train teams.
cc @pcmoritz @robertnishihara @matthewdeng @richardliaw @akshay-anyscale