Skip to content

Move state machine from profiler into async handler#1352

Open
ryanzhang22 wants to merge 2 commits intopytorch:mainfrom
ryanzhang22:export-D93475197
Open

Move state machine from profiler into async handler#1352
ryanzhang22 wants to merge 2 commits intopytorch:mainfrom
ryanzhang22:export-D93475197

Conversation

@ryanzhang22
Copy link
Copy Markdown
Contributor

Summary:
Part 2 of a refactor to improve safety and readability around sync/async profiling.

Sync profiling currently only uses the profiler state machine to check whether isActive() is true. However that doesn't tell us if sync or async is the one that actually owns the profiler at that point, which can lead to edge cases when we make decisions about whether or not to proceed based on the profiler state. See the example below.

Since the state machine behavior is only relevant for async tracing, we move the related functions from the GenericActivityProfiler into the Async handler. The GenericActivityProfiler just becomes a set of functions used to drive profiling, without needing to worry about transitions. Each handler owns their own isActive() state, and ActivityProfilerController makes decisions about how to handle scheduling between sync and async. In particular, we add a cancel() function both handlers so the controller can make sure both handlers are in a good state before preempting. For the async trace, cancel() will also clear the async config and move the state back to WaitForRequest.

This change makes ownership explicit, which should hopefully make reasoning about preemptions from the controller perspective easier.

Differential Revision: D93475197

…h#1269)

Summary:

Part 1 of a refactor to improve safety and readability around sync/async profiling.

This is a straightforward (no-op) split that creates Sync/AsyncActivityProfilerHandler, which contains the methods which belonged to the ActivityProfilerController but are only used for one type of tracing. The original methods in the controller become thin wrappers that forward the request to the appropriate handler.

Differential Revision: D92550422
Summary:
Part 2 of a refactor to improve safety and readability around sync/async profiling.

Sync profiling currently only uses the profiler state machine to check whether `isActive()` is true. However that doesn't tell us if sync or async is the one that actually owns the profiler at that point, which can lead to edge cases when we make decisions about whether or not to proceed based on the profiler state. See the example below.

Since the state machine behavior is only relevant for async tracing, we move the related functions from the `GenericActivityProfiler` into the Async handler. The `GenericActivityProfiler` just becomes a set of functions used to drive profiling, without needing to worry about transitions. Each handler owns their own `isActive()` state, and `ActivityProfilerController` makes decisions about how to handle scheduling between sync and async. In particular, we add a `cancel()` function both handlers so the controller can make sure both handlers are in a good state before preempting. For the async trace, `cancel()` will also clear the async config and move the state back to `WaitForRequest`.

This change makes ownership explicit, which should hopefully make reasoning about preemptions from the controller perspective easier.

Differential Revision: D93475197
@meta-codesync
Copy link
Copy Markdown

meta-codesync bot commented Apr 7, 2026

@ryanzhang22 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D93475197.

ryanzhang22 added a commit to ryanzhang22/kineto that referenced this pull request Apr 9, 2026
Summary:

Part 2 of a refactor to improve safety and readability around sync/async profiling.

Sync profiling currently only uses the profiler state machine to check whether `isActive()` is true. However that doesn't tell us if sync or async is the one that actually owns the profiler at that point, which can lead to edge cases when we make decisions about whether or not to proceed based on the profiler state. See the example below.

Since the state machine behavior is only relevant for async tracing, we move the related functions from the `GenericActivityProfiler` into the Async handler. The `GenericActivityProfiler` just becomes a set of functions used to drive profiling, without needing to worry about transitions. Each handler owns their own `isActive()` state, and `ActivityProfilerController` makes decisions about how to handle scheduling between sync and async. In particular, we add a `cancel()` function both handlers so the controller can make sure both handlers are in a good state before preempting. For the async trace, `cancel()` will also clear the async config and move the state back to `WaitForRequest`.

This change makes ownership explicit, which should hopefully make reasoning about preemptions from the controller perspective easier.

Differential Revision: D93475197
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant