Add API for GetWorkflowExecutionResult#778
Add API for GetWorkflowExecutionResult#778long-nt-tran wants to merge 3 commits intotemporalio:masterfrom
Conversation
c13b529 to
0cea08d
Compare
- Split enums to be more granular (NotCompleted/Success/Failure) - Move some common fields to base message - Comment regarding always getting result of latest run for a workflow_id
| Failure failure = 6; | ||
| } | ||
|
|
||
| message NotCompleted {} |
There was a problem hiding this comment.
I think empty message is ok (basically using this as an enum)? I see it used elsewhere, i.e.,
api/temporal/api/workflow/v1/message.proto
Line 460 in 8e0453c
There was a problem hiding this comment.
+1 to using an empty message instead of protobuf.Empty{} or similar. This allows us to add more data later if needed. e.g. maybe we'll want to wire in a "time_until_timeout" or "started_at" field later.
|
|
||
| // Completion status of the workflow, which carries relevant information for the caller. | ||
| oneof completion { | ||
| NotCompleted not_completed = 4; |
There was a problem hiding this comment.
What's the purpose of this field if there's already status?
I understand that there might not be a success or failure yet, I just wonder if that should be expressed as an empty/nil completion field. Not sure, maybe more an SDK question as it might impact its user-facing API impl.
There was a problem hiding this comment.
I'm using this as an "enum that carries extra info" (i.e., Success carries the result, Failure carries the failure). I was thinking that status is a more granular set of enums in case the user really wants to see the status (I.e., what kind of failure it is), although maybe this is also inspectable via the failure itself.
Leaning towards removing status field since the "enum with extra info" is a bit easier to carry extra info in success/failure cases
- Comments on each proto field - Remove status field -- this can be inferrable via the oneof completion status - Spacing and enumeration
chrsmith
left a comment
There was a problem hiding this comment.
I'll defer to Stephan for approval. But modulo some nits this LGTM!
| Failure failure = 6; | ||
| } | ||
|
|
||
| message NotCompleted {} |
There was a problem hiding this comment.
+1 to using an empty message instead of protobuf.Empty{} or similar. This allows us to add more data later if needed. e.g. maybe we'll want to wire in a "time_until_timeout" or "started_at" field later.
| rpc DeleteNexusOperationExecution (DeleteNexusOperationExecutionRequest) returns (DeleteNexusOperationExecutionResponse) {} | ||
|
|
||
| // GetWorkflowExecutionResult returns the result of a workflow execution. If the workflow | ||
| // is still running, returns a NotCompleted response and optionally registers callbacks |
There was a problem hiding this comment.
Nit: It doesn't return a "NotCompleted" response. It will always return the same GetWorkflowExecutionResultResponse, which will describe it's status which may be NotComplete.
I'd slightly argue that we can just remove the sentence "If the workflow is still running..." since it doesn't clarify anything. It will always return the result of a workflow execution.
What do you think of the following. Is this strictly better?
// GetWorkflowExecutionResult returns the state of a workflow execution, which may be still running, completed, or failed. It can optionally register new // callbacks to be invoked when Workflow reaches a terminal state. // (Adding a callback to a completed workflow will return an XXX error.)
| string identity = 4; | ||
| // Callbacks to be called by the server when this activity reaches a terminal state. | ||
| // Callback addresses must be allowlisted in the server's dynamic configuration. | ||
| repeated temporal.api.common.v1.Callback completion_callbacks = 5; |
There was a problem hiding this comment.
I don't know if this is quite right, but reserve the right to be wrong 😅
You would expect an operation like Get* to not mutate the underlying resource. (At least that's the expectation in the HTTP world.) So it seems strange that calling Get* could actually modify the workflow's state, and register new callbacks.
- Is there a particular reason we want to allow registering new callbacks? And if not, maybe we can remove this capability?
- If we do want to enable registering new callbacks to a Workflow (which seems like it'd be useful), should we instead add a separate
UpdateWorkflowExecutionoperation?
There was a problem hiding this comment.
I think from the user's POV, this request is semantically a "get" operation even if the implementation detail requires some mutation to the target's state to deliver that result. I think Nexus-mapped operation would have to attach some sort of callback to deliver result async. An example is:
- Alice uses a Nexus op in their namespace that maps to
GetWorkflowExecutionResultfor a workflow in Bob's namespace - Workflow hasn't completed. As an implementation detail, this request now attaches a "completion callback" which is essentially a URL
- Once the workflow in Bob's namespace finishes, the result will be delivered to this callback URL (for example this is its CHASM implementation which is a retryable RPC/HTTP call wrapped in a state machine)
I do agree that it can sound strange to have the "Get" modify the workflow, although it's not really modifying the workflow's state, but rather attaches a listener onto the workflow.
From the user's POV, how the result gets delivered is kinda opaque so I think Get* sounds ok here?
Add frontend API and request/response protos for getting workflow execution result.
Since this is intended to be used via Nexus, the API allows caller to attach callbacks for an ongoing workflow execution.
Why?
New API to enable Nexus ops to map more ergonomically in the handler.
Breaking changes
Not breaking, new (unused) API
Server PR
temporalio/temporal#10173