🐛 Describe the bug
I am using a custom get_status endpoint in my function app which returns the status of an orchestration. The trigger fetches the status using the Python SDK, does some modifications and returns it.
I have observed that the call to this endpoint sometimes never completes: it hangs for 1 hour (the configured function timeout), after which the instance restarts.
@app.route("status/{instance_id}", methods=["GET"])
@app.durable_client_input(client_name="client")
@handle_client_errors
@require_auth
async def get_status(
    req: func.HttpRequest,
    client: durable_func.DurableOrchestrationClient,
) -> func.HttpResponse:
    instance_id = req.route_params.get("instance_id")
    logger.info(f"Fetching status for orchestration with ID = '{instance_id}'.")
    if not instance_id:
        return func.HttpResponse(
            status_code=400, body=json.dumps({"error": "Instance ID is required"})
        )
    try:
        status = await asyncio.wait_for(client.get_status(instance_id), timeout=10)
    except TimeoutError as exc:
        logger.error(
            f"Timeout while fetching orchestration status for instance ID = '{instance_id}'.",
            exc_info=exc,
        )
        raise HttpError(
            "Timeout while fetching orchestration status",
            status_code=504,
        ) from exc
    if not status:
        return func.HttpResponse(
            status_code=404, body=json.dumps({"error": "Orchestration not found"})
        )
    logger.info("Successfully fetched orchestration status, creating response")
    response = create_orchestration_status_response(status)
    status_code = 202 if response["runtimeStatus"] in ["Pending", "Running"] else 200
    return func.HttpResponse(
        status_code=status_code,
        body=json.dumps(response),
        headers={"Content-Type": "application/json"},
    )
In Application Insights I can see the 'Fetching status' log entry, but nothing is logged after it. Other calls to the endpoint (for the same and for different orchestrations) do complete successfully.
🤔 Expected behavior
The get_status task should either resolve or be cancelled once the 10-second asyncio.wait_for timeout expires. Maybe adding a timeout parameter to get_status could help?
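For reference, this is a minimal sketch of the behavior I would expect from the asyncio.wait_for wrapper in the handler. `fake_get_status` is a hypothetical stand-in for `client.get_status` (it only sleeps), so it says nothing about what the SDK does internally. Two things it illustrates: `asyncio.wait_for` raises `asyncio.TimeoutError`, which is only an alias of the builtin `TimeoutError` from Python 3.11 onwards, and the timeout can only take effect when the awaited call yields to the event loop. If the underlying call blocked the loop, the timeout would never fire, which might be one explanation for the hang (an assumption, not something I have confirmed).

```python
import asyncio


async def fake_get_status(instance_id: str, delay: float) -> dict:
    # Hypothetical stand-in for client.get_status: it just sleeps, then answers.
    await asyncio.sleep(delay)
    return {"instanceId": instance_id, "runtimeStatus": "Running"}


async def get_status_with_timeout(instance_id: str, delay: float, timeout: float):
    try:
        return await asyncio.wait_for(
            fake_get_status(instance_id, delay), timeout=timeout
        )
    except asyncio.TimeoutError:
        # On Python < 3.11 this is a different class from the builtin
        # TimeoutError, so catching asyncio.TimeoutError works on all versions.
        return None


def run_demo():
    # A call that finishes in time, and one that is cancelled by the timeout.
    fast = asyncio.run(get_status_with_timeout("abc", delay=0.0, timeout=1.0))
    slow = asyncio.run(get_status_with_timeout("abc", delay=5.0, timeout=0.05))
    return fast, slow
```

Here the slow call is cancelled after 0.05 seconds because `asyncio.sleep` cooperates with cancellation.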
☕ Steps to reproduce
- Create a get_status endpoint like the one above.
- Start an orchestration and return its status query URI.
- Poll the status URI.
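The polling step can be sketched roughly as follows. This is only an illustration of the client side, assuming the endpoint keeps returning 202 while the orchestration is Pending or Running, as in the handler above. `poll_status` and its `fetch` parameter are hypothetical names; `fetch` is any callable that performs one GET against the status URI and returns the status code and the parsed JSON body.

```python
import time
from typing import Callable, Tuple


def poll_status(
    fetch: Callable[[], Tuple[int, dict]],
    interval: float = 1.0,
    max_attempts: int = 60,
) -> Tuple[int, int, dict]:
    """Call fetch() until it stops returning 202 or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status_code, body = fetch()
        if status_code != 202:
            # Anything other than 202 means the orchestration is no longer
            # Pending/Running (or the request failed), so stop polling.
            return attempt, status_code, body
        time.sleep(interval)
    raise RuntimeError(f"Still 202 after {max_attempts} attempts")
```

In practice `fetch` could wrap `urllib.request.urlopen(url, timeout=...)` so that a hanging request fails on the client side instead of blocking the poller indefinitely.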
⚡ If deployed to Azure
We have access to a lot of telemetry that can help with investigations. Please provide as much of the following information as you can to help us investigate!
- Timeframe issue observed: Past 2 weeks
- Orchestration instance ID(s): 8df92d4c9f174a3c9a7ba981a67120e1