Significant slowdown in SearchClient.upload_documents(...) for large payloads #46860
Open
Labels
Client, Search, Service Attention, bug, customer-reported, needs-team-attention
Describe the bug
After upgrading azure-search-documents from 11.6.0 to 12.0.0, we observe a significant slowdown in SearchClient.upload_documents(...) for large payloads (hundreds of docs, vector field dimension 3072).

A similar issue happens when invoking SearchClient.merge_documents(...).

The regression appears client-side, before HTTP/network becomes dominant.
To Reproduce
1. Create two clean virtual environments:
   - azure-search-documents==11.6.0
   - azure-search-documents==12.0.0
2. Use the same Azure AI Search service and the same existing index for both runs.
   - Vector field (content_vector) with dimension 3072.
3. Prepare a synthetic payload with large vector-heavy documents:
   - 3072 floats
4. Run the benchmark serially (not in parallel), alternating versions:
   - 11.6.0
   - 12.0.0
5. For each run, measure:
   - SearchClient.upload_documents(documents=payload)

Expected behavior
12.0.0 should not introduce a major regression compared to 11.6.0 for typical bulk upload workloads with large vector fields.
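For anyone who wants to rerun the reproduction steps above, here is a minimal sketch of the kind of harness they describe. It is not the exact script used for this report. The key field name "id", the default of 200 documents, and the environment variable names (SEARCH_ENDPOINT, SEARCH_INDEX, SEARCH_API_KEY) are assumptions for illustration; only content_vector, the 3072 dimension, and the upload_documents(documents=...) call come from the report.

```python
import random
import statistics
import time


def make_payload(n_docs=200, dim=3072, seed=0):
    """Build synthetic vector-heavy documents (field "id" is an assumed key name)."""
    rng = random.Random(seed)
    return [
        {
            "id": str(i),
            "content_vector": [rng.random() for _ in range(dim)],
        }
        for i in range(n_docs)
    ]


def time_call(fn, payload, repeats=5):
    """Call fn(payload) serially `repeats` times and summarize wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(payload)
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def make_sdk_uploader():
    """Return a callable that uploads documents with the installed SDK version.

    The environment variable names below are hypothetical placeholders.
    Imports are local so the helpers above work without the SDK installed.
    """
    import os

    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    client = SearchClient(
        endpoint=os.environ["SEARCH_ENDPOINT"],
        index_name=os.environ["SEARCH_INDEX"],
        credential=AzureKeyCredential(os.environ["SEARCH_API_KEY"]),
    )
    return lambda docs: client.upload_documents(documents=docs)
```

In each virtual environment, something like `print(time_call(make_sdk_uploader(), make_payload()))` would give comparable numbers, provided the runs alternate between versions and hit the same index.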
Additional context
In 12.0.0, IndexDocumentsBatch._extend_batch builds each action as:

    action_dict = {"@search.action": action_type}
    action_dict.update(doc)
    action = IndexAction(action_dict)

This goes through model conversion/serialization paths (Model.__init__, _create_value, _serialize) for each document and recursively for nested structures (including large vectors).

Likely files/methods:
- azure/search/documents/models/_patch.py (IndexDocumentsBatch._extend_batch)
- azure/search/documents/_utils/model_base.py (Model.__init__, _serialize)
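One way to check whether the methods listed above actually dominate is to run the call under cProfile and sort by cumulative time. The sketch below is a generic helper, not SDK code. The functions toy_serialize and toy_build_actions are hypothetical stand-ins that only mimic the shape of a per-document recursive conversion so the helper can be tried without a search service.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=15, **kwargs):
    """Run fn under cProfile; return (result, report) sorted by cumulative time."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def toy_serialize(value):
    """Hypothetical stand-in: recursively walk dicts and lists, element by element."""
    if isinstance(value, dict):
        return {key: toy_serialize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [toy_serialize(item) for item in value]
    return value


def toy_build_actions(docs, action_type="upload"):
    """Hypothetical stand-in for building one action per document."""
    actions = []
    for doc in docs:
        action_dict = {"@search.action": action_type}
        action_dict.update(doc)
        # Every float of every vector is visited here, once per document.
        actions.append(toy_serialize(action_dict))
    return actions


if __name__ == "__main__":
    docs = [{"id": str(i), "content_vector": [0.0] * 3072} for i in range(200)]
    _, report = profile_call(toy_build_actions, docs)
    print(report)
```

Against the real SDK, the same helper could wrap the call from this report, for example `profile_call(client.upload_documents, documents=payload)`. If the regression is client-side, the top entries in 12.0.0 would be expected to sit in model_base.py rather than in the HTTP transport, but that is something to confirm from the profile, not something this sketch shows.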