Arm backend: Split MLSDK setup flows and handle shaderFloat64 workaround#18942
Arm backend: Split MLSDK setup flows and handle shaderFloat64 workaround#18942perheld wants to merge 1 commit intopytorch:mainfrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18942
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ❌ 18 New Failures, 3 Unrelated FailuresAs of commit 27991ae with merge base a43675c ( NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Keep examples/arm/setup.sh focused on the pip-installed MLSDK flow and move source builds into the dedicated setup-mlsdk-from-source.sh script. Retain the old MLSDK source-build-related setup.sh options as deprecated compatibility shims that guide users to the dedicated source-build flow. When the Vulkan ICD reports shaderFloat64 = false, the pip-installed emulation layer is no longer accepted for that setup because it cannot be configured with the required workaround. In that case, setup.sh now points users to the source-build script, and the source-build script builds the emulation layer with VMEL_USE_FLOAT_AS_DOUBLE=ON. Also simplify pip-based emulation-layer environment setup by deriving paths from the installed package layout instead of scraping shell export output from the emulation_layer helper. Update the Arm README to document the split between the default pip workflow and the advanced source-build workflow. Change-Id: I4bebe171ee3b3728fb126c9d00d73881cb332253 Signed-off-by: Per Held <per.held@arm.com>
There was a problem hiding this comment.
Pull request overview
This PR restructures the Arm backend setup flows by making examples/arm/setup.sh focus on the default pip-installed MLSDK path while moving MLSDK source builds (including the shaderFloat64 workaround build flag) into a new dedicated setup-mlsdk-from-source.sh script.
Changes:
- Split MLSDK setup into a default pip workflow (
examples/arm/setup.sh) and a dedicated source-build workflow (backends/arm/scripts/setup-mlsdk-from-source.sh), with deprecated shims insetup.sh. - Add a shaderFloat64 compatibility probe for the pip flow and enable
VMEL_USE_FLOAT_AS_DOUBLEwhen building the emulation layer from source. - Simplify pip emulation-layer env setup by deriving paths from the installed Python package layout; update Arm README to document the split.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| examples/arm/setup.sh | Removes source-build flow, adds compatibility checks and deprecated option handling for MLSDK setup via pip. |
| backends/arm/scripts/vulkan_utils.sh | Removes unused MLSDK manifest dir variable from Vulkan utilities. |
| backends/arm/scripts/setup-mlsdk-from-source.sh | New script to build MLSDK components from source and apply the shaderFloat64 workaround via CMake flag. |
| backends/arm/scripts/mlsdk_utils.sh | Drops source-build helpers and adds vulkaninfo probing + pip package layout–based emulation layer environment setup. |
| backends/arm/README.md | Documents the default pip workflow and the advanced source-build workflow entrypoint. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| setup_vulkan_sdk | ||
| fi | ||
|
|
||
| setup_mlsdk_source "${mlsdk_manifest_dir}" |
There was a problem hiding this comment.
setup_mlsdk_source is invoked unconditionally, so the script will sync the MLSDK manifest even when the user only requests --enable-vulkan-sdk (or disables all MLSDK components). Consider guarding this call so the manifest sync/build only runs when at least one of model-converter/vgf-lib/emulation-layer is enabled.
| setup_mlsdk_source "${mlsdk_manifest_dir}" | |
| if [[ "${enable_model_converter}" -eq 1 || "${enable_vgf_lib}" -eq 1 || "${enable_emulation_layer}" -eq 1 ]]; then | |
| setup_mlsdk_source "${mlsdk_manifest_dir}" | |
| fi |
| if [[ ! -f repo ]]; then | ||
| log_step "mlsdk" "Fetching repo tool" | ||
| curl https://storage.googleapis.com/git-repo-downloads/repo > repo | ||
| chmod u+x repo | ||
| fi | ||
|
|
||
| ./repo init \ | ||
| --depth=1 \ | ||
| --no-repo-verify \ | ||
| --manifest-url "${mlsdk_manifest_url}" \ | ||
| --manifest-branch "${mlsdk_manifest_tag}" \ | ||
| -g model-converter,emulation-layer,vgf-library | ||
|
|
||
| ./repo sync --force-sync -j"${parallel_jobs}" |
There was a problem hiding this comment.
The script downloads the repo tool over the network without integrity verification and then runs repo init with --no-repo-verify, which disables signature checks. This is a supply-chain risk for anyone running the setup script; consider pinning and verifying a checksum/signature for the downloaded repo binary and avoiding --no-repo-verify unless there’s a documented, unavoidable reason.
| if [[ "${cached_tag}" == "refs/tags/v2025.10.0" && \ | ||
| "${mlsdk_manifest_tag}" == "refs/tags/v2025.12.0" ]]; then | ||
| pushd "${manifest_dir}/.." >/dev/null || return 1 | ||
| log_step "mlsdk" "Deleting ${mlsdk_manifest_dir} and starting fresh" |
There was a problem hiding this comment.
In the v2025.10.0 -> v2025.12.0 migration workaround, the log message says it is deleting ${mlsdk_manifest_dir}, but the code actually deletes $(basename "${manifest_dir}") under $(dirname "${manifest_dir}"). If the user passed a custom --manifest-dir, this message can be inaccurate; consider logging the actual manifest_dir (or its basename) being removed.
| log_step "mlsdk" "Deleting ${mlsdk_manifest_dir} and starting fresh" | |
| log_step "mlsdk" "Deleting $(basename "${manifest_dir}") and starting fresh" |
| --mlsdk-manifest-url) | ||
| if [[ $# -lt 2 ]]; then | ||
| print_usage "$@" | ||
| exit 1 | ||
| fi | ||
| log_step "mlsdk" \ | ||
| "Deprecated option '--mlsdk-manifest-url' selected; use it with ./backends/arm/scripts/setup-mlsdk-from-source.sh instead" | ||
| shift 2 | ||
| ;; | ||
| --mlsdk-manifest-tag) | ||
| if [[ $# -lt 2 ]]; then | ||
| print_usage "$@" | ||
| exit 1 | ||
| fi | ||
| log_step "mlsdk" \ | ||
| "Deprecated option '--mlsdk-manifest-tag' selected; use it with ./backends/arm/scripts/setup-mlsdk-from-source.sh instead" | ||
| shift 2 |
There was a problem hiding this comment.
The deprecated flags --mlsdk-manifest-url / --mlsdk-manifest-tag are accepted but then ignored (only a log + shift 2). This can silently fall back to the pip flow even though the user explicitly tried to influence the source-build manifest. Consider exiting non-zero (similar to --install-mlsdk-deps-from-src) after printing the guidance, so callers don’t think the flag took effect.
| echo "[mlsdk_utils] Failed to query emulation_layer environment; skipping" | ||
| } |
There was a problem hiding this comment.
The fallback error message is now misleading: this code no longer queries the emulation_layer CLI output/environment, it locates the Python package directory and checks for a deploy/ folder. Update the message (and ideally write it to stderr) so failures are actionable (e.g., package not installed / no deploy dir found).
Keep examples/arm/setup.sh focused on the pip-installed MLSDK flow and move source builds into the dedicated setup-mlsdk-from-source.sh script. Retain the old MLSDK source-build-related setup.sh options as deprecated compatibility shims that guide users to the dedicated source-build flow.
When the Vulkan ICD reports shaderFloat64 = false, the pip-installed emulation layer is no longer accepted for that setup because it cannot be configured with the required workaround. In that case, setup.sh now points users to the source-build script, and the source-build script builds the emulation layer with VMEL_USE_FLOAT_AS_DOUBLE=ON.
Also simplify pip-based emulation-layer environment setup by deriving paths from the installed package layout instead of scraping shell export output from the emulation_layer helper.
Update the Arm README to document the split between the default pip workflow and the advanced source-build workflow.
Change-Id: I4bebe171ee3b3728fb126c9d00d73881cb332253
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell