
Fix AMD APU VRAM detection and formatting issues in amd.py#9

Draft
Readon wants to merge 7 commits into gpustack:main from Readon:main

Conversation


Readon commented Mar 10, 2026

This pull request improves the detection of available memory on AMD GPUs, especially for APUs with unified memory (such as AMD Strix Halo). The changes ensure that the system reports the correct amount of usable memory by considering both VRAM and VIS_VRAM, addressing cases where traditional VRAM reporting is insufficient. Additionally, new memory type constants have been introduced for clarity and future usage.

Memory detection improvements:

  • Updated the AMD GPU detection logic in gpustack_runtime/detector/amd.py to check for VIS_VRAM memory and use it when it exceeds reported VRAM, ensuring accurate memory reporting on APUs with unified memory. [1] [2]

Constants and definitions:

  • Added new memory type constants (RSMI_MEM_TYPE_VRAM, RSMI_MEM_TYPE_GTT, RSMI_MEM_TYPE_VIS_VRAM) in gpustack_runtime/detector/pyrocmsmi/__init__.py to support the improved memory detection logic.
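The selection rule described above can be sketched as follows. This is a minimal illustration, not the actual amd.py code; the function name and the byte values are hypothetical:

```python
def effective_total_mem(vram_bytes, vis_vram_bytes):
    """Pick the usable total: the larger of VRAM and VIS_VRAM.

    On discrete GPUs VRAM is the full device memory, so it wins.
    On APUs with unified memory, VRAM is only the BIOS carveout
    (~512 MiB) while VIS_VRAM covers the CPU-visible system memory,
    so VIS_VRAM wins. None (missing field) is treated as 0.
    """
    return max(vram_bytes or 0, vis_vram_bytes or 0)


# Discrete GPU: 24 GiB VRAM vs a 256 MiB visible BAR window.
discrete = effective_total_mem(24 << 30, 256 << 20)
# APU (e.g., Strix Halo): 512 MiB carveout vs 96 GiB unified memory.
apu = effective_total_mem(512 << 20, 96 << 30)
```

Because `max()` is used rather than an APU-specific branch, the same rule is harmless on discrete GPUs, where VRAM is always at least as large as VIS_VRAM.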

Copilot AI and others added 7 commits March 10, 2026 13:13
On AMD APUs with unified memory architecture (e.g., AMD Ryzen AI MAX /
Strix Halo, GC 12.0.0), both amdsmi_get_gpu_vram_usage and
rsmi_dev_memory_total_get(RSMI_MEM_TYPE_VRAM) return only the BIOS
carveout (~512 MiB) instead of the actual usable unified memory.

VIS_VRAM (RSMI_MEM_TYPE_VIS_VRAM=2) correctly reports the full system
memory on APUs because all system memory is CPU-visible. Add the
constant to pyrocmsmi and use max(VRAM, VIS_VRAM) in both detection
paths so APUs report the correct memory capacity.

Co-authored-by: Readon <3614708+Readon@users.noreply.github.com>

Fix AMD APU unified memory VRAM detection (Strix Halo / Ryzen AI MAX)
Co-authored-by: Readon <3614708+Readon@users.noreply.github.com>

Fix ruff format violations in AMD APU unified memory VRAM detection
Copilot AI review requested due to automatic review settings March 10, 2026 14:34

Readon commented Mar 10, 2026

My problem is: how can I test it? I am using the Docker-based deployment method. @copilot @thxCode


Copilot AI left a comment


Pull request overview

This PR improves AMD GPU memory detection for APUs with unified memory by incorporating ROCm SMI’s VIS_VRAM reporting when it exceeds the traditional VRAM value, and introduces explicit ROCm SMI memory-type constants to support that logic.

Changes:

  • Prefer VIS_VRAM over VRAM for total memory reporting when VIS_VRAM is larger (APU/unified memory scenarios).
  • Add ROCm SMI memory type constants (RSMI_MEM_TYPE_VRAM/GTT/VIS_VRAM) for clearer memory queries.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
gpustack_runtime/detector/amd.py Adds VIS_VRAM-based total-memory override in both amdsmi and ROCm SMI code paths.
gpustack_runtime/detector/pyrocmsmi/__init__.py Introduces numeric memory-type constants for ROCm SMI memory queries.


Comment on lines +188 to +192
                dev_idx,
                pyrocmsmi.RSMI_MEM_TYPE_VIS_VRAM,
            ),
        )
        dev_mem = max(dev_mem, dev_mem_vis_vram)

Copilot AI Mar 10, 2026


dev_gpu_vram_usage.get("vram_total") / .get("vram_used") can return None; calling max(dev_mem, dev_mem_vis_vram) will then raise a TypeError. Consider defaulting to 0 (or otherwise normalizing to an int) before the max() so the VIS_VRAM fallback cannot crash detection.
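One way to implement this reviewer suggestion; the function and variable names here are illustrative, not the actual amd.py identifiers:

```python
def safe_total_mib(vram_total, vis_vram_total):
    """Normalize possibly-None SMI values to int before comparing.

    dict.get() yields None for a missing field, and feeding None into
    max() raises TypeError, so each operand is defaulted to 0 first.
    """
    return max(int(vram_total or 0), int(vis_vram_total or 0))


# Simulated SMI result with a missing "vram_total" field.
usage = {"vram_used": 1024}
total = safe_total_mib(usage.get("vram_total"), 2048)
```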

Comment on lines +182 to +186
    # On APUs with unified memory (e.g., AMD Strix Halo), VRAM
    # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports
    # the full usable system memory. Use VIS_VRAM when larger.
    with contextlib.suppress(pyrocmsmi.ROCMSMIError):
        dev_mem_vis_vram = byte_to_mebibyte(

Copilot AI Mar 10, 2026


The VIS_VRAM selection logic (comment + suppress block + max) is duplicated in both the pyamdsmi and fallback branches. Consider extracting it into a small helper (or applying it once after the try/except) to avoid divergence if this logic needs future changes.
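A sketch of the suggested extraction. The stand-in exception class and the callable-based design are assumptions for illustration, not the project's actual API:

```python
import contextlib


class ROCMSMIError(Exception):
    """Stand-in for pyrocmsmi.ROCMSMIError in this self-contained sketch."""


def apply_vis_vram_override(dev_mem, query_vis_vram):
    """Return max(dev_mem, VIS_VRAM total), ignoring SMI query failures.

    query_vis_vram is a zero-argument callable wrapping the
    backend-specific VIS_VRAM query, so one helper can serve both the
    pyamdsmi path and the ROCm SMI fallback path.
    """
    with contextlib.suppress(ROCMSMIError):
        dev_mem = max(dev_mem, query_vis_vram())
    return dev_mem


# APU case: VIS_VRAM (96 GiB expressed in MiB) beats the 512 MiB carveout.
boosted = apply_vis_vram_override(512, lambda: 98304)


# Failure case: the query raises, and the original value survives.
def _failing_query():
    raise ROCMSMIError("rsmi_dev_memory_total_get failed")


unchanged = apply_vis_vram_override(512, _failing_query)
```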

Comment on lines +65 to +69
## Memory Types ##
RSMI_MEM_TYPE_VRAM = 0
RSMI_MEM_TYPE_GTT = 1
RSMI_MEM_TYPE_VIS_VRAM = 2


Copilot AI Mar 10, 2026


New RSMI_MEM_TYPE_* constants are introduced here, but rsmi_dev_memory_total_get / rsmi_dev_memory_usage_get still default to rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM. If rsmiBindings isn't importable (bindings path missing), those defaults can raise NameError even though the library can be loaded. Consider switching the defaults to these new constants (or defining these constants from rsmi_memory_type_t when available, with a numeric fallback) so memory calls work reliably across install layouts.

Suggested change

-## Memory Types ##
-RSMI_MEM_TYPE_VRAM = 0
-RSMI_MEM_TYPE_GTT = 1
-RSMI_MEM_TYPE_VIS_VRAM = 2
+## Memory Types ##
+# Fallback numeric values that work even when rsmiBindings cannot be imported.
+RSMI_MEM_TYPE_VRAM = 0
+RSMI_MEM_TYPE_GTT = 1
+RSMI_MEM_TYPE_VIS_VRAM = 2
+# If rsmi_memory_type_t is available from rsmiBindings, prefer the values
+# defined there so we stay in sync with the underlying library.
+if "rsmi_memory_type_t" in globals():
+    try:
+        RSMI_MEM_TYPE_VRAM = int(rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM)
+        RSMI_MEM_TYPE_GTT = int(rsmi_memory_type_t.RSMI_MEM_TYPE_GTT)
+        RSMI_MEM_TYPE_VIS_VRAM = int(rsmi_memory_type_t.RSMI_MEM_TYPE_VIS_VRAM)
+    except Exception:
+        # If anything unexpected happens, keep using the numeric fallbacks above.
+        pass
+else:
+    # When rsmiBindings is not importable, provide a minimal stand-in so that
+    # references like rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM used in default
+    # arguments do not raise NameError at import time.
+    class rsmi_memory_type_t:  # type: ignore[override]
+        RSMI_MEM_TYPE_VRAM = RSMI_MEM_TYPE_VRAM
+        RSMI_MEM_TYPE_GTT = RSMI_MEM_TYPE_GTT
+        RSMI_MEM_TYPE_VIS_VRAM = RSMI_MEM_TYPE_VIS_VRAM

Comment on lines +203 to +207
    # On APUs with unified memory (e.g., AMD Strix Halo), VRAM
    # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports
    # the full usable system memory. Use VIS_VRAM when larger.
    with contextlib.suppress(pyrocmsmi.ROCMSMIError):
        dev_mem_vis_vram = byte_to_mebibyte(

Copilot AI Mar 10, 2026


The new VIS_VRAM behavior isn’t covered by automated tests. Since tests/gpustack_runtime/detector/test_amd.py exists, consider adding a unit-style test that monkeypatches pyamdsmi.amdsmi_get_gpu_vram_usage and pyrocmsmi.rsmi_dev_memory_total_get to validate that VIS_VRAM is preferred when larger and that ROCm SMI errors are safely suppressed.
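A self-contained sketch of such a test, using stand-in fakes instead of monkeypatching the real pyrocmsmi module; all names here are illustrative, and a real test would patch the imports inside gpustack_runtime.detector.amd instead:

```python
import contextlib


class FakeROCMSMIError(Exception):
    """Stand-in for pyrocmsmi.ROCMSMIError in this self-contained sketch."""


class FakeRocmSmi:
    """Fake SMI backend; a real test would monkeypatch pyrocmsmi itself."""

    RSMI_MEM_TYPE_VRAM = 0
    RSMI_MEM_TYPE_VIS_VRAM = 1

    def __init__(self, totals):
        # totals maps memory type -> bytes, or an Exception instance to raise.
        self._totals = totals

    def rsmi_dev_memory_total_get(self, dev_idx, mem_type):
        value = self._totals[mem_type]
        if isinstance(value, Exception):
            raise value
        return value


def detect_total(smi, dev_idx=0):
    """Mirror of the logic under test: prefer VIS_VRAM when it is larger."""
    total = smi.rsmi_dev_memory_total_get(dev_idx, smi.RSMI_MEM_TYPE_VRAM)
    with contextlib.suppress(FakeROCMSMIError):
        vis = smi.rsmi_dev_memory_total_get(dev_idx, smi.RSMI_MEM_TYPE_VIS_VRAM)
        total = max(total, vis)
    return total


# APU: 512 MiB VRAM carveout, 96 GiB VIS_VRAM -> VIS_VRAM is preferred.
apu_total = detect_total(FakeRocmSmi({0: 512 << 20, 1: 96 << 30}))

# VIS_VRAM query fails -> the plain VRAM value survives unchanged.
broken_total = detect_total(FakeRocmSmi({0: 24 << 30, 1: FakeROCMSMIError()}))
```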

@Readon Readon marked this pull request as draft March 10, 2026 14:54
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers get up to speed quickly.

This pull request addresses an issue with inaccurate VRAM reporting on AMD APUs, especially those with unified memory architectures. It modifies the GPU detection logic to correctly identify and utilize the full usable system memory by checking VIS_VRAM in addition to traditional VRAM values. This ensures that the system provides a more precise representation of available memory resources, improving overall system monitoring and resource allocation.

Highlights

  • AMD APU VRAM Detection: Enhanced the detection of available memory on AMD GPUs, particularly for APUs with unified memory, by incorporating VIS_VRAM to ensure accurate reporting.
  • New Memory Type Constants: Introduced RSMI_MEM_TYPE_VRAM, RSMI_MEM_TYPE_GTT, and RSMI_MEM_TYPE_VIS_VRAM constants for improved clarity and future extensibility.


Changelog
  • gpustack_runtime/detector/amd.py
    • Modified the detect method to include VIS_VRAM in the calculation of total device memory, using the larger of VRAM and VIS_VRAM for APUs.
    • Added error suppression for ROCMSMIError when querying VIS_VRAM.
  • gpustack_runtime/detector/pyrocmsmi/__init__.py
    • Defined new integer constants for RSMI_MEM_TYPE_VRAM, RSMI_MEM_TYPE_GTT, and RSMI_MEM_TYPE_VIS_VRAM.


gemini-code-assist bot left a comment


Code Review

This pull request aims to improve VRAM detection for AMD APUs. I've found a critical issue with incorrect memory type constants which will cause the feature to not work as intended. Additionally, there's a code duplication issue that should be addressed to improve maintainability.

Comment on lines +67 to +68
RSMI_MEM_TYPE_GTT = 1
RSMI_MEM_TYPE_VIS_VRAM = 2


critical

The values for RSMI_MEM_TYPE_GTT and RSMI_MEM_TYPE_VIS_VRAM appear to be swapped. According to the ROCm SMI library documentation (which is referenced in this file's comments), RSMI_MEM_TYPE_VIS_VRAM should be 1 and RSMI_MEM_TYPE_GTT should be 2. With the current values, the code will query for GTT memory instead of visible VRAM, leading to incorrect memory reporting for APUs.

Suggested change

-RSMI_MEM_TYPE_GTT = 1
-RSMI_MEM_TYPE_VIS_VRAM = 2
+RSMI_MEM_TYPE_GTT = 2
+RSMI_MEM_TYPE_VIS_VRAM = 1

Comment on lines +182 to +192
    # On APUs with unified memory (e.g., AMD Strix Halo), VRAM
    # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports
    # the full usable system memory. Use VIS_VRAM when larger.
    with contextlib.suppress(pyrocmsmi.ROCMSMIError):
        dev_mem_vis_vram = byte_to_mebibyte(
            pyrocmsmi.rsmi_dev_memory_total_get(
                dev_idx,
                pyrocmsmi.RSMI_MEM_TYPE_VIS_VRAM,
            ),
        )
        dev_mem = max(dev_mem, dev_mem_vis_vram)


medium

This block of code for checking VIS_VRAM is duplicated on lines 203-213. This duplication can lead to maintenance issues in the future, for example, if one block is updated but the other is missed. To improve maintainability, consider refactoring this logic into a helper function or restructuring the code to perform this check only once after the initial dev_mem value has been determined.


thxCode commented Mar 11, 2026

@Readon can you address the comments generated by Gemini?


thxCode commented Mar 11, 2026

> My problem is: how can I test it? I am using the Docker-based deployment method. @copilot @thxCode

You can run `make package` to build a gpustack/runtime image and test with it, or install from source inside the container.


thxCode commented Mar 12, 2026

@Readon, any update?


Readon commented Mar 12, 2026

> @Readon, any update?

Sorry, I still do not have time to finish this.

And I found that if I run the latest gpustack/gpustack:v2.1.0 on Strix Halo, it crashes. After I stop the container, it does not crash at all.


Readon commented Mar 12, 2026

I need to find a time to test this on the machine.


thxCode commented Mar 12, 2026

> > @Readon, any update?

> Sorry, I still do not have time to finish this.
>
> And I found that if I run the latest gpustack/gpustack:v2.1.0 on Strix Halo, it crashes. After I stop the container, it does not crash at all.

Do you mean the host crashed?


Readon commented Mar 12, 2026

> > > @Readon, any update?

> > Sorry, I still do not have time to finish this.
> > And I found that if I run the latest gpustack/gpustack:v2.1.0 on Strix Halo, it crashes. After I stop the container, it does not crash at all.

> Do you mean the host crashed?

Yes, the host crashed.
