Fix AMD APU VRAM detection and formatting issues in amd.py#9
Fix AMD APU VRAM detection and formatting issues in amd.py#9Readon wants to merge 7 commits intogpustack:mainfrom
Conversation
On AMD APUs with unified memory architecture (e.g., AMD Ryzen AI MAX / Strix Halo, GC 12.0.0), both amdsmi_get_gpu_vram_usage and rsmi_dev_memory_total_get(RSMI_MEM_TYPE_VRAM) return only the BIOS carveout (~512 MiB) instead of the actual usable unified memory. VIS_VRAM (RSMI_MEM_TYPE_VIS_VRAM=2) correctly reports the full system memory on APUs because all system memory is CPU-visible. Add the constant to pyrocmsmi and use max(VRAM, VIS_VRAM) in both detection paths so APUs report the correct memory capacity. Co-authored-by: Readon <3614708+Readon@users.noreply.github.com>
Fix AMD APU unified memory VRAM detection (Strix Halo / Ryzen AI MAX)
Co-authored-by: Readon <3614708+Readon@users.noreply.github.com>
Fix ruff format violations in AMD APU unified memory VRAM detection
There was a problem hiding this comment.
Pull request overview
This PR improves AMD GPU memory detection for APUs with unified memory by incorporating ROCm SMI’s VIS_VRAM reporting when it exceeds the traditional VRAM value, and introduces explicit ROCm SMI memory-type constants to support that logic.
Changes:
- Prefer VIS_VRAM over VRAM for total memory reporting when VIS_VRAM is larger (APU/unified memory scenarios).
- Add ROCm SMI memory type constants (
RSMI_MEM_TYPE_VRAM/GTT/VIS_VRAM) for clearer memory queries.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
gpustack_runtime/detector/amd.py |
Adds VIS_VRAM-based total-memory override in both amdsmi and ROCm SMI code paths. |
gpustack_runtime/detector/pyrocmsmi/__init__.py |
Introduces numeric memory-type constants for ROCm SMI memory queries. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| dev_idx, | ||
| pyrocmsmi.RSMI_MEM_TYPE_VIS_VRAM, | ||
| ), | ||
| ) | ||
| dev_mem = max(dev_mem, dev_mem_vis_vram) |
There was a problem hiding this comment.
dev_gpu_vram_usage.get("vram_total") / get("vram_used") can return None; calling max(dev_mem, dev_mem_vis_vram) will then raise TypeError. Consider defaulting to 0 (or otherwise normalizing to an int) before the max() so the VIS_VRAM fallback cannot crash detection.
| # On APUs with unified memory (e.g., AMD Strix Halo), VRAM | ||
| # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports | ||
| # the full usable system memory. Use VIS_VRAM when larger. | ||
| with contextlib.suppress(pyrocmsmi.ROCMSMIError): | ||
| dev_mem_vis_vram = byte_to_mebibyte( |
There was a problem hiding this comment.
The VIS_VRAM selection logic (comment + suppress block + max) is duplicated in both the pyamdsmi and fallback branches. Consider extracting it into a small helper (or applying it once after the try/except) to avoid divergence if this logic needs future changes.
| ## Memory Types ## | ||
| RSMI_MEM_TYPE_VRAM = 0 | ||
| RSMI_MEM_TYPE_GTT = 1 | ||
| RSMI_MEM_TYPE_VIS_VRAM = 2 | ||
|
|
There was a problem hiding this comment.
New RSMI_MEM_TYPE_* constants are introduced here, but rsmi_dev_memory_total_get / rsmi_dev_memory_usage_get still default to rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM. If rsmiBindings isn't importable (bindings path missing), those defaults can raise NameError even though the library can be loaded. Consider switching the defaults to these new constants (or defining these constants from rsmi_memory_type_t when available, with a numeric fallback) so memory calls work reliably across install layouts.
| ## Memory Types ## | |
| RSMI_MEM_TYPE_VRAM = 0 | |
| RSMI_MEM_TYPE_GTT = 1 | |
| RSMI_MEM_TYPE_VIS_VRAM = 2 | |
| ## Memory Types ## | |
| # Fallback numeric values that work even when rsmiBindings cannot be imported. | |
| RSMI_MEM_TYPE_VRAM = 0 | |
| RSMI_MEM_TYPE_GTT = 1 | |
| RSMI_MEM_TYPE_VIS_VRAM = 2 | |
| # If rsmi_memory_type_t is available from rsmiBindings, prefer the values | |
| # defined there so we stay in sync with the underlying library. | |
| if "rsmi_memory_type_t" in globals(): | |
| try: | |
| RSMI_MEM_TYPE_VRAM = int(rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM) | |
| RSMI_MEM_TYPE_GTT = int(rsmi_memory_type_t.RSMI_MEM_TYPE_GTT) | |
| RSMI_MEM_TYPE_VIS_VRAM = int(rsmi_memory_type_t.RSMI_MEM_TYPE_VIS_VRAM) | |
| except Exception: | |
| # If anything unexpected happens, keep using the numeric fallbacks above. | |
| pass | |
| else: | |
| # When rsmiBindings is not importable, provide a minimal stand-in so that | |
| # references like rsmi_memory_type_t.RSMI_MEM_TYPE_VRAM used in default | |
| # arguments do not raise NameError at import time. | |
| class rsmi_memory_type_t: # type: ignore[override] | |
| RSMI_MEM_TYPE_VRAM = RSMI_MEM_TYPE_VRAM | |
| RSMI_MEM_TYPE_GTT = RSMI_MEM_TYPE_GTT | |
| RSMI_MEM_TYPE_VIS_VRAM = RSMI_MEM_TYPE_VIS_VRAM |
| # On APUs with unified memory (e.g., AMD Strix Halo), VRAM | ||
| # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports | ||
| # the full usable system memory. Use VIS_VRAM when larger. | ||
| with contextlib.suppress(pyrocmsmi.ROCMSMIError): | ||
| dev_mem_vis_vram = byte_to_mebibyte( |
There was a problem hiding this comment.
The new VIS_VRAM behavior isn’t covered by automated tests. Since tests/gpustack_runtime/detector/test_amd.py exists, consider adding a unit-style test that monkeypatches pyamdsmi.amdsmi_get_gpu_vram_usage and pyrocmsmi.rsmi_dev_memory_total_get to validate that VIS_VRAM is preferred when larger and that ROCm SMI errors are safely suppressed.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue with inaccurate VRAM reporting on AMD APUs, especially those with unified memory architectures. It modifies the GPU detection logic to correctly identify and utilize the full usable system memory by checking VIS_VRAM in addition to traditional VRAM values. This ensures that the system provides a more precise representation of available memory resources, improving overall system monitoring and resource allocation. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to improve VRAM detection for AMD APUs. I've found a critical issue with incorrect memory type constants which will cause the feature to not work as intended. Additionally, there's a code duplication issue that should be addressed to improve maintainability.
| RSMI_MEM_TYPE_GTT = 1 | ||
| RSMI_MEM_TYPE_VIS_VRAM = 2 |
There was a problem hiding this comment.
The values for RSMI_MEM_TYPE_GTT and RSMI_MEM_TYPE_VIS_VRAM appear to be swapped. According to the ROCm SMI library documentation (which is referenced in this file's comments), RSMI_MEM_TYPE_VIS_VRAM should be 1 and RSMI_MEM_TYPE_GTT should be 2. With the current values, the code will query for GTT memory instead of visible VRAM, leading to incorrect memory reporting for APUs.
| RSMI_MEM_TYPE_GTT = 1 | |
| RSMI_MEM_TYPE_VIS_VRAM = 2 | |
| RSMI_MEM_TYPE_GTT = 2 | |
| RSMI_MEM_TYPE_VIS_VRAM = 1 |
| # On APUs with unified memory (e.g., AMD Strix Halo), VRAM | ||
| # reports only the BIOS carveout (~512 MiB); VIS_VRAM reports | ||
| # the full usable system memory. Use VIS_VRAM when larger. | ||
| with contextlib.suppress(pyrocmsmi.ROCMSMIError): | ||
| dev_mem_vis_vram = byte_to_mebibyte( | ||
| pyrocmsmi.rsmi_dev_memory_total_get( | ||
| dev_idx, | ||
| pyrocmsmi.RSMI_MEM_TYPE_VIS_VRAM, | ||
| ), | ||
| ) | ||
| dev_mem = max(dev_mem, dev_mem_vis_vram) |
There was a problem hiding this comment.
This block of code for checking VIS_VRAM is duplicated on lines 203-213. This duplication can lead to maintenance issues in the future, for example, if one block is updated but the other is missed. To improve maintainability, consider refactoring this logic into a helper function or restructuring the code to perform this check only once after the initial dev_mem value has been determined.
|
@Readon can you address the comments generated by Gemini? |
|
@Readon, any update? |
Sorry, I still do not have time to finish this. And I found that if I run the latest gpustack/gpustack:v2.1.0 on strix halo, it will crash. |
|
I need to find a time to test this on the machine. |
Do you mean the host crash? |
yes the host crashed |
This pull request improves the detection of available memory on AMD GPUs, especially for APUs with unified memory (such as AMD Strix Halo). The changes ensure that the system reports the correct amount of usable memory by considering both VRAM and VIS_VRAM, addressing cases where traditional VRAM reporting is insufficient. Additionally, new memory type constants have been introduced for clarity and future usage.
Memory detection improvements:
gpustack_runtime/detector/amd.pyto check for VIS_VRAM memory and use it when it exceeds reported VRAM, ensuring accurate memory reporting on APUs with unified memory. [1] [2]Constants and definitions:
RSMI_MEM_TYPE_VRAM,RSMI_MEM_TYPE_GTT,RSMI_MEM_TYPE_VIS_VRAM) ingpustack_runtime/detector/pyrocmsmi/__init__.pyto support the improved memory detection logic.