@cpcloud
Contributor

@cpcloud cpcloud commented Feb 10, 2026

Problem

libnvvm.so lives in $CTK_ROOT/nvvm/lib64/, which is not on the system linker path. Every other CTK lib lives in $CTK_ROOT/lib64/, which ldconfig knows about. As a result, dlopen("libnvvm.so.4") fails on bare system CTK installs when CUDA_HOME is not set, even though the library is right there on disk.

Pathfinder already handles nvvm for pip (nvidia-cuda-nvcc wheel) and conda ($CONDA_PREFIX/nvvm/lib64/), so the gap is specifically: system CTK, no CUDA_HOME, no pip, no conda.

Solution

Add a canary probe as a new search step. When direct system search fails:

  1. System-load a well-known CTK lib that IS on the linker path (cudart)
  2. Derive the CTK installation root from its resolved absolute path (strip lib64/ on Linux, bin/ or bin/x64/ on Windows)
  3. Look for the target lib relative to that root, reusing the existing _find_lib_dir_using_anchor_point which already knows nvvm lives in nvvm/lib64/

The mechanism is generic -- any future lib with a non-standard sub-path just needs its entry in the anchor-point table.
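
To make step 2 concrete, here is a minimal sketch of the root derivation, assuming only the layouts named above (lib64/ on Linux, bin/ or bin/x64/ on Windows). The function names and the example install paths are hypothetical and are not the PR's actual code; any other layout yields None.

```python
import ntpath
import posixpath
from typing import Optional


def derive_ctk_root_linux(resolved_lib_path: str) -> Optional[str]:
    """Map <root>/lib64/<lib> to <root>; return None for any other layout."""
    lib_dir = posixpath.dirname(resolved_lib_path)
    if posixpath.basename(lib_dir) == "lib64":
        return posixpath.dirname(lib_dir)
    return None


def derive_ctk_root_windows(resolved_lib_path: str) -> Optional[str]:
    """Map <root>\\bin\\<lib> or <root>\\bin\\x64\\<lib> to <root>; else None."""
    lib_dir = ntpath.dirname(resolved_lib_path)
    dir_name = ntpath.basename(lib_dir).lower()
    if dir_name == "x64":
        # The x64 directory only counts when it sits directly under bin.
        parent = ntpath.dirname(lib_dir)
        if ntpath.basename(parent).lower() == "bin":
            return ntpath.dirname(parent)
        return None
    if dir_name == "bin":
        return ntpath.dirname(lib_dir)
    return None
```

Because both helpers name their path module explicitly instead of going through os.path, either one can be exercised on any host platform.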

Search order

The canary fires after CUDA_HOME to preserve backward compatibility:

site-packages → conda → already-loaded → system dlopen → CUDA_HOME → canary probe

Users with CUDA_HOME set expect it to be authoritative; the canary is a last resort for when nothing else works.
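
The ordering can be pictured as a first-match-wins cascade. The sketch below is illustrative only: run_cascade, the Step alias, and LibNotFoundError (a stand-in for DynamicLibNotFoundError) are hypothetical names, and each finder is assumed to return a path or None.

```python
from typing import Callable, Optional, Sequence, Tuple

# A step is a label plus a finder that returns a path, or None on a miss.
Step = Tuple[str, Callable[[str], Optional[str]]]


class LibNotFoundError(RuntimeError):
    """Hypothetical stand-in for DynamicLibNotFoundError."""


def run_cascade(libname: str, steps: Sequence[Step]) -> Tuple[str, str]:
    """Try each step in order; the first one that returns a path wins."""
    for step_name, finder in steps:
        path = finder(libname)
        if path is not None:
            return step_name, path
    tried = ", ".join(name for name, _ in steps)
    raise LibNotFoundError(f"{libname}: not found (tried: {tried})")
```

Listing the canary last means it can only supply a result when every earlier step, including CUDA_HOME, has already returned None.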

Edge cases

| Scenario | Outcome |
| --- | --- |
| pip/conda installed | Found via existing site-packages/conda steps; canary never runs |
| CUDA_HOME set | Found via CUDA_HOME; canary never runs |
| System CTK, no CUDA_HOME, nvvm is first request | Canary loads cudart via system search, derives root, finds nvvm |
| System CTK, no CUDA_HOME, other libs loaded first | System search for nvvm still fails, canary fires and succeeds |
| No CTK anywhere | Canary fails (can't find cudart either), raises DynamicLibNotFoundError |
| Canary finds cudart but CTK root has no nvvm | Returns None, falls through to error |
| Canary path doesn't match known layout | Returns None, falls through to error |
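
The last two scenarios both end in None rather than an exception from the canary itself. A rough sketch of that behavior, assuming a Linux layout and a hypothetical lookup table in place of _find_lib_dir_using_anchor_point (the unversioned file name libnvvm.so is a simplification):

```python
import os
from typing import Optional

# Hypothetical table: lib name -> (sub-directory under the CTK root, file name).
REL_LOCATIONS = {"nvvm": (os.path.join("nvvm", "lib64"), "libnvvm.so")}


def find_under_ctk_root(libname: str, ctk_root: Optional[str]) -> Optional[str]:
    """Return the lib's path under ctk_root, or None so the caller can fall through."""
    if ctk_root is None:
        # The canary path did not match a known layout, so there is no root to search.
        return None
    entry = REL_LOCATIONS.get(libname)
    if entry is None:
        return None
    rel_dir, filename = entry
    candidate = os.path.join(ctk_root, rel_dir, filename)
    # The root exists but the target lib may still be absent from it.
    return candidate if os.path.isfile(candidate) else None
```

Returning None in both cases leaves the decision to raise with the cascade, which is what the table describes as "falls through to error".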

Changes

  • find_nvidia_dynamic_lib.py -- derive_ctk_root() + try_via_ctk_root() on _FindNvidiaDynamicLib
  • load_nvidia_dynamic_lib.py -- _try_ctk_root_canary() wired into the cascade
  • tests/test_ctk_root_discovery.py -- 21 tests covering all of the above

Made with Cursor

@copy-pr-bot
Contributor

copy-pr-bot bot commented Feb 10, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

cpcloud and others added 2 commits February 10, 2026 13:49
Libraries like nvvm whose shared object lives in a subdirectory
(/nvvm/lib64/) that is not on the system linker path cannot
be found via bare dlopen on system CTK installs without CUDA_HOME.

Add a "canary probe" search step: when direct system search fails,
system-load a well-known CTK lib that IS on the linker path (cudart),
derive the CTK installation root from its resolved path, and look for
the target lib relative to that root via the existing anchor-point
logic. The mechanism is generic -- any future lib with a non-standard
path just needs its entry in _find_lib_dir_using_anchor_point.

The canary probe is intentionally placed after CUDA_HOME in the search
cascade to preserve backward compatibility: users who have CUDA_HOME
set expect it to be authoritative, and existing code relying on that
ordering should not silently change behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test



def test_derive_ctk_root_windows_ctk13():
    path = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\x64\cudart64_13.dll"
Contributor Author

This works cross-platform because _derive_ctk_root_windows uses ntpath explicitly. Since the code would look much the same with the platform-specific module, it seems useful to keep these tests runnable everywhere instead of skipping a bunch of them based on platform.
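
A small illustration of the point about ntpath, using the path from the test above. This is a sketch, not code from the PR: ntpath always applies Windows rules, while os.path follows the host platform.

```python
import ntpath
import os

WINDOWS_PATH = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\x64\cudart64_13.dll"

# ntpath treats backslashes as separators on every host OS.
ntpath_dir = ntpath.dirname(WINDOWS_PATH)
ntpath_parent_name = ntpath.basename(ntpath.dirname(ntpath_dir))

# os.path follows the host: on POSIX a backslash is an ordinary character,
# so the whole string is seen as a single file name with no directory part.
native_dir = os.path.dirname(WINDOWS_PATH)

print(ntpath_dir)
print(ntpath_parent_name)
print(repr(native_dir))
```

On a POSIX host native_dir is an empty string, which is why a test written against os.path would need a platform skip while one written against ntpath does not.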

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a CTK root canary probe feature to the pathfinder library to resolve libraries that live in non-standard subdirectories (like libnvvm.so under $CTK_ROOT/nvvm/lib64/). The canary probe discovers the CUDA Toolkit installation root by loading a well-known library (cudart) that IS on the system linker path, deriving the CTK root from its resolved path, and then searching for the target library relative to that root.

Changes:

  • Adds canary probe mechanism as a last-resort fallback after CUDA_HOME in the library search cascade
  • Introduces CTK root derivation functions for Linux and Windows that extract installation paths from resolved library paths
  • Provides comprehensive test coverage (21 tests) for all edge cases and search order behavior

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| cuda_pathfinder/tests/test_ctk_root_discovery.py | Comprehensive test suite covering CTK root derivation, canary probe mechanism, and search order priority |
| cuda_pathfinder/cuda/pathfinder/_dynamic_libs/load_nvidia_dynamic_lib.py | Implements the canary probe function and integrates it into the library loading cascade after CUDA_HOME |
| cuda_pathfinder/cuda/pathfinder/_dynamic_libs/find_nvidia_dynamic_lib.py | Adds CTK root derivation functions and try_via_ctk_root method to leverage existing anchor-point search logic |

Tests that create fake CTK directory layouts were hardcoded to Linux
paths (lib64/, libnvvm.so) and failed on Windows where the code
expects Windows layouts (bin/, nvvm64.dll).

Extract platform-aware helpers (_create_nvvm_in_ctk, _create_cudart_in_ctk,
_fake_canary_path) that create the right layout and filenames based on
IS_WINDOWS.
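
A platform-aware helper of this kind might look roughly like the sketch below. The function name create_fake_nvvm and its body are assumptions for illustration, not the PR's _create_nvvm_in_ctk; the layouts and file names are the ones mentioned in these commit messages.

```python
import os
import sys

IS_WINDOWS = sys.platform == "win32"


def create_fake_nvvm(ctk_root: str) -> str:
    """Create an empty placeholder nvvm library in the layout the host platform expects."""
    if IS_WINDOWS:
        rel_dir, filename = os.path.join("nvvm", "bin"), "nvvm64.dll"
    else:
        rel_dir, filename = os.path.join("nvvm", "lib64"), "libnvvm.so"
    lib_dir = os.path.join(ctk_root, rel_dir)
    os.makedirs(lib_dir, exist_ok=True)
    lib_path = os.path.join(lib_dir, filename)
    # An empty file is enough for tests that only check for existence.
    with open(lib_path, "wb"):
        pass
    return lib_path
```

A test would typically pass pytest's tmp_path (converted to str) as ctk_root and then assert that the search code finds the returned path.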

Co-authored-by: Cursor <cursoragent@cursor.com>
@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

The rel_paths for nvvm use forward slashes (e.g. "nvvm/bin") which
os.path.join on Windows doesn't normalize, producing mixed-separator
paths like "...\nvvm/bin\nvvm64.dll". Apply os.path.normpath to the
returned directory so all separators are consistent.
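
The effect is easy to reproduce on any platform with ntpath, which is what os.path resolves to on Windows. The CTK root below is a hypothetical example path.

```python
import ntpath

ctk_root = r"C:\CUDA\v13.0"  # hypothetical install root
rel_path = "nvvm/bin"        # forward slashes, as in the rel_paths for nvvm

# join() inserts a backslash between components but leaves the "/" untouched.
joined_dir = ntpath.join(ctk_root, rel_path)
mixed_lib_path = ntpath.join(joined_dir, "nvvm64.dll")

# normpath() rewrites every separator to a backslash.
normalized_dir = ntpath.normpath(joined_dir)
clean_lib_path = ntpath.join(normalized_dir, "nvvm64.dll")

print(mixed_lib_path)
print(clean_lib_path)
```

Normalizing the directory once, before the file name is appended, keeps every path derived from it consistent.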

Co-authored-by: Cursor <cursoragent@cursor.com>
@cpcloud
Contributor Author

cpcloud commented Feb 10, 2026

/ok to test

Collaborator

@rwgk rwgk left a comment

This approach is very similar to what I had back in May 2025 while working on PR #604, but at the time @leofang was strongly opposed to it, and I backed it out.

I still believe the usefulness/pitfall factor is very high for this approach. Leo, what's your opinion now?

If Leo is supportive, I believe it'll be best to import the anchor library (cudart) in a subprocess, to not introduce potentially surprising side-effects in the current process. The original code for that was another point of contention back in May 2025 (it was using subprocess), but in the meantime I addressed those concerns and the current implementation has gone through several rounds of extensive testing (QA) without any modifications for many months. We could easily move it to cuda/pathfinder/_utils.
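
For readers unfamiliar with the idea, the general shape of loading a library in a child process looks something like the sketch below. This is not the existing implementation referred to above: probe_lib_in_subprocess and its JSON protocol are hypothetical, and the sketch only reports whether the load succeeded. Resolving the library's absolute path would need additional platform-specific calls that are omitted here.

```python
import json
import subprocess
import sys

# Code run by the child interpreter; with "-c", sys.argv[1] is the first extra argument.
_CHILD_CODE = """
import ctypes, json, sys
try:
    ctypes.CDLL(sys.argv[1])
    result = {"loaded": True}
except OSError as exc:
    result = {"loaded": False, "error": str(exc)}
print(json.dumps(result))
"""


def probe_lib_in_subprocess(libname: str, timeout: float = 30.0) -> dict:
    """Try to load libname in a child Python process and report the outcome."""
    proc = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE, libname],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    # The child prints exactly one JSON document on stdout.
    return json.loads(proc.stdout)
```

Whatever the child loads is discarded when it exits, so the parent process never has the anchor library mapped into its own address space.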
