Skip to content

Conversation

@mdboom
Copy link
Contributor

@mdboom mdboom commented Feb 12, 2026

This allows setuptools-scm to work when our projects are built from a git archive, which
includes how our official conda packages are built.

Useful references for existing git archive usage is:

There are some important differences here from how the setuptools-scm docs suggest:

  • Since we are a monorepo, we need a .git_archival.txt file at the root of each of our subprojects. We can't just copy this file there, because git archive will only perform one git describe substitution per repo to avoid a denial-of-service attack. Instead, there is a single .git_archival.txt file at the root of this repository, and symlinks to it from each of the subprojects. This may break if creating git archives on Windows, which we don't currently do.

  • Using the suggested describe-name: $Format:%(describe:tags=true,match=*[0-9]*)$ doesn't work for our tagging scheme where there may be a project prefix, e.g. cuda-pathfinder-v1.3.4. The bug exists in setuptools-scm where it assumes a version in a .git_archival.txt file doesn't have a prefix and is /exactly/ what should be used as the version in the package. Instead, we use the ref-names approach, which includes more detailed information about the available tags, and setuptools-scm falls back to the more customizable form of version parsing. The docs warn that this may cause different checksums in the git-archive from the same tag. I don't know if this will be a problem for us in practice, since we do it once in CI.

With this change, the workaround in our Conda feedstock of SETUPTOOLS_SCM_PRETEND_VERSION_FOR_CUDA_PATHFINDER=${{ version }} should no longer be necessary, though I think it is harmless to leave it that way.

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Feb 12, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@mdboom mdboom requested review from cpcloud and rwgk February 12, 2026 15:52
@@ -1,5 +1,3 @@
cuda/_version.py export-subst
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a file that hasn't existed in the repo for a long time. Just removing it for cleanup -- not strictly necessary.

additional_dependencies:
- https://files.pythonhosted.org/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl
exclude: '.*pixi\.lock'
exclude: '(.*pixi\.lock)|(\.git_archival\.txt)'
Copy link
Contributor Author

@mdboom mdboom Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't put any "extra" content in this file, including a copyright header, so we need to have the SPDX precommit ignore it.

@rwgk
Copy link
Collaborator

rwgk commented Feb 12, 2026

This may break if creating git archives on Windows, which we don't currently do.

I had run-ins with softlinks (similar experience to getting that mosquito buzzing around your ears in the middle of the night).

We are solidly vested in pre-commit:

Could we make it so that pre-commit enforces that the copies are in sync? ("One Cursor prompt and it's done?")

# SCM syntax highlighting & preventing 3-way merges
pixi.lock merge=binary linguist-language=YAML linguist-generated=true

.git_archival.txt export-subst
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no immediately obvious connection from .git_archival.txt to setuptools-scm. If you know it, good, if you don't you have to go through a couple hoops to discover what this is for. Easy idea to help at least a little bit:

# See PR #1609
.git_archival.txt export-subst

Maybe there is a better way to make the connection more discoverable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment here.

@@ -0,0 +1,3 @@
node: $Format:%H$
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To document this here: I asked Cursor(GPT-5.3 Codex Extra High):

Do you know if we can add comments at the top of the file? Without, there is no immediately obvious connection to setuptools-scm.

The response:

  Short answer: not as normal comment lines.
  • setuptools-scm parses .git_archival.txt as RFC822-style headers (key: value), not as a general text file.
  • A top line like # this is for setuptools-scm is a non-header line, which stops header parsing, so node, node-date, etc. may not be read.
  • Safe alternative inside the file: add an extra header field, e.g. x-setuptools-scm: git archive metadata (unknown keys are ignored).
  • Best place for human-readable explanation is usually .gitattributes, e.g. a comment above .git_archival.txt export-subst.

I saw Cursor writing code to actually try it out locally, so I believe this is a reliable/conclusive finding.

@mdboom
Copy link
Contributor Author

mdboom commented Feb 12, 2026

Could we make it so that pre-commit enforces that the copies are in sync? ("One Cursor prompt and it's done?")

How this works is that any file with the export-subst attribute will get parsed and have the values filled in when git archive builds the archive. However, according to the git docs, "However, only one %(describe) placeholder is expanded per archive to avoid denial-of-service attacks." So if you have multiple files, all but the first one doesn't get filled in. So any solution with multiple copies of the file will fail.

@mdboom
Copy link
Contributor Author

mdboom commented Feb 12, 2026

/ok to test

@github-actions

This comment has been minimized.

@mdboom mdboom merged commit 3e022ad into NVIDIA:main Feb 12, 2026
86 checks passed
@github-actions

This comment has been minimized.

@rwgk
Copy link
Collaborator

rwgk commented Feb 12, 2026

FYI - sharing observations

Cursor is telling me that the next conda pathfinder release will work, based on this PR, even with the SETUPTOOLS_SCM_PRETEND_VERSION_FOR_CUDA_PATHFINDER override removed again.

However, installing from a github zip file still does not work. Concretely, this reproducer is still failing:

#1454 (comment)

I tried to find a fix, but it seems to be somewhere between complicated and impractical. Is this actually important for us?

@github-actions
Copy link

Doc Preview CI
Preview removed because the pull request was closed or merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants