
Fix requires_lto targets needing lto set in cargo#149624

Merged
rust-bors[bot] merged 1 commit into rust-lang:main from Flakebi:fix-lto
Apr 27, 2026

Conversation

@Flakebi
Contributor

@Flakebi Flakebi commented Dec 4, 2025


Targets that set requires_lto = true were not actually using lto when compiling with cargo by default. They needed an extra lto = true in Cargo.toml to work.

Fix this by letting lto take precedence over the embed_bitcode flag when lto is required by a target.

If the user supplies both of these flags, an error is generated. However, no such error was raised when lto was requested by the target instead of the user.
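
The precedence rule described above can be sketched as a small decision function. This is a simplified model under stated assumptions, not the compiler's actual code: the `Flags` struct, the `LtoConflict` error, and `resolve_lto` are hypothetical names introduced for illustration. Only `lto`, `embed_bitcode`, and `requires_lto` come from the description.

```rust
/// Hypothetical, simplified inputs (not rustc's real types).
#[derive(Clone, Copy, Debug)]
struct Flags {
    /// The user asked for LTO (e.g. `lto = true` in `Cargo.toml`).
    user_lto: bool,
    /// Value of the `embed_bitcode` flag.
    embed_bitcode: bool,
    /// The target sets `requires_lto = true`.
    target_requires_lto: bool,
}

/// Hypothetical error for the conflicting user-supplied combination.
#[derive(Debug, PartialEq, Eq)]
struct LtoConflict;

/// Returns whether LTO is in effect, or an error if the user's own flags conflict.
fn resolve_lto(flags: Flags) -> Result<bool, LtoConflict> {
    // The user explicitly asked for LTO while disabling bitcode embedding: reject.
    if flags.user_lto && !flags.embed_bitcode {
        return Err(LtoConflict);
    }
    // LTO required by the target takes precedence over `embed_bitcode`
    // instead of being silently dropped.
    Ok(flags.user_lto || flags.target_requires_lto)
}

fn main() {
    // LTO requested only by the target, bitcode embedding disabled.
    let target_only = Flags { user_lto: false, embed_bitcode: false, target_requires_lto: true };
    println!("{:?}", resolve_lto(target_only));

    // LTO requested by the user together with disabled bitcode embedding.
    let conflict = Flags { user_lto: true, embed_bitcode: false, target_requires_lto: false };
    println!("{:?}", resolve_lto(conflict));
}
```

The point of the model is the asymmetry: a conflicting pair of user flags is an error, while a target requirement simply wins.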

Fixes #148514
Tracking issue: #135024

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 4, 2025
@rustbot
Collaborator

rustbot commented Dec 4, 2025

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot
Collaborator

rustbot commented Dec 4, 2025

Some changes occurred in src/doc/rustc/src/platform-support

cc @Noratrieb

@Flakebi Flakebi mentioned this pull request Dec 4, 2025
26 tasks

@jieyouxu
Member

jieyouxu commented Dec 4, 2025

Hm, not entirely sure of the implications.
@rustbot reroll

@rustbot rustbot assigned Mark-Simulacrum and unassigned jieyouxu Dec 4, 2025
@Flakebi
Contributor Author

Flakebi commented Dec 4, 2025

The CI/tidy complaint seems like a false positive: the test uses --target, just not through //@ compile-flags:. Maybe that lint shouldn’t run on run-make-cargo tests?
(I’m happy to make the change, I just want confirmation that it’s ok to do that.)

@jieyouxu
Member

jieyouxu commented Dec 4, 2025

The CI/tidy complaint seems like a false positive: the test uses --target, just not through //@ compile-flags:. Maybe that lint shouldn’t run on run-make-cargo tests? (I’m happy to make the change, I just want confirmation that it’s ok to do that.)

Yes, I probably forgot to extend the tidy exception to run-make-cargo when I split the previous run-make test suite.

@rustbot rustbot added A-tidy Area: The tidy tool T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Dec 4, 2025

JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Dec 22, 2025
…lathar

Skip tidy target-specific check for `run-make-cargo` too

I forgot to change this when implementing the run-make fission.

Noticed in rust-lang#149624 (comment).
rust-timer added a commit that referenced this pull request Dec 22, 2025
Rollup merge of #150237 - jieyouxu:tidy-run-make-cargo, r=Zalathar

Skip tidy target-specific check for `run-make-cargo` too

I forgot to change this when implementing the run-make fission.

Noticed in #149624 (comment).
@bors
Collaborator

bors commented Dec 22, 2025

☔ The latest upstream changes (presumably #150240) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot
Collaborator

rustbot commented Dec 24, 2025

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@Flakebi
Contributor Author

Flakebi commented Dec 24, 2025

Rebased to fix conflicts with #150237, no other changes

Comment thread compiler/rustc_codegen_ssa/src/back/write.rs Outdated
@Flakebi
Contributor Author

Flakebi commented Dec 29, 2025

I claimed too early that it works. I tested some more (compiling the amdgpu-rs examples) and it broke some of them (somehow CARGO_BUILD_RUSTFLAGS was ignored, completely or in part, with both obj_is_bitcode and EmitObj::Bitcode).
With EmitObj::ObjectCode(BitcodeSection::Full) everything seems to work.
So, this is pretty much the change from before now, but checking requires_lto instead of lto != No.
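
As a rough illustration of the variant that ended up working, the choice of what to emit can be modeled as below. This is a minimal sketch, not the code in compiler/rustc_codegen_ssa/src/back/write.rs: the `Session` struct and `choose_emit_obj` are hypothetical, the `BitcodeSection::None` variant is an assumption, and other conditions the real compiler considers are left out. Only `EmitObj::Bitcode`, `EmitObj::ObjectCode(BitcodeSection::Full)`, `obj_is_bitcode`, `requires_lto`, and `embed_bitcode` are names taken from this thread.

```rust
/// Simplified stand-ins named after the variants mentioned in this thread.
#[derive(Debug, PartialEq, Eq)]
enum BitcodeSection {
    /// Assumed variant: plain object code without embedded bitcode.
    None,
    /// Object code with the full bitcode embedded alongside it.
    Full,
}

#[derive(Debug, PartialEq, Eq)]
enum EmitObj {
    /// The "object" file is raw bitcode.
    Bitcode,
    /// A real object file, optionally carrying bitcode.
    ObjectCode(BitcodeSection),
}

/// Hypothetical bundle of the three inputs discussed here.
struct Session {
    obj_is_bitcode: bool,
    requires_lto: bool,
    embed_bitcode: bool,
}

fn choose_emit_obj(sess: &Session) -> EmitObj {
    if sess.obj_is_bitcode {
        EmitObj::Bitcode
    } else if sess.requires_lto || sess.embed_bitcode {
        // A target that requires LTO gets embedded bitcode even when
        // `embed_bitcode` is off, so the later LTO step has input to work on.
        EmitObj::ObjectCode(BitcodeSection::Full)
    } else {
        EmitObj::ObjectCode(BitcodeSection::None)
    }
}

fn main() {
    // A target with `requires_lto = true` and bitcode embedding disabled.
    let sess = Session { obj_is_bitcode: false, requires_lto: true, embed_bitcode: false };
    println!("{:?}", choose_emit_obj(&sess));
}
```

Checking `requires_lto` rather than `lto != No` keeps the behavior unchanged for targets that do not require LTO.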

@bjorn3
Member

bjorn3 commented Dec 31, 2025

What error did you get when emitting raw bitcode files?

@Flakebi
Contributor Author

Flakebi commented Jan 1, 2026

Here are the errors I get for two changes I tested.

Either of the below changes fails to compile examples/vector_copy due to a workaround for the .kd symbol.

Click to see detailed error
$ cd examples/vector_copy
$ CARGO_BUILD_RUSTFLAGS='-Ctarget-cpu=gfx1010 -Ctarget-feature=-xnack-support' cargo +stage1 build --release
   Compiling compiler_builtins v0.1.160 (/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/compiler-builtins/compiler-builtins)
   Compiling core v0.0.0 (/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core)
   Compiling vector_copy v0.1.0 (/rusttest/amdgpu-rs/examples/vector_copy)
warning: unknown and unstable feature specified for `-Ctarget-feature`: `xnack-support`
  |
  = note: it is still passed through to the codegen backend, but use of this feature might be unsound and the behavior of this feature can change in the future
  = help: consider filing a feature request

error: linking with `rust-lld` failed: exit status: 1
  |
  = note:  "rust-lld" "-flavor" "gnu" "--version-script=/tmp/nix-shell.cewhbS/rustc2xMArM/list" "--no-undefined-version" "<1 object files omitted>" "--as-needed" "-L" "/tmp/nix-shell.cewhbS/rustc2xMArM/raw-dylibs" "-Bdynamic" "--eh-frame-hdr" "-z" "noexecstack" "-o" "/rusttest/amdgpu-rs/examples/vector_copy/target/amdgcn-amd-amdhsa/release/deps/vector_copy.elf" "--gc-sections" "-shared" "-O1" "--strip-debug"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: rust-lld: error: version script assignment of 'global' to symbol 'kernel.kd' failed: symbol not defined


warning: `vector_copy` (lib) generated 1 warning
error: could not compile `vector_copy` (lib) due to 1 previous error; 1 warning emitted

I’m fixing this by removing the .kd symbol workaround (wrapping the symbol_export::extend_exported_symbols call in if false) and adding link-args to not remove the .kd symbol.

Change 1

Using obj_is_bitcode=true instead of requires_lto in amdgpu:

The vector_copy example compiles successfully but trying to run fails because it compiled for the wrong target (requested gfx1010 but got gfx700).

$ cd examples/vector_copy
$ CARGO_BUILD_RUSTFLAGS='-Ctarget-cpu=gfx1010 -Ctarget-feature=-xnack-support -Clink-arg=--undefined-version -Clink-arg=--no-gc-sections' cargo +stage1 build --release
$ llvm-readelf -a target/amdgcn-amd-amdhsa/release/vector_copy.elf | rg target
amdhsa.target:   amdgcn-amd-amdhsa--gfx700

The println example fails to compile in a somewhat similar way; apparently the xnack flags are not consistent between compiled objects.

Click to see detailed error
$ cd examples/println
$ CARGO_BUILD_RUSTFLAGS='-Ctarget-cpu=gfx1010 -Ctarget-feature=-xnack-support' cargo +stage1 build --release
   Compiling compiler_builtins v0.1.160 (/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/compiler-builtins/compiler-builtins)
   Compiling core v0.0.0 (/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core)
   Compiling rustflags v0.1.6
   Compiling amdgpu-device-libs-build v0.1.0 (/rusttest/amdgpu-rs/amdgpu-device-libs-build)
   Compiling println v0.1.0 (/rusttest/amdgpu-rs/examples/println)
   Compiling alloc v0.0.0 (/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/alloc)
   Compiling amdgpu-device-libs v0.1.0 (/rusttest/amdgpu-rs/amdgpu-device-libs)
warning: unknown and unstable feature specified for `-Ctarget-feature`: `xnack-support`
  |
  = note: it is still passed through to the codegen backend, but use of this feature might be unsound and the behavior of this feature can change in the future
  = help: consider filing a feature request

warning: `amdgpu-device-libs` (lib) generated 1 warning
error: linking with `rust-lld` failed: exit status: 1
  |
  = note:  "rust-lld" "-flavor" "gnu" "--version-script=/tmp/nix-shell.YJcp1H/rustcRBUJWw/list" "--no-undefined-version" "<2 object files omitted>" "--as-needed" "-Bstatic" "/rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/{libamdgpu_device_libs-b3e559c59352d88b,liballoc-cc2fa346781d4ed5,libcore-595a027f1d29f784,libcompiler_builtins-4c3afce4ad3c6e2b}.rlib" "-L" "/tmp/nix-shell.YJcp1H/rustcRBUJWw/raw-dylibs" "-Bdynamic" "--eh-frame-hdr" "-z" "noexecstack" "-plugin-opt=O3" "-plugin-opt=mcpu=gfx1010" "-o" "/rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/println.elf" "--gc-sections" "-shared" "-O1" "--strip-debug" "/nix/store/4srmqmw5y2zff1h41cwz35yq64fmbb63-rocm-device-libs-19.0.0-rocm/amdgcn/bitcode/ockl.bc" "/nix/store/4srmqmw5y2zff1h41cwz35yq64fmbb63-rocm-device-libs-19.0.0-rocm/amdgcn/bitcode/oclc_isa_version_1010.bc" "/nix/store/4srmqmw5y2zff1h41cwz35yq64fmbb63-rocm-device-libs-19.0.0-rocm/amdgcn/bitcode/oclc_abi_version_600.bc" "/nix/store/4srmqmw5y2zff1h41cwz35yq64fmbb63-rocm-device-libs-19.0.0-rocm/amdgcn/bitcode/oclc_wavefrontsize64_off.bc" "/rusttest/amdgpu-rs/amdgpu-device-libs-build/util32.bc" "--undefined-version" "--no-gc-sections"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: rust-lld: warning: /rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/libamdgpu_device_libs-b3e559c59352d88b.rlib: archive member 'lib.rmeta' is neither ET_REL nor LLVM bitcode
          rust-lld: warning: /rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/liballoc-cc2fa346781d4ed5.rlib: archive member 'lib.rmeta' is neither ET_REL nor LLVM bitcode
          rust-lld: warning: /rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/libcore-595a027f1d29f784.rlib: archive member 'lib.rmeta' is neither ET_REL nor LLVM bitcode
          rust-lld: warning: /rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/libcompiler_builtins-4c3afce4ad3c6e2b.rlib: archive member 'lib.rmeta' is neither ET_REL nor LLVM bitcode
          rust-lld: error: incompatible xnack: /rusttest/amdgpu-rs/examples/println/target/amdgcn-amd-amdhsa/release/deps/println.elf.lto.liballoc-cc2fa346781d4ed5.rlib(alloc-cc2fa346781d4ed5.alloc.11bdb97dbc2d79b2-cgu.5.rcgu.o at 181598).o


warning: `println` (lib) generated 1 warning (1 duplicate)
error: could not compile `println` (lib) due to 1 previous error; 1 warning emitted

Change 2

Emitting raw bitcode files with || sess.target.requires_lto (and leaving obj_is_bitcode=false for amdgpu):

The vector_copy example compiles successfully but trying to run fails because it compiled for the wrong target (requested gfx1010 but got gfx700).

$ cd examples/vector_copy
$ CARGO_BUILD_RUSTFLAGS='-Ctarget-cpu=gfx1010 -Ctarget-feature=-xnack-support -Clink-arg=--undefined-version -Clink-arg=--no-gc-sections' cargo +stage1 build --release
$ llvm-readelf -a target/amdgcn-amd-amdhsa/release/vector_copy.elf | rg target
amdhsa.target:   amdgcn-amd-amdhsa--gfx700

examples/println works with this change.

Click to see detailed output
$ cd examples/println
$ CARGO_BUILD_RUSTFLAGS='-Ctarget-cpu=gfx1010 -Ctarget-feature=-xnack-support' cargo +stage1 build --release
$ llvm-readelf -a target/amdgcn-amd-amdhsa/release/vector_copy.elf | rg target
amdhsa.target:   amdgcn-amd-amdhsa--gfx1010
$ ../default-cpu/target/debug/default-cpu target/amdgcn-amd-amdhsa/release/println.elf
# Default output of running the example
PASSED!
Free
Finished

@bjorn3
Member

bjorn3 commented Jan 1, 2026

I see.

I’m fixing this by removing the .kd symbol workaround (wrapping the symbol_export::extend_exported_symbols call in if false) and adding link-args to not remove the .kd symbol.

This kinda makes sense. I'm guessing currently the linker sees the .kd symbol in the object file and links it, while with bitcode only files, the linker doesn't know about the .kd symbol until after LTO is done, which is after determining which symbols to export.

but trying to run fails because it compiled for the wrong target (requested gfx1010 but got gfx700).

The LTO linker plugin should know the correct target cpu thanks to -plugin-opt=mcpu=gfx1010. Can you try -Zbuild-std? Maybe the linker plugin gets confused when it sees object files with different target cpus? Without -Zbuild-std the standard library is probably compiled for gfx700.

The println example fails to compile in a somewhat similar way; apparently the xnack flags are not consistent between compiled objects.

The LLVM docs seem to suggest that xnack must be consistent between all object files, which would include the standard library.
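
The consistency requirement mentioned above can be pictured with a toy check. This models only the claim that every object must agree on xnack. It is not how rust-lld implements the check, and the `Xnack` enum, the `Object` struct, `first_xnack_mismatch`, and the settings assigned to each object are all hypothetical.

```rust
/// Hypothetical two-state view of the xnack setting of one object file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Xnack {
    On,
    Off,
}

/// Hypothetical description of one linker input.
struct Object<'a> {
    name: &'a str,
    xnack: Xnack,
}

/// Returns the name of the first object whose xnack setting differs from
/// that of the first object, or `None` if all objects agree.
fn first_xnack_mismatch<'a>(objects: &[Object<'a>]) -> Option<&'a str> {
    let first = objects.first()?;
    objects
        .iter()
        .find(|o| o.xnack != first.xnack)
        .map(|o| o.name)
}

fn main() {
    // Illustrative settings only; the thread does not say which object
    // carried which setting.
    let objects = [
        Object { name: "println", xnack: Xnack::Off },
        Object { name: "core", xnack: Xnack::Off },
        Object { name: "alloc", xnack: Xnack::On },
    ];
    match first_xnack_mismatch(&objects) {
        Some(name) => println!("incompatible xnack: {name}"),
        None => println!("all objects agree"),
    }
}
```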

@Flakebi
Contributor Author

Flakebi commented Jan 1, 2026

the linker doesn't know about the .kd symbol until after LTO is done, which is after determining which symbols to export.

Yeah, there’s an open LLVM issue for that, it’s linked somewhere in the amdgpu tracking issue.

Can you try -Zbuild-std? Maybe the linker plugin gets confused when it sees object files with different target cpus? Without -Zbuild-std the standard library is probably compiled for gfx700.

The amdgpu target forces -Zbuild-std (rustc errors out if -Ctarget-cpu is not set explicitly), so this is all with build-std. (Also note that it works fine without this patch and lto=true, or with the current version of this patch.)

@Flakebi
Contributor Author

Flakebi commented Mar 17, 2026

Friendly ping for review

@Flakebi
Contributor Author

Flakebi commented Apr 26, 2026

@bjorn3, is this fix something you could get merged?

Would it be worth having another rust-timer run first? (I see this is currently labelled with perf-regression, but that comes from an earlier version of this PR.)

@bjorn3
Member

bjorn3 commented Apr 26, 2026

I figured @Mark-Simulacrum would approve given that they are assigned. I strongly doubt the current state will cause a perf regression.

@bors r+

@rust-bors
Contributor

rust-bors Bot commented Apr 26, 2026

📌 Commit 842c087 has been approved by bjorn3

It is now in the queue for this repository.

@rust-bors rust-bors Bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 26, 2026
@bjorn3 bjorn3 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. perf-regression Performance regression. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 26, 2026
@Mark-Simulacrum
Member

For the future, I don't look at waiting-on-author PRs at all today, so I wouldn't have seen this. I'd recommend updating labels (e.g., @rustbot ready).

JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 26, 2026
Fix requires_lto targets needing lto set in cargo

Targets that set `requires_lto = true` were not actually using lto when compiling with cargo by default. They needed an extra `lto = true` in `Cargo.toml` to work.

Fix this by letting lto take precedence over the `embed_bitcode` flag when lto is required by a target.

If both these flags would be supplied by the user, an error is generated. However, this did not happen when lto was requested by the target instead of the user.

Fixes rust-lang#148514
Tracking issue: rust-lang#135024
@Flakebi
Contributor Author

Flakebi commented Apr 26, 2026

Oops, I didn’t see that the label was pointing to me, sorry. Thanks for approving!

JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Apr 26, 2026
Fix requires_lto targets needing lto set in cargo

Targets that set `requires_lto = true` were not actually using lto when compiling with cargo by default. They needed an extra `lto = true` in `Cargo.toml` to work.

Fix this by letting lto take precedence over the `embed_bitcode` flag when lto is required by a target.

If both these flags would be supplied by the user, an error is generated. However, this did not happen when lto was requested by the target instead of the user.

Fixes rust-lang#148514
Tracking issue: rust-lang#135024
rust-bors Bot pushed a commit that referenced this pull request Apr 26, 2026
…uwer

Rollup of 9 pull requests

Successful merges:

 - #149624 (Fix requires_lto targets needing lto set in cargo)
 - #152443 (NVPTX: Drop support for old architectures and old ISAs)
 - #155317 (`std::io::Take`: Clarify & optimize `BorrowedBuf::set_init` usage.)
 - #155588 (Implement more traits for FRTs)
 - #155682 (Add boxing suggestions for `impl Trait` return type mismatches)
 - #155770 (Avoid misleading closure return type note)
 - #155818 (Convert attribute `FinalizeFn` to fn pointer)
 - #155829 (rustc_attr_parsing: use a `try {}` in `or_malformed`)
 - #155835 (couple of `crate_name` cleanups)
rust-bors Bot pushed a commit that referenced this pull request Apr 27, 2026
Rollup of 12 pull requests

Successful merges:

 - #149624 (Fix requires_lto targets needing lto set in cargo)
 - #155317 (`std::io::Take`: Clarify & optimize `BorrowedBuf::set_init` usage.)
 - #155579 (Make Rcs and Arcs use pointer comparison for unsized types)
 - #155588 (Implement more traits for FRTs)
 - #155708 (Fix heap overflow in slice::join caused by misbehaving Borrow)
 - #155778 (Avoid Vec allocation in TyCtxt::mk_place_elem)
 - #151014 (std: sys: process: uefi: Add program searching)
 - #155682 (Add boxing suggestions for `impl Trait` return type mismatches)
 - #155770 (Avoid misleading closure return type note)
 - #155818 (Convert attribute `FinalizeFn` to fn pointer)
 - #155829 (rustc_attr_parsing: use a `try {}` in `or_malformed`)
 - #155835 (couple of `crate_name` cleanups)
@rust-bors rust-bors Bot merged commit f8e3af4 into rust-lang:main Apr 27, 2026
11 checks passed
@rustbot rustbot added this to the 1.97.0 milestone Apr 27, 2026
rust-timer added a commit that referenced this pull request Apr 27, 2026
Rollup merge of #149624 - Flakebi:fix-lto, r=bjorn3

Fix requires_lto targets needing lto set in cargo

Targets that set `requires_lto = true` were not actually using lto when compiling with cargo by default. They needed an extra `lto = true` in `Cargo.toml` to work.

Fix this by letting lto take precedence over the `embed_bitcode` flag when lto is required by a target.

If both these flags would be supplied by the user, an error is generated. However, this did not happen when lto was requested by the target instead of the user.

Fixes #148514
Tracking issue: #135024
@Flakebi Flakebi deleted the fix-lto branch April 27, 2026 06:23

Labels

A-run-make Area: port run-make Makefiles to rmake.rs A-tidy Area: The tidy tool S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

amdgcn-amd-amdhsa target broken? Fails to build core

9 participants