
Building repo, getting "undefined symbol: setupterm" #7

Open
dbeckwith opened this issue Nov 23, 2021 · 34 comments
Labels
bug Something isn't working C-rustc_codegen_nvvm Category: the NVVM Rustc codegen

Comments

@dbeckwith

Bear with me here: I'm on NixOS, so installing the dependencies has been a journey. I've cloned this repo and am just trying to run cargo build. I've gotten as far as installing CUDA and OptiX, to the point where it's actually building the path_tracer crate, but now I'm getting some scary codegen errors from rustc:

error: failed to run custom build command for `path_tracer v0.1.0 ($REPO_ROOT/examples/cuda/cpu/path_tracer)`

Caused by:
  process didn't exit successfully: `$REPO_ROOT/target/debug/build/path_tracer-970d6a9b9c38170f/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=../../gpu/path_tracer_gpu

  --- stderr
  warning: $REPO_ROOT/crates/cust/Cargo.toml: `default-features = [".."]` was found in [features]. Did you mean to use `default = [".."]`?
  error: failed to run `rustc` to learn about target-specific information

  Caused by:
    process didn't exit successfully: `rustc - --crate-name ___ --print=file-names -Zcodegen-backend=$REPO_ROOT/target/debug/deps/librustc_codegen_nvvm.so -Cllvm-args=-arch=compute_61 --target nvptx64-nvidia-cuda --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro --print=sysroot --print=cfg` (exit status: 1)
    --- stderr
    error: couldn't load codegen backend "$REPO_ROOT/target/debug/deps/librustc_codegen_nvvm.so": "$REPO_ROOT/target/debug/deps/librustc_codegen_nvvm.so: undefined symbol: setupterm"

  thread 'main' panicked at 'Did not find output file in rustc output', crates/cuda_builder/src/lib.rs:444:10
  stack backtrace:
     0: rust_begin_unwind
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/std/src/panicking.rs:517:5
     1: core::panicking::panic_fmt
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/core/src/panicking.rs:100:14
     2: core::panicking::panic_display
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/core/src/panicking.rs:64:5
     3: core::option::expect_failed
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/core/src/option.rs:1637:5
     4: core::option::Option<T>::expect
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/core/src/option.rs:708:21
     5: cuda_builder::get_last_artifact
               at $REPO_ROOT/crates/cuda_builder/src/lib.rs:432:16
     6: cuda_builder::invoke_rustc
               at $REPO_ROOT/crates/cuda_builder/src/lib.rs:417:20
     7: cuda_builder::CudaBuilder::build
               at $REPO_ROOT/crates/cuda_builder/src/lib.rs:238:20
     8: build_script_build::main
               at ./build.rs:4:5
     9: core::ops::function::FnOnce::call_once
               at /rustc/4e89811b46323f432544f9c4006e40d5e5d7663f/library/core/src/ops/function.rs:227:5
  note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
warning: build failed, waiting for other jobs to finish...
error: build failed

Any tips on what to do here?

Versions:

  • CUDA 11.4.2
  • OptiX 7.3.0
  • LLVM 7.1.0
  • NVidia driver 470.63.01
  • rustc 1.57.0-nightly (4e89811b4 2021-10-16)
@RDambrosio016
Member

RDambrosio016 commented Nov 23, 2021

I've been made aware of this issue before, but I'm not quite sure what the best way of solving it is. Apparently this sometimes happens with LLVM and we need to pass -ltinfo to get it to link in terminfo. However, I've also seen reports of this causing other link failures...
@anderslanglands what do you think? Have you had this issue?
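
For illustration, a minimal sketch (my assumption, not the repo's actual code) of how a build.rs could force that link: Cargo picks up `cargo:rustc-link-lib` directives printed by a build script, and `tinfo` is the terminfo library shipped with ncurses on most distros.

```rust
// Hypothetical build.rs helper: emit the Cargo directive that links
// libtinfo so LLVM's `setupterm` reference can resolve at load time.
fn tinfo_link_directive() -> String {
    // Cargo parses lines of this exact form from build-script stdout.
    format!("cargo:rustc-link-lib=dylib={}", "tinfo")
}

fn main() {
    println!("{}", tinfo_link_directive());
}
```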

@RDambrosio016 RDambrosio016 added bug Something isn't working C-rustc_codegen_nvvm Category: the NVVM Rustc codegen labels Nov 23, 2021
@anderslanglands
Collaborator

anderslanglands commented Nov 24, 2021 via email

@RDambrosio016
Member

Shouldn't this be fixed by linking in system-libs? Why does LLVM not include that?

@dbeckwith
Author

Is libtinfo something I might need to install?

@dbeckwith
Author

I guess it's part of ncurses, right? Should I try installing that? I'm on NixOS, so I don't have many common libs installed system-wide by default.

@anderslanglands
Collaborator

anderslanglands commented Nov 24, 2021 via email

@anderslanglands
Collaborator

Hmm, this is what link_llvm_system_libs() does in rustc_codegen_nvvm/build.rs. It surprises me that you're getting a missing-symbol error rather than a library-not-found error. Could you check:

  1. What the output of llvm-config --system-libs gives you, and
  2. Check that link_llvm_system_libs() is actually being called in your build (stick a panic in or something)

@dbeckwith
Author

When I run llvm-config --system-libs I get no output at all. What's the expected output from that?

@anderslanglands
Collaborator

It should tell you which system libs LLVM is linked against. For instance, for my build of LLVM on Ubuntu 18.04 it gives me:

REZ➞  llvm-config  --system-libs
-lz -lrt -ldl -ltinfo -lpthread -lm -lxml2
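
As an aside, a simplified sketch (an assumption on my part, not the crate's real link_llvm_system_libs()) of how that flag list could be turned into Cargo link directives:

```rust
// Turn `llvm-config --system-libs` output such as "-lz -ltinfo" into
// cargo:rustc-link-lib directives; flags without a -l prefix are
// ignored in this simplified version.
fn system_lib_directives(output: &str) -> Vec<String> {
    output
        .split_whitespace()
        .filter_map(|flag| flag.strip_prefix("-l"))
        .map(|lib| format!("cargo:rustc-link-lib=dylib={}", lib))
        .collect()
}

fn main() {
    for directive in system_lib_directives("-lz -lrt -ldl -ltinfo -lpthread -lm -lxml2") {
        println!("{}", directive);
    }
}
```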

@anderslanglands
Collaborator

How did you install llvm?

@dbeckwith
Author

I installed it from nixpkgs, NixOS's package repository. There are both an llvm and a libllvm package, but they give me the same results.

@dbeckwith
Author

It's possible I've just not installed LLVM properly, I can look into that and try again once I have llvm-config --system-libs showing something.

@dbeckwith
Author

Can I ask how you installed LLVM such that those system libs show up? What Debian package would you use?

@anderslanglands
Collaborator

anderslanglands commented Nov 24, 2021 via email

@dbeckwith
Author

I tried installing the llvm-7-dev APT package on Ubuntu 18.04, and it also gives no output for llvm-config --system-libs. Is there some other build-time flag needed to get that to work?

Either way, it's starting to look like a standard install of LLVM doesn't report system libs this way. Maybe link_llvm_system_libs() should find a different way to discover these libs?

@RDambrosio016
Member

Someone told me that LLVM only reports system libs when linking statically. I used a lot of rustc's build.rs logic, so we inherited a bit of static vs. shared linking stuff. I think we should always link statically and use llvm-config --link-static --system-libs

@anderslanglands
Collaborator

anderslanglands commented Nov 25, 2021 via email

@dbeckwith
Author

llvm-config --link-static --system-libs gives me -lz -lrt -ldl -ltinfo -lpthread -lm -lxml2

@RDambrosio016
Member

Then I think we should remove the dynamic linking stuff and just always link statically; dynamically linking in the codegen doesn't make much sense.

@dbeckwith
Author

I edited the codegen crate build script to use --link-static and now I'm getting a different error from cc that it can't find the cuda library. I noticed in the error output that the cc flags include -L $CUDA_PATH/lib64, but my libcuda.so is in $CUDA_PATH/lib/stubs. I saw this TODO, maybe that logic needs tweaking?

@Stupremee

What happens if you set LLVM_LINK_SHARED=1 when building?

I had the same issue and fixed it by setting this environment variable.

@dbeckwith
Author

dbeckwith commented Nov 25, 2021

Hmm that might not be right actually. After tweaking a couple other things (it couldn't find -lxml2 so I just manually set the list of libs in link_llvm_system_libs() for now), everything builds fine, but the final binary links to libcuda.so.1 which I don't actually have. My $CUDA_PATH/lib/stubs folder only has libcuda.so and not libcuda.so.1. I'm not sure why these libs are in a folder named "stubs". $CUDA_PATH/lib looks like a more normal libs folder with lots of symlinks, but doesn't contain a libcuda.so.

@dbeckwith
Author

dbeckwith commented Nov 25, 2021

LLVM_LINK_SHARED=1 does work for me as well. Now the only issue I have is not finding libcuda.

@Stupremee

Would you mind sharing the NixOS shell configuration that you use for Rust-CUDA? Even though it doesn't work yet, I would be interested anyway.

@dbeckwith
Author

dbeckwith commented Nov 25, 2021

@Stupremee https://gist.github.com/dbeckwith/bc3baade147ebff905a72c434812053d

There's no OptiX package in nixpkgs so I had to include a bespoke package for it. Unfortunately I couldn't find any public download URLs for it so you have to make an NVidia account, sign up for the developer program, download it manually, and add it to the Nix store (instructions are in the derivation). Also, fair warning that the CUDA package is a 3.8 GB download so it'll hang for a while.

@RDambrosio016
Member

0.2 should work; I switched to cuda-sys' Linux handling logic, which should be more robust, and it should not fail to find CUDA now.

@dbeckwith
Author

Thanks for the update, but I'm still seeing the following issues:

  1. The original error in this thread still occurs. I can get around it by patching detect_llvm_link to always return ("dylib", "--link-shared") so it at least gets past the build step.
  2. The built binary is still linking to libcuda.so.1 which doesn't exist in my installation. I only have $CUDA_PATH/lib64/stubs/libcuda.so. As far as I can tell the Nix package installer doesn't do anything that might remove libcuda.so.1. Does this file exist in the standard Linux installation? Is there a way to change the linker to link to just libcuda.so? It's possible this is only an issue when detect_llvm_link returns dylib, so maybe fixing 1. will fix this as well?
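
For what it's worth, one hedged workaround sketch (my own assumption, not project code) for the stubs situation in point 2: have the build script point the linker at the stubs directory, so link-time resolution succeeds against the stub libcuda.so even when the real driver library is absent. The stub only satisfies the linker; at runtime the real driver must still be present.

```rust
// Hypothetical build.rs helper: emit search-path and link directives
// for CUDA's stub driver library under $CUDA_PATH/lib64/stubs.
fn cuda_stub_directives(cuda_path: &str) -> [String; 2] {
    [
        format!("cargo:rustc-link-search=native={}/lib64/stubs", cuda_path),
        "cargo:rustc-link-lib=dylib=cuda".to_string(),
    ]
}

fn main() {
    for directive in cuda_stub_directives("/opt/cuda") {
        println!("{}", directive);
    }
}
```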

@dbeckwith
Author

Sorry I forgot that 1. can be fixed by setting LLVM_LINK_SHARED=1 while building, although I'm not sure that's a permanent solution, but 2. is still an issue.

@1617176084

Sorry I forgot that 1. can be fixed by setting LLVM_LINK_SHARED=1 while building, although I'm not sure that's a permanent solution, but 2. is still an issue.

--link-shared and LLVM_LINK_SHARED=1: where should I input these two? Could you tell me the steps?

@dbeckwith
Author

--link-shared and LLVM_LINK_SHARED=1: where should I input these two? Could you tell me the steps?

It's an environment variable, so for example:

$ LLVM_LINK_SHARED=1 cargo build

or:

$ export LLVM_LINK_SHARED=1
$ cargo run --bin path_tracer

Setting this environment variable will force the build script to use --link-shared in the LLVM args:

fn detect_llvm_link() -> (&'static str, &'static str) {
    // Force the link mode we want, preferring static by default, but
    // possibly overridden by `configure --enable-llvm-link-shared`.
    if tracked_env_var_os("LLVM_LINK_SHARED").is_some() {
        ("dylib", "--link-shared")
    } else {
        ("static", "--link-static")
    }
}

@nottug

nottug commented Feb 6, 2022

I'm running on Arch using the AUR package llvm70 (which installs to /opt/llvm70), with LLVM_LINK_SHARED=1, and getting

error: couldn't load codegen backend "/PATH/TO/APPLICATION/target/debug/librustc_codegen_nvvm.so": "libLLVM-7.so: cannot open shared object file: No such file or directory"

I had to run this to resolve it (rust-lang/rust #53813).

ln -s /opt/llvm70/lib/libLLVM-7.so $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib/

Could be that my llvm isn't linked properly, but my LLVM_CONFIG is exported as /opt/llvm70/bin/llvm-config.

Versions:

  • Linux Kernel: 5.16.0-arch1-1
  • CUDA Driver Version: 495.46
  • CUDA Version: 11.5
  • LLVM: 7.0.1
  • Rust: rustc 1.59.0-nightly (532d2b14c 2021-12-03)

@RDambrosio016
Member

I think that this is all caused by trying to link LLVM dynamically; we should probably always link statically in 0.3.

@gzz2000

gzz2000 commented Feb 26, 2022

I also can confirm that LLVM_LINK_SHARED=1 works.

@Netherdrake

+1 LLVM_LINK_SHARED=1 works on Ubuntu 20.04 with llvm-7 installed via apt.
