
Replace uses of __CUDA_ARCH__ and __NVCOMPILER_CUDA_ARCH__ for compile time target version checks #976

Open
brycelelbach opened this issue Mar 29, 2021 · 1 comment
Labels
libcu++ For all items related to libcu++


brycelelbach commented Mar 29, 2021

We currently use __CUDA_ARCH__/__NVCOMPILER_CUDA_ARCH__ in a few places that are difficult to replace with if target.

Possible solutions:

  • Don't emit an error for older SMs with NVC++. This would lead to (possibly cryptic) compile-time failures in some cases and runtime failures in others.
  • Add some sort of compile-time "do all targets provide"/"does any target provide" mechanism to <nv/target> that uses NV_TARGET_SM_INTEGER_LIST to detect whether any of the SMs in the list fail to meet the requirements of the feature (see the sketch after this list). This would require some preprocessor logic.
  • Add some sort of static_assert_target facility to NVC++. This wouldn't solve the case of the memcpy_async overloads that should only be present for newer targets.
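
To make the second and third options concrete, here is a rough sketch. NV_IF_ELSE_TARGET and NV_PROVIDES_SM_70 are real <nv/target> macros; NV_ANY_TARGET_PROVIDES_SM_70 is a hypothetical name for the proposed "does any target provide" preprocessor check and does not exist today.

// Sketch, not an existing API: contrasts what <nv/target> can do now with
// what the proposed compile-time check would add.
#include <nv/target>
#include <cstdio>

__host__ __device__ void use_feature()
{
  // Already possible: pick a code path per target inside a function body,
  // without spelling __CUDA_ARCH__ anywhere.
  NV_IF_ELSE_TARGET(NV_PROVIDES_SM_70,
    (printf("sm_70+ path\n");),
    (printf("host / pre-sm_70 fallback\n");));
}

// Not possible today: making a declaration disappear when no target can use
// it, which is what the memcpy_async overloads would need. A preprocessor
// check over NV_TARGET_SM_INTEGER_LIST could enable something like:
//
//   #if NV_ANY_TARGET_PROVIDES_SM_70            // hypothetical
//   void overload_only_available_on_sm_70();    // declared only when usable
//   #endif
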
@jrhemstad added the thrust and libcu++ labels and removed the thrust label Feb 22, 2023
@alliepiper alliepiper removed their assignment May 1, 2023
@jarmak-nv jarmak-nv transferred this issue from NVIDIA/libcudacxx Nov 8, 2023

mfbalin commented Mar 29, 2024

We compile our library for targets ranging from sm_35 up to sm_90. However, simply including the cuda atomic header results in a compile-time error. How can we compile our code so that this code path is enabled only for suitable targets? We use the cuCollections library, which includes the cuda atomic headers automatically. Since we use the static_map from cuCollections in host code, I don't know how to get around this limitation.

The PR where we run into this problem: dmlc/dgl#7239

In file included from /home/ubuntu/jenkins/workspace/dgl_PR-7239/third_party/cccl/libcudacxx/include/cuda/std/detail/libcxx/include/atomic:733,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/third_party/cccl/libcudacxx/include/cuda/std/atomic:18,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/graphbolt/../third_party/cccl/libcudacxx/include/cuda/atomic:14,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/graphbolt/../third_party/cuco/include/cuco/detail/open_addressing/kernels.cuh:22,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/graphbolt/../third_party/cuco/include/cuco/detail/open_addressing/open_addressing_impl.cuh:21,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/graphbolt/../third_party/cuco/include/cuco/static_map.cuh:21,
                 from /home/ubuntu/jenkins/workspace/dgl_PR-7239/graphbolt/src/cuda/unique_and_compact_impl.cu:14:
/home/ubuntu/jenkins/workspace/dgl_PR-7239/third_party/cccl/libcudacxx/include/cuda/std/detail/libcxx/include/support/atomic/atomic_cuda.h:12:4: error: #error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."
   12 | #  error "CUDA atomics are only supported for sm_60 and up on *nix and sm_70 and up on Windows."
      |    ^~~~~

@PointKernel How can I make use of static_map for sm_70 and above when I compile for targets ranging from sm_35 to sm_90?
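
A minimal sketch of the kind of per-architecture guard in question, assuming a translation unit where everything that names cuco or <cuda/atomic> types can live inside the guard; the function name is hypothetical, and note that this reintroduces exactly the __CUDA_ARCH__ check this issue is about replacing:

// Sketch only: skip the cuco/libcu++ includes in device passes for
// architectures below sm_60 so the #error above is never reached.
// Anything that names cuco types must sit inside the same guard.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 600
#include <cuco/static_map.cuh>  // pulls in <cuda/atomic>

void build_map_on_supported_archs()  // hypothetical helper
{
  // ... host-side code that constructs and queries the cuco::static_map ...
}
#endif
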
