Enable clang to compile `<atomic>` in device code #1020

jrhemstad · 2023-05-23T22:02:15Z

We guard the <atomic> header based on the definition of _LIBCUDACXX_HAS_CUDA_ATOMIC_IMPL.

This definition currently only allows nvcc, nvrtc, or nvc++ and excludes clang. We should update it to allow clang.

Based on llvm/llvm-project#58410 (comment), we likely need to version guard this to >=clang13.

Furthermore, before clang17, <cuda/(std/)atomic> requires compiling with -fcuda-allow-variadic-functions. Hopefully there is a macro that lets us test if this is enabled so we can guard based on this as well

Perhaps we should add a new definition

#if defined(_LIBCUDACXX_COMPILER_CLANG_CUDA) && /* clang version >= 13 */ && /* clang device variadics enabled */
   #define _LIBCUDACXX_CLANG_HAS_CUDA_ATOMIC_IMPL
#endif

and update as:

#if defined(_LIBCUDACXX_COMPILER_NVCC) || defined(_LIBCUDACXX_COMPILER_NVRTC) || defined(_LIBCUDACXX_COMPILER_NVHPC) || defined(__CUDACC__) || defined(_LIBCUDACXX_CLANG_HAS_CUDA_ATOMIC_IMPL)
#  define _LIBCUDACXX_HAS_CUDA_ATOMIC_IMPL
#endif

The text was updated successfully, but these errors were encountered:

wmaxey · 2023-05-23T22:55:01Z

IIUC This change would be meant to guard against using atomics on versions of Clang where the PTX or variadic calls are not supported?

Artem-B · 2023-05-24T03:05:46Z

For things to work, users would need both support for variadics (can be enabled with old clang versions via -Xclang -fcuda-allow-variadic-functions ) and libcudacxx enabling atomics.

clang-17 will have variadics enabled by default, but the atomics change would still be useful for users of the older versions. Clang packages in the field tend to be updated rather slowly.

jrhemstad · 2023-05-24T15:25:15Z

Ah, I missed the part about needing variadics enabled.

Is there macro to test if -fcuda-allow-variadic-functions is enabled? We'd want to guard LIBCUDACXX_CLANG_HAS_CUDA_ATOMIC_IMPL on that as well.

Artem-B · 2023-05-24T15:43:31Z

Unfortunately not. Adding it now would not make sense either as clang already enables variadics.

We'd want to guard LIBCUDACXX_CLANG_HAS_CUDA_ATOMIC_IMPL on [variadics support] as well.

I do not think it's necessary. Variadics and atomics support are independent issues. We just happen to run into variadics first.
In order to work out of the box we'll need both, but there's no point to restrict atomics availability.

AustinSchuh · 2023-11-14T07:28:10Z

I just hit this trying to build cuCollections with clang 17. Since it sounds like there is no way to distinguish if clang has -fcuda-allow-variadic-functions enabled, what about starting with only enabling this for clang 17 or newer, and then following up with something which allows older clangs? I'm happy to put a patch together since I'm patching this locally anyways to make progress.

jrhemstad · 2023-11-14T15:26:43Z

@AustinSchuh do you have an example of something that still doesn't work today? I think this issue may already be resolved.

We added clang as a device code compiler to our CI a while back: #493 and looking at one of the most recent CI jobs, it looks like the atomic tests are running fine with clang, e.g., https://github.com/NVIDIA/cccl/actions/runs/6860273081/job/18653871971#step:7:307

That said, it looks like the definition of _LIBCUDACXX_HAS_CUDA_ATOMIC_IMPL hasn't changed, so I'm not 100% sure how the atomic tests are currently passing 😅

@wmaxey or @miscco could you check this out?

AustinSchuh · 2023-11-14T17:21:26Z

@jrhemstad , I don't think I'm doing anything too special. I was trying to get cuco working, but really a C++ file with the following is enough to fail:

#include <cuda/atomic>

I'm using clang 17.0.2, and Debian Bookworm's cuda installation. (I've got a sysroot built up with debootstrap plus installing the cuda libraries inside that sysroot that I'm pointing clang to, but from the digging I've done, I don't think that is causing the problem)

In file included from frc971/orin/cuda.cc:1:
In file included from external/amd64_debian_sysroot//usr/include/cuda/atomic:10:
external/amd64_debian_sysroot//usr/include/cuda/std/atomic:271:32: error: no
      member named '__atomic_thread_fence_cuda' in namespace 'cuda::std::__detail'
  271 |                 std::__detail::__atomic_thread_fence_cuda((int)__m, __detail::__thread_scope...
      |                 ~~~~~~~~~~~~~~~^
external/amd64_debian_sysroot//usr/include/nv/detail/__target_macros:483:61: note: 
      expanded from macro 'NV_DISPATCH_TARGET'
  483 | #  define NV_DISPATCH_TARGET(...)       _NV_TARGET_DISPATCH(__VA_ARGS__)
      |                                                             ^~~~~~~~~~~
external/amd64_debian_sysroot//usr/include/nv/detail/__target_macros:478:100: note: 
      expanded from macro '_NV_TARGET_DISPATCH'
  478 |   ..._NV_BLOCK_EXPAND(_NV_DISPATCH_N_ARY(_NV_TARGET_DISPATCH_HANDLE, __VA_ARGS__))
      |                                                                      ^~~~~~~~~~~
external/amd64_debian_sysroot//usr/include/nv/detail/__preprocessor:115:65: note: 
      expanded from macro '_NV_DISPATCH_N_ARY'
  115 | #define _NV_DISPATCH_N_ARY(name, ...) _NV_DISPATCH_N_IMPL(name, __VA_ARGS__)
      |                                                                 ^~~~~~~~~~~

The error above goes on quite a bit longer and repeats 3 times, but I don't think the rest of the error adds value. The problem is what is described in this ticket.

If I add -D_LIBCUDACXX_HAS_CUDA_ATOMIC_IMPL to the compilation command, it compiles successfully.

Holler if more info would be helpful.

jrhemstad · 2023-11-14T18:02:47Z

Could you try pulling the latest main and see if it still reproduces? The version of libcu++ that shipped with your CTK may not include the relevant changes.

AustinSchuh · 2023-11-15T04:17:22Z

Success! I can confirm that the latest main (once I wrangled my compiler to use it) doesn't have this problem.

jrhemstad mentioned this issue May 23, 2023

CUDA device code does not support variadic functions llvm/llvm-project#58410

Open

jarmak-nv transferred this issue from NVIDIA/libcudacxx Nov 8, 2023

AustinSchuh mentioned this issue Nov 14, 2023

[FEA] Make cuco compilable using clang NVIDIA/cuCollections#128

Open

AustinSchuh mentioned this issue Nov 14, 2023

Fix compiler warnings when compiling with clang 17. NVIDIA/cuCollections#392

Merged

jrhemstad closed this as completed Dec 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable clang to compile `<atomic>` in device code #1020

Enable clang to compile `<atomic>` in device code #1020

jrhemstad commented May 23, 2023 •

edited

Loading

wmaxey commented May 23, 2023

Artem-B commented May 24, 2023

jrhemstad commented May 24, 2023 •

edited

Loading

Artem-B commented May 24, 2023

AustinSchuh commented Nov 14, 2023

jrhemstad commented Nov 14, 2023 •

edited

Loading

AustinSchuh commented Nov 14, 2023 •

edited

Loading

jrhemstad commented Nov 14, 2023

AustinSchuh commented Nov 15, 2023

Enable clang to compile <atomic> in device code #1020

Enable clang to compile <atomic> in device code #1020

Comments

jrhemstad commented May 23, 2023 • edited Loading

wmaxey commented May 23, 2023

Artem-B commented May 24, 2023

jrhemstad commented May 24, 2023 • edited Loading

Artem-B commented May 24, 2023

AustinSchuh commented Nov 14, 2023

jrhemstad commented Nov 14, 2023 • edited Loading

AustinSchuh commented Nov 14, 2023 • edited Loading

jrhemstad commented Nov 14, 2023

AustinSchuh commented Nov 15, 2023

Enable clang to compile `<atomic>` in device code #1020

Enable clang to compile `<atomic>` in device code #1020

jrhemstad commented May 23, 2023 •

edited

Loading

jrhemstad commented May 24, 2023 •

edited

Loading

jrhemstad commented Nov 14, 2023 •

edited

Loading

AustinSchuh commented Nov 14, 2023 •

edited

Loading