-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CL] Make device queries more robust #2032
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Previously, non-standard device queries (e.g., USM) were submitted to OpenCL devices without checking whether the device supports the extension. As a result, an exception was thrown from sycl::get_info.
Before: $ sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.30049.600000]
[opencl:gpu][opencl:0] Clover, AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic) OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA [24.2.0 - kisak-mesa PPA]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO [24.26.30049.6]
[opencl:gpu][opencl:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO [24.26.30049.6]
[opencl:cpu][opencl:3] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 8.6 [CUDA 12.6]
[hip:gpu][hip:0] AMD HIP BACKEND, AMD Radeon RX 6400 gfx1034 [HIP 60140.9]
Platforms: 7
Platform [#1]:
Version : 1.3
Name : Intel(R) oneAPI Unified Runtime over Level-Zero
Vendor : Intel(R) Corporation
Devices : 2
Device [#0]:
Type : gpu
Version : 12.55.8
Name : Intel(R) Arc(TM) A770 Graphics
Vendor : Intel(R) Corporation
Driver : 1.3.30049.600000
UUID : 13412816086800030000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_acm_g10
Device [#1]:
Type : gpu
Version : 12.2.0
Name : Intel(R) UHD Graphics 770
Vendor : Intel(R) Corporation
Driver : 1.3.30049.600000
UUID : 134128128701200002000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_adl_s
Platform [#2]:
Version : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
Name : Clover
Vendor : Mesa
Devices : 1
Device [#0]:
Type : gpu
Version : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
Name : AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic)
Vendor : AMD
Driver : 24.2.0 - kisak-mesa PPA
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp64 online_compiler online_linker queue_profilingSYCL Exception encountered: Native API failed. Native API returns: 4 (UR_RESULT_ERROR_INVALID_VALUE) After: $ sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.30049.600000]
[opencl:gpu][opencl:0] Clover, AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic) OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA [24.2.0 - kisak-mesa PPA]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO [24.26.30049.6]
[opencl:gpu][opencl:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO [24.26.30049.6]
[opencl:cpu][opencl:3] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 8.6 [CUDA 12.6]
[hip:gpu][hip:0] AMD HIP BACKEND, AMD Radeon RX 6400 gfx1034 [HIP 60140.9]
Platforms: 7
Platform [#1]:
Version : 1.3
Name : Intel(R) oneAPI Unified Runtime over Level-Zero
Vendor : Intel(R) Corporation
Devices : 2
Device [#0]:
Type : gpu
Version : 12.55.8
Name : Intel(R) Arc(TM) A770 Graphics
Vendor : Intel(R) Corporation
Driver : 1.3.30049.600000
UUID : 13412816086800030000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_acm_g10
Device [#1]:
Type : gpu
Version : 12.2.0
Name : Intel(R) UHD Graphics 770
Vendor : Intel(R) Corporation
Driver : 1.3.30049.600000
UUID : 134128128701200002000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_adl_s
Platform [#2]:
Version : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
Name : Clover
Vendor : Mesa
Devices : 1
Device [#0]:
Type : gpu
Version : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
Name : AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic)
Vendor : AMD
Driver : 24.2.0 - kisak-mesa PPA
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp64 online_compiler online_linker queue_profiling atomic64 ext_oneapi_srgb ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
info::device::sub_group_sizes: 1
Architecture: unknown
Platform [#3]:
Version : OpenCL 3.0
Name : Intel(R) OpenCL Graphics
Vendor : Intel(R) Corporation
Devices : 1
Device [#1]:
Type : gpu
Version : OpenCL 3.0 NEO
Name : Intel(R) Arc(TM) A770 Graphics
Vendor : Intel(R) Corporation
Driver : 24.26.30049.6
UUID : 13412816086800030000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_private_alloca
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_acm_g10
Platform [#4]:
Version : OpenCL 3.0
Name : Intel(R) OpenCL Graphics
Vendor : Intel(R) Corporation
Devices : 1
Device [#2]:
Type : gpu
Version : OpenCL 3.0 NEO
Name : Intel(R) UHD Graphics 770
Vendor : Intel(R) Corporation
Driver : 24.26.30049.6
UUID : 134128128701200002000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_adl_s
Platform [#5]:
Version : OpenCL 3.0 LINUX
Name : Intel(R) OpenCL
Vendor : Intel(R) Corporation
Devices : 1
Device [#3]:
Type : cpu
Version : OpenCL 3.0 (Build 0)
Name : 12th Gen Intel(R) Core(TM) i9-12900K
Vendor : Intel(R) Corporation
Driver : 2024.18.7.0.11_160000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : cpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_oneapi_srgb ext_oneapi_native_assert ext_intel_legacy_image ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
info::device::sub_group_sizes: 4 8 16 32 64
Architecture: x86_64
Platform [#6]:
Version : CUDA 12.6
Name : NVIDIA CUDA BACKEND
Vendor : NVIDIA Corporation
Devices : 1
Device [#0]:
Type : gpu
Version : 8.6
Name : NVIDIA GeForce RTX 3060
Vendor : NVIDIA Corporation
Driver : CUDA 12.6
UUID : 115971881914139240342461923712712616593242
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_cuda_async_barrier ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_widthImages are not fully supported by the CUDA BE, their support is disabled by default. Their partial support can be activated by setting SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT environment variable at runtime.
ext_oneapi_bindless_images ext_oneapi_bindless_images_shared_usm ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_oneapi_external_memory_import ext_oneapi_external_semaphore_import ext_oneapi_mipmap ext_oneapi_mipmap_anisotropy ext_oneapi_mipmap_level_reference ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_cubemap ext_oneapi_cubemap_seamless_filtering ext_oneapi_bindless_sampled_image_fetch_1d_usm ext_oneapi_bindless_sampled_image_fetch_2d_usm ext_oneapi_bindless_sampled_image_fetch_2d ext_oneapi_bindless_sampled_image_fetch_3d ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_image_array ext_oneapi_unique_addressing_per_dim ext_oneapi_bindless_images_sample_1d_usm ext_oneapi_bindless_images_sample_2d_usm
info::device::sub_group_sizes: 32
Architecture: nvidia_gpu_sm_86
Platform [#7]:
Version : HIP 60140.9
Name : AMD HIP BACKEND
Vendor : AMD Corporation
Devices : 1
Device [#0]:
Type : gpu
Version : gfx1034
Name : AMD Radeon RX 6400
Vendor : AMD Corporation
Driver : HIP 60140.9
UUID : 888800000000000000
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_queue_profiling_tag
info::device::sub_group_sizes: 32
Architecture: amd_gpu_gfx1034
default_selector() : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
accelerator_selector() : No device of requested type available.
cpu_selector() : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
gpu_selector() : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(gpu) : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(cpu) : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
custom_selector(acc) : No device of requested type available. |
kbenzie
reviewed
Aug 29, 2024
kbenzie
approved these changes
Aug 30, 2024
@al42and Could you please create intel/llvm PR for this to merge it? Thanks! |
al42and
added a commit
to al42and/llvm
that referenced
this pull request
Sep 4, 2024
@omarahmed1111: Forgot about that. Done: intel/llvm#15286 |
omarahmed1111
pushed a commit
to al42and/llvm
that referenced
this pull request
Sep 5, 2024
omarahmed1111
pushed a commit
to al42and/llvm
that referenced
this pull request
Sep 5, 2024
omarahmed1111
added a commit
to omarahmed1111/unified-runtime
that referenced
this pull request
Sep 10, 2024
[CL] Make device queries more robust
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Previously, non-standard device queries (e.g., USM) were submitted to OpenCL devices without checking whether the device supports the extension. As a result, an exception was thrown from sycl::get_info.
Now, we more gracefully handle the case when OpenCL device does not have the extension needed:
UR_DEVICE_INFO_USM_*_SUPPORT
: Return all-false when the device does not support USM at all.UR_DEVICE_INFO_SUB_GROUP_SIZES_INTEL
: Return{1}
, like for the host device which also has no sub-groups.UR_DEVICE_INFO_IP_VERSION
: ReturnUR_RESULT_ERROR_UNSUPPORTED_ENUMERATION
instead ofUR_RESULT_ERROR_INVALID_VALUE
.Also, fix the description of
SUB_GROUP_SIZES_INTEL
info query, since UR supports it on NVIDIA and AMD GPUs too.