Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CL] Make device queries more robust #2032

Merged
merged 2 commits into from
Sep 5, 2024

Conversation

al42and
Copy link
Contributor

@al42and al42and commented Aug 29, 2024

Previously, non-standard device queries (e.g., USM) were submitted to OpenCL devices without checking whether the device supports the extension. As a result, an exception was thrown from sycl::get_info.

Now, we more gracefully handle the case when OpenCL device does not have the extension needed:

  • UR_DEVICE_INFO_USM_*_SUPPORT: Return all-false when the device does not support USM at all.
  • UR_DEVICE_INFO_SUB_GROUP_SIZES_INTEL: Return {1}, like for the host device which also has no sub-groups.
  • UR_DEVICE_INFO_IP_VERSION: Return UR_RESULT_ERROR_UNSUPPORTED_ENUMERATION instead of UR_RESULT_ERROR_INVALID_VALUE.

Also, fix the description of SUB_GROUP_SIZES_INTEL info query, since UR supports it on NVIDIA and AMD GPUs too.

Previously, non-standard device queries (e.g., USM) were submitted to
OpenCL devices without checking whether the device supports the
extension. As a result, an exception was thrown from sycl::get_info.
@al42and al42and requested review from a team as code owners August 29, 2024 15:55
@al42and
Copy link
Contributor Author

al42and commented Aug 29, 2024

Before:

$ sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.30049.600000]
[opencl:gpu][opencl:0] Clover, AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic) OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA [24.2.0 - kisak-mesa PPA]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:gpu][opencl:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:cpu][opencl:3] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 8.6 [CUDA 12.6]
[hip:gpu][hip:0] AMD HIP BACKEND, AMD Radeon RX 6400 gfx1034 [HIP 60140.9]

Platforms: 7
Platform [#1]:
    Version  : 1.3
    Name     : Intel(R) oneAPI Unified Runtime over Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 2
        Device [#0]:
        Type              : gpu
        Version           : 12.55.8
        Name              : Intel(R) Arc(TM) A770 Graphics
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.30049.600000
        UUID              : 13412816086800030000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_acm_g10
        Device [#1]:
        Type              : gpu
        Version           : 12.2.0
        Name              : Intel(R) UHD Graphics 770
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.30049.600000
        UUID              : 134128128701200002000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_adl_s
Platform [#2]:
    Version  : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
    Name     : Clover
    Vendor   : Mesa
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
        Name              : AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic)
        Vendor            : AMD
        Driver            : 24.2.0 - kisak-mesa PPA
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp64 online_compiler online_linker queue_profilingSYCL Exception encountered: Native API failed. Native API returns: 4 (UR_RESULT_ERROR_INVALID_VALUE)

After:

$ sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.3.30049.600000]
[opencl:gpu][opencl:0] Clover, AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic) OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA [24.2.0 - kisak-mesa PPA]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:gpu][opencl:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 770 OpenCL 3.0 NEO  [24.26.30049.6]
[opencl:cpu][opencl:3] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[cuda:gpu][cuda:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 8.6 [CUDA 12.6]
[hip:gpu][hip:0] AMD HIP BACKEND, AMD Radeon RX 6400 gfx1034 [HIP 60140.9]

Platforms: 7
Platform [#1]:
    Version  : 1.3
    Name     : Intel(R) oneAPI Unified Runtime over Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 2
        Device [#0]:
        Type              : gpu
        Version           : 12.55.8
        Name              : Intel(R) Arc(TM) A770 Graphics
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.30049.600000
        UUID              : 13412816086800030000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_acm_g10
        Device [#1]:
        Type              : gpu
        Version           : 12.2.0
        Name              : Intel(R) UHD Graphics 770
        Vendor            : Intel(R) Corporation
        Driver            : 1.3.30049.600000
        UUID              : 134128128701200002000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_adl_s
Platform [#2]:
    Version  : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
    Name     : Clover
    Vendor   : Mesa
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : OpenCL 1.1 Mesa 24.2.0 - kisak-mesa PPA
        Name              : AMD Radeon RX 6400 (radeonsi, navi24, LLVM 15.0.7, DRM 3.57, 6.8.0-40-generic)
        Vendor            : AMD
        Driver            : 24.2.0 - kisak-mesa PPA
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp64 online_compiler online_linker queue_profiling atomic64 ext_oneapi_srgb ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
        info::device::sub_group_sizes: 1
        Architecture: unknown
Platform [#3]:
    Version  : OpenCL 3.0 
    Name     : Intel(R) OpenCL Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#1]:
        Type              : gpu
        Version           : OpenCL 3.0 NEO 
        Name              : Intel(R) Arc(TM) A770 Graphics
        Vendor            : Intel(R) Corporation
        Driver            : 24.26.30049.6
        UUID              : 13412816086800030000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_private_alloca
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_acm_g10
Platform [#4]:
    Version  : OpenCL 3.0 
    Name     : Intel(R) OpenCL Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#2]:
        Type              : gpu
        Version           : OpenCL 3.0 NEO 
        Name              : Intel(R) UHD Graphics 770
        Vendor            : Intel(R) Corporation
        Driver            : 24.26.30049.6
        UUID              : 134128128701200002000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_srgb ext_intel_device_id ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
        info::device::sub_group_sizes: 8 16 32
        Architecture: intel_gpu_adl_s
Platform [#5]:
    Version  : OpenCL 3.0 LINUX
    Name     : Intel(R) OpenCL
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#3]:
        Type              : cpu
        Version           : OpenCL 3.0 (Build 0)
        Name              : 12th Gen Intel(R) Core(TM) i9-12900K
        Vendor            : Intel(R) Corporation
        Driver            : 2024.18.7.0.11_160000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : cpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_oneapi_srgb ext_oneapi_native_assert ext_intel_legacy_image ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
        info::device::sub_group_sizes: 4 8 16 32 64
        Architecture: x86_64
Platform [#6]:
    Version  : CUDA 12.6
    Name     : NVIDIA CUDA BACKEND
    Vendor   : NVIDIA Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : 8.6
        Name              : NVIDIA GeForce RTX 3060
        Vendor            : NVIDIA Corporation
        Driver            : CUDA 12.6
        UUID              : 115971881914139240342461923712712616593242
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_oneapi_cuda_async_barrier ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_widthImages are not fully supported by the CUDA BE, their support is disabled by default. Their partial support can be activated by setting SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT environment variable at runtime.
 ext_oneapi_bindless_images ext_oneapi_bindless_images_shared_usm ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_oneapi_external_memory_import ext_oneapi_external_semaphore_import ext_oneapi_mipmap ext_oneapi_mipmap_anisotropy ext_oneapi_mipmap_level_reference ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_cubemap ext_oneapi_cubemap_seamless_filtering ext_oneapi_bindless_sampled_image_fetch_1d_usm ext_oneapi_bindless_sampled_image_fetch_2d_usm ext_oneapi_bindless_sampled_image_fetch_2d ext_oneapi_bindless_sampled_image_fetch_3d ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_image_array ext_oneapi_unique_addressing_per_dim ext_oneapi_bindless_images_sample_1d_usm ext_oneapi_bindless_images_sample_2d_usm
        info::device::sub_group_sizes: 32
        Architecture: nvidia_gpu_sm_86
Platform [#7]:
    Version  : HIP 60140.9
    Name     : AMD HIP BACKEND
    Vendor   : AMD Corporation
    Devices  : 1
        Device [#0]:
        Type              : gpu
        Version           : gfx1034
        Name              : AMD Radeon RX 6400
        Vendor            : AMD Corporation
        Driver            : HIP 60140.9
        UUID              : 888800000000000000
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuid ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_queue_profiling_tag
        info::device::sub_group_sizes: 32
        Architecture: amd_gpu_gfx1034
default_selector()      : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
accelerator_selector()  : No device of requested type available.
cpu_selector()          : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
gpu_selector()          : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(gpu)    : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) A770 Graphics 12.55.8 [1.3.30049.600000]
custom_selector(cpu)    : cpu, Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
custom_selector(acc)    : No device of requested type available.

@kbenzie kbenzie added the v0.10.x Include in the v0.10.x release label Aug 30, 2024
@omarahmed1111
Copy link
Contributor

@al42and Could you please create intel/llvm PR for this to merge it? Thanks!

al42and added a commit to al42and/llvm that referenced this pull request Sep 4, 2024
@al42and
Copy link
Contributor Author

al42and commented Sep 4, 2024

@omarahmed1111: Forgot about that. Done: intel/llvm#15286

omarahmed1111 pushed a commit to al42and/llvm that referenced this pull request Sep 5, 2024
@omarahmed1111 omarahmed1111 merged commit e1d0da8 into oneapi-src:main Sep 5, 2024
59 of 72 checks passed
omarahmed1111 pushed a commit to al42and/llvm that referenced this pull request Sep 5, 2024
@al42and al42and deleted the more-robust-opencl branch September 5, 2024 13:56
omarahmed1111 added a commit to omarahmed1111/unified-runtime that referenced this pull request Sep 10, 2024
@kbenzie kbenzie removed the v0.10.x Include in the v0.10.x release label Sep 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready to merge Added to PR's which are ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants