Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GCC13] Missing intrinsics 2 #44645

Closed
iarspider opened this issue Apr 6, 2024 · 23 comments
Closed

[GCC13] Missing intrinsics 2 #44645

iarspider opened this issue Apr 6, 2024 · 23 comments

Comments

@iarspider
Copy link
Contributor

When testing fixes for gcc13, two more cases of identifier "__builtin_ia32_cmpccxadd" is undefined were detected:

>> Compiling alpaka/cuda src/RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.dev.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02831/el9_amd64_gcc13/external/gcc/13.2.0-1b0a3367d04f48f01ad3ccf40e55475c/lib/gcc/x86_64-redhat-linux-gnu/13.2.0/include/cmpccxaddintrin.h(63): error: identifier "__builtin_ia32_cmpccxadd" is undefined
    return __builtin_ia32_cmpccxadd (__A, __B, __C, __D);
           ^

/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02831/el9_amd64_gcc13/external/gcc/13.2.0-1b0a3367d04f48f01ad3ccf40e55475c/lib/gcc/x86_64-redhat-linux-gnu/13.2.0/include/cmpccxaddintrin.h(71): error: identifier "__builtin_ia32_cmpccxadd64" is undefined
    return __builtin_ia32_cmpccxadd64 (__A, __B, __C, __D);
           ^

2 errors detected in the compilation of "src/RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.dev.cc".

and

>> Compiling alpaka/cuda edm plugin src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitTopologyESProducer.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02831/el9_amd64_gcc13/external/gcc/13.2.0-1b0a3367d04f48f01ad3ccf40e55475c/lib/gcc/x86_64-redhat-linux-gnu/13.2.0/include/cmpccxaddintrin.h(63): error: identifier "__builtin_ia32_cmpccxadd" is undefined
    return __builtin_ia32_cmpccxadd (__A, __B, __C, __D);
           ^

/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02831/el9_amd64_gcc13/external/gcc/13.2.0-1b0a3367d04f48f01ad3ccf40e55475c/lib/gcc/x86_64-redhat-linux-gnu/13.2.0/include/cmpccxaddintrin.h(71): error: identifier "__builtin_ia32_cmpccxadd64" is undefined
    return __builtin_ia32_cmpccxadd64 (__A, __B, __C, __D);
           ^

2 errors detected in the compilation of "src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitProducerKernel.dev.cc".

Similar problem was fixed in #44595 by @makortel and @dan131riley .

@iarspider
Copy link
Contributor Author

assign heterogeneous,core

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2024

New categories assigned: heterogeneous,core

@Dr15Jones,@fwyzard,@makortel,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 6, 2024

A new Issue was created by @iarspider.

@sextonkennedy, @smuzaffar, @Dr15Jones, @rappoccio, @makortel, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@fwyzard
Copy link
Contributor

fwyzard commented Apr 8, 2024

According to the release notes, __builtin_ia32_cmpccxadd and __builtin_ia32_cmpccxadd64 should now be supported by CUDA 12.4.1 .

@fwyzard
Copy link
Contributor

fwyzard commented Apr 8, 2024

See cms-sw/cmsdist#9126 .

@iarspider
Copy link
Contributor Author

@fwyzard thanks!

In case that doesn't fix the issue, here is include chain that leads to tbb:

RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.dev.cc
RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.h
RecoParticleFlow/PFRecHitProducer/interface/PFRecHitTopologyRecord.h
Geometry/Records/interface/CaloGeometryRecord.h
FWCore/Framework/interface/EventSetupRecordImplementation.h
FWCore/Framework/interface/EventSetupRecord.h
FWCore/Framework/interface/EventSetupRecordImpl.h
FWCore/Concurrency/interface/WaitingTaskHolder.h
oneapi/tbb/task_group.h

@fwyzard
Copy link
Contributor

fwyzard commented Apr 8, 2024

thanks @iarspider, we should probably try to fix the include chain anyway.

@fwyzard
Copy link
Contributor

fwyzard commented Apr 8, 2024

Moving

#include "RecoParticleFlow/PFRecHitProducer/interface/PFRecHitTopologyRecord.h"

from RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.h to RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducer.cc should fix the issue.

@iarspider
Copy link
Contributor Author

Moving

#include "RecoParticleFlow/PFRecHitProducer/interface/PFRecHitTopologyRecord.h"

from RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.h to RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducer.cc should fix the issue.

Thanks! Could you please open a PR with this change?

@fwyzard
Copy link
Contributor

fwyzard commented Apr 8, 2024

#44656

@iarspider
Copy link
Contributor Author

iarspider commented Apr 9, 2024

@fwyzard looks like there are more cases like this: 1, 2. I will check the include chains for these two.

@fwyzard
Copy link
Contributor

fwyzard commented Apr 9, 2024

@makortel , if we want to avoid compiling (some of) the framework headers with nvcc, would it make sense to add something like

#ifdef __NVCC__
#error Framework headers should not be compiled by the device compiler
#endif

to FWCore/Concurrency/interface/WaitingTaskHolder.h and similar headers ?

@iarspider
Copy link
Contributor Author

iarspider commented Apr 9, 2024

For RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitTopologyESProducer.cc: list - a bit long this time, since I've traced all paths (but can also be not 100% accurate if some includes are guarded with #ifdef).

@iarspider
Copy link
Contributor Author

iarspider commented Apr 9, 2024

For RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu:

RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu -> RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.h -> CalibFormats/SiStripObjects/interface/SiStripClusterizerConditionsGPU.h -> HeterogeneousCore/CUDACore/interface/ESProduct.h -> HeterogeneousCore/CUDAUtilities/interface/EventCache.h -> FWCore/Utilities/interface/ReusableObjectHolder.h -> oneapi/tbb/concurrent_queue.h
RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu -> CalibFormats/SiStripObjects/interface/SiStripClusterizerConditionsGPU.h -> HeterogeneousCore/CUDACore/interface/ESProduct.h -> HeterogeneousCore/CUDAUtilities/interface/EventCache.h -> FWCore/Utilities/interface/ReusableObjectHolder.h -> oneapi/tbb/concurrent_queue.h

(this is the only tbb header used by this file, the PFRecHitTopologyESProducer.cc referenced several other tbb headers)

@fwyzard
Copy link
Contributor

fwyzard commented Apr 9, 2024

For RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitTopologyESProducer.cc: list - a bit long this time, since I've traced all paths (but can also be not 100% accurate if some includes are guarded with #ifdef).

RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitTopologyESProducer.cc should not be compiled by nvcc.

@fwyzard
Copy link
Contributor

fwyzard commented Apr 9, 2024

For RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu:

RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu -> RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.h -> CalibFormats/SiStripObjects/interface/SiStripClusterizerConditionsGPU.h -> HeterogeneousCore/CUDACore/interface/ESProduct.h -> HeterogeneousCore/CUDAUtilities/interface/EventCache.h -> FWCore/Utilities/interface/ReusableObjectHolder.h -> oneapi/tbb/concurrent_queue.h
RecoLocalTracker/SiStripClusterizer/plugins/SiStripRawToClusterGPUKernel.cu -> CalibFormats/SiStripObjects/interface/SiStripClusterizerConditionsGPU.h -> HeterogeneousCore/CUDACore/interface/ESProduct.h -> HeterogeneousCore/CUDAUtilities/interface/EventCache.h -> FWCore/Utilities/interface/ReusableObjectHolder.h -> oneapi/tbb/concurrent_queue.h

(this is the only tbb header used by this file, the PFRecHitTopologyESProducer.cc referenced several other tbb headers)

Unfortunately this does not seem trivial to disentangle.

@iarspider
Copy link
Contributor Author

iarspider commented Apr 9, 2024 via email

@iarspider
Copy link
Contributor Author

Corrected dependency chain for PFRecHitProducerKernel.dev.cc: https://gist.github.com/iarspider/f655770228e41eebbf38618f836870bd . All chains stem from RecoParticleFlow/PFRecHitProducer/interface/PFRecHitTopologyRecord.h and RecoParticleFlow/PFRecHitProducer/interface/PFRecHitParamsRecord.h

@fwyzard
Copy link
Contributor

fwyzard commented Apr 9, 2024

Can you try to edit RecoParticleFlow/PFRecHitProducer/plugins/alpaka/CalorimeterDefinitions.h and replace the include of the differente Records with forward declarations ?

@dan131riley
Copy link

@makortel , if we want to avoid compiling (some of) the framework headers with nvcc, would it make sense to add something like

#ifdef __NVCC__
#error Framework headers should not be compiled by the device compiler
#endif

to FWCore/Concurrency/interface/WaitingTaskHolder.h and similar headers ?

Trying this in 14_1_0_pre2, I get hits in

                 from src/RecoLocalCalo/HcalRecProducers/src/MahiGPU.cu:11:
                 from src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitProducerKernel.dev.cc:8:
                 from src/RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.dev.cc:9:

Full chains:

                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/FWCore/Framework/interface/DependentRecordImplementation.h:26,
                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/CondFormats/DataRecord/interface/HcalPedestalWidthsRcd.h:5,
                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/CondFormats/DataRecord/interface/HcalCombinedRecordsGPU.h:4,
                 from src/RecoLocalCalo/HcalRecProducers/src/DeclsForKernels.h:11,
                 from src/RecoLocalCalo/HcalRecProducers/src/SimpleAlgoGPU.h:4,
                 from src/RecoLocalCalo/HcalRecProducers/src/MahiGPU.cu:11:

                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/RecoParticleFlow/PFRecHitProducer/interface/PFRecHitParamsRecord.h:4,
                 from src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/CalorimeterDefinitions.h:12,
                 from src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitProducerKernel.h:6,
                 from src/RecoParticleFlow/PFRecHitProducer/plugins/alpaka/PFRecHitProducerKernel.dev.cc:8:

                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/FWCore/Framework/interface/DependentRecordImplementation.h:26,
                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/RecoParticleFlow/PFRecHitProducer/interface/PFRecHitTopologyRecord.h:4,
                 from /mnt/data1/dsr/tmp/CMSSW_14_1_0_pre2/src/RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.h:11,
                 from src/RecoParticleFlow/PFClusterProducer/plugins/alpaka/PFClusterSoAProducerKernel.dev.cc:9:

@fwyzard
Copy link
Contributor

fwyzard commented Apr 24, 2024

either #44682 or #44828 should take care of the alpaka ParticleFlow ones.

@iarspider
Copy link
Contributor Author

@cmsbuild please close

@cmsbuild cmsbuild closed this as completed May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants