-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Vertex finder with one kernel instead of 5 #413
Implement Vertex finder with one kernel instead of 5 #413
Conversation
Validation summaryReference release CMSSW_11_0_0_pre11 at 5b0a828 Validation plots/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW
/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW
/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW
Throughput plots/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53logs and
|
if (oneKernel_) { | ||
// implemented only for density clustesrs | ||
#ifdef ONE_KERNEL | ||
vertexFinderOneKernel<<<1, 1024 - 256, 0, stream>>>(soa, ws_d.get(), minT, eps, errmax, chi2max); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good motivation for the "one kernel" to be controlled both by #ifdef
and by configuration parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ONE_KERNEL controls one vs three as before ONE was done using dynamic parallelism (some as three but in its own kernel) that at the moment does not even link...
I still hope at some point to have the possibility to test dynamic parallelism
(so ok, everything can be done at config level)
I can remove the three kernel option, is just historical development at this point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I was just curious. I'm fine with leaving it in, but maybe add a comment explaining ONE_KERNEL
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will cleanup and comment in next iteration.
@VinInn , the throughput actually shows somewhat of a slowdown. |
we can revert to 5 kernels (enough to change one line) |
OOOPSSS a bug. missing braces.... so run this morning in a new area using latest patatrack master
|
ok, need to rebase.... |
034e339
to
d0eec01
Compare
Validation summaryReference release CMSSW_11_0_0_pre11 at 5b0a828 Validation plots/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW
/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW
/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW
Throughput plots/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53logs and
|
After realizing that the number of vertex to split is small I implemented the whole sequence
finder, fitter,split,fitter,ptsort as a single kernel of ONE block.
to make things simple only for the Density clustering.
Other options are still available using configurable (or conditional compilation).
The quadruplet workflow is a couple of percent (one...) faster on T4.