Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infrastructure for using channel-dependent pulse shapes for HB/HE #45995

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

igv4321
Copy link
Contributor

@igv4321 igv4321 commented Sep 12, 2024

PR description:

Adding infrastructure for modeling Hcal front-end pulse shapes channel-by-channel. The pulse shapes were determined by phase scans, as described in https://indico.cern.ch/event/1336438/contributions/5626382/attachments/2733633/4753627/HBHE_Pulse_shape_study.pdf. The new pulse shapes have not yet been turned on, pending final timing adjustment.

PR validation:

The usual matrix text were run. Also, privately, the new pulse shapes were switched on (parameter "useChannelShapes" in HBHEPhase1Reconstructor_cfi.py was set True), and the code appears to produce reasonable results on the 2023 data workflow, as indicated by the subsequent Hcal rechit dump.

@cmsbuild
Copy link
Contributor

cmsbuild commented Sep 12, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45995/41766

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45995/41767

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @igv4321 for master.

It involves the following packages:

  • CalibCalorimetry/HcalAlgos (alca)
  • DataFormats/HcalRecHit (reconstruction)
  • RecoLocalCalo/Configuration (reconstruction)
  • RecoLocalCalo/HcalRecAlgos (reconstruction)
  • RecoLocalCalo/HcalRecProducers (reconstruction)

@cmsbuild, @consuegs, @jfernan2, @mandrenguyen, @perrotta, @saumyaphor4252 can you please review it and eventually sign? Thanks.
@ReyerBand, @abdoulline, @apsallid, @argiro, @bsunanda, @mariadalfonso, @missirol, @mmusich, @rchatter, @rovere, @rsreds, @sameasy, @thomreis, @tocheng, @wang0jin, @youyingli, @yuanchao this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@jfernan2
Copy link
Contributor

please test

# indicating whether we are using it
channelShapesLabel = cms.string("HcalDataShapes"),
useChannelShapes = cms.bool(False),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I'm not a reviewer, I just have a general question on these developments.

What is their target (if there is one) for going in production ? 2025 data-taking ?

Are these changes only needed for the HCAL offline reconstruction, or are they also supposed to be used at HLT ?

Ever since the heterogeneous version of the HCAL local reconstruction was ported to Alpaka (#44910), the plugin HBHEPhase1Reconstructor is not used anymore in the Run-3 HLT menus (in other words, HLT is blind to this PR, as far as I understand).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be useful for 2023 and 2024 re-reco as well as for the future data taking. Unfortunately, at the moment we don't have an alpaka expert in the HCAL DPG, so there is no concrete plan how to get this to work in the HLT. Of course, help from Martin Kwok (or any other alpaka expert) would be appreciated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification.

Notifying @kakwok, @fwyzard and @cms-sw/hlt-l2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say that this

for the future data taking

combined with

there is no concrete plan how to get this to work in the HLT.

sounds to me like the HCAL DPG is not taking full ownership of the HCAL code running in production today in the HLT. I don't know the quantitative impact of these channel-dependent pulse shapes, but I would assume that we (well, HCAL specifically) should make an effort to keep HLT and offline as close as possible to each other.

@cmsbuild
Copy link
Contributor

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0a1618/41506/summary.html
COMMIT: 3c0b14f
CMSSW: CMSSW_14_2_X_2024-09-12-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45995/41506/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 44
  • DQMHistoTests: Total histograms compared: 3331158
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3331135
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 43 files compared)
  • Checked 193 log files, 163 edm output root files, 44 DQM output files
  • TriggerResults: no differences found

label = cms.string('default'),
t0 = cms.uint32(0),
pulse = cms.vdouble(
5.22174e-12, 7.04852e-10, 3.49584e-08, 7.78029e-07, 9.11847e-06,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this large amount of parameters be better stored in the database, or at least in some external cms-data repository? This makes this file ~ unreadable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was discussed with the Hcal calibration group. However, this is the first time we are introducing this channel-by-channel dependence since the beginning of CMS operations, and the pulse shapes are not expected to change more often than once per era, so it was decided to keep the information in a config file. While this file is large (and, of course, computer-generated), its structure is rather simple, as can be seen from the "fillDescriptions" function of the HcalPulseShapesEP.cc code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is not acceptable.

Please move the pulse shapes either to a text file in an external cms-data package, or to a database payload.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please substantiate your statement? Haven't seen anything against this approach in CMSSW guidelines and can't understand why it can be harmful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you seen many other modules with a configuration that takes twelve thousand lines of python code ?

Co-authored-by: Andrea Bocci <fwyzard@gmail.com>
@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45995/41792

@cmsbuild
Copy link
Contributor

Pull request #45995 was updated. @cmsbuild, @consuegs, @jfernan2, @mandrenguyen, @perrotta, @saumyaphor4252 can you please check and sign again.

@fwyzard
Copy link
Contributor

fwyzard commented Sep 14, 2024

assign alca

Please review the misuse of python configurations for detector calibrations, e.g. in RecoLocalCalo/HcalRecAlgos/python/hcalPulseShapesESProd_cfi.py.

@igv4321
Copy link
Contributor Author

igv4321 commented Sep 14, 2024

Perhaps, I should point out that these are not "detector calibrations" in the usual sense of the word. Pulse shapes can change, but the expected frequency is about the same as the frequency with which the detector composition is modified (once per shutdown), or front end electronics and/or firmware are swapped. The "eras" were invented precisely to permit such infrequent modifications in a reasonable way.

@fwyzard
Copy link
Contributor

fwyzard commented Sep 14, 2024

Perhaps, I should point out that these are not "detector calibrations" in the usual sense of the word. Pulse shapes can change, but the expected frequency is about the same as the frequency with which the detector composition is modified (once per shutdown), or front end electronics and/or firmware are swapped. The "eras" were invented precisely to permit such infrequent modifications in a reasonable way.

Perfect, so they can be put in a database object that will not need to be updated frequently.

@igv4321
Copy link
Contributor Author

igv4321 commented Sep 14, 2024

Yes, a database object can be developed, but that requires an extra level of complexity. What are the advantages of it?

@missirol
Copy link
Contributor

What are the advantages of it?

Some examples.

  • Not tying these conditions to a specific CMSSW release and/or HLT menu.

  • (Assuming this would be eventually used at HLT) Not adding O(13k) lines of python code to every HLT menu that needs to be deployed at P5 for a single set of HCAL calibration constants (not to mention that, if this was added in this form in the HLT menu, it would then clog ConfDB, i.e. the db where HLT menus are stored, and it would get copied over hundreds of development menus created by users across CMS in the months ahead).

  • Making it (arguably) easier to maintain and inspect them via standard conditions-db tools.

  • Making it easier to know when they were deployed online.

  • Making it possible to load them in the appropriate IOVs when re-running on old data with newer CMSSW releases and/or newer HLT menus.

@fwyzard
Copy link
Contributor

fwyzard commented Sep 14, 2024

Yes, a database object can be developed, but that requires an extra level of complexity. What are the advantages of it?

That assumes that the python-based approach would stay, and a database approach would be added.
But the python-based approach should not be there at all.

So, a database approach does not require an "extra_ level of complexity", it only requires a different approach.

As to why adding a 1 MB python file with 50'000 parameters should be discouraged:

  • overall size: just like large data files that rarely change should not be put in the main repository (see https://cms-sw.github.io/cms_coding_rules.html#data-files);
  • efficiency: every single cmsRun job will have to parse the same parameters over and over again;
  • memory usage: have you measured how much memory does the resulting ParameterSet take ?
  • unsuitable for online: due to the way they are stored and versioned, HLT configurations are fully expanded; adding so many more parameters to it would be extremely wasteful, and something HLT L2 have already commented against.

To put it simply: this is not a configuration, this is data, and it should be treated accordingly.

@abdoulline
Copy link

abdoulline commented Sep 15, 2024

As a temporary "get though" workaround which was mentioned by Andrea above and which I've suggested ca a year ago -
can we ask Shahzad to put this "file of interest" (bulk/txt part of it) into the external cms-data ? 🤔

/cvmfs/cms.cern.ch/el8_amd64_gcc12/cms/cmssw/CMSSW_14_2_0_pre1/external/el8_amd64_gcc12/data/RecoLocalCalo/HcalRecAlgos/data ( HcalRecAlgos/data doesn't exist yet) ?

@mmusich
Copy link
Contributor

mmusich commented Sep 15, 2024

As a temporary "get though" workaround

Indeed this would avoid some of nastiest consequences of having a 12k lines python configuration fragment to load at runtime as well as making this slightly more maintainable (Marino and Andrea have given reasons enough above why this is really undesirable).
Allow me to remind that:

  • in order for this to be efficient you would need anyway to ship the parameters for execution via an EventSetup object. This requires definition of an ES class / record pair, independently of persisting in the DB;
  • with this "external" approach every "calibration" update is tied to a cmssw version change - with all the operational overhead it implies;
  • other subsystems have developed DB objects for similar purposes in the past;
  • "get-through" solutions generally tend to stick around indefinitely.

@perrotta
Copy link
Contributor

@igv4321 since you are developing this part of the code anew please consider using the condition database option that was suggested, whose advantage has been thoroughly detailed here above.
For a support, you can find the names of the AlCaDB contacts in HCAL from this list: https://twiki.cern.ch/twiki/bin/viewauth/CMS/AlCaDBContacts

@abdoulline
Copy link

abdoulline commented Sep 16, 2024

For a support, you can find the names of the AlCaDB contacts in HCAL from this list: https://twiki.cern.ch/twiki/bin/viewauth/CMS/AlCaDBContacts

I'm afraid HCAL (three) names are obsolete. Yildiray Komurcu to replace them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants