Adding HZZ electron MVA ID #37429

Closed
wants to merge 347 commits into from

Conversation

asculac
Contributor

@asculac asculac commented Apr 1, 2022

PR description:

  • Adding the HZZ training for the electron ID, an EGamma-approved ID (latest talk at the EGamma POG here) used for H->4l, which cannot otherwise be recomputed on top of nanoAODs.
  • It is based on the same electronMVAValueMapProducer used for the existing mvaIds, but with an updated, separate training for each of 2016UL, 2017UL and 2018UL.
  • A single variable, mvaHZZIdIso, is added to electrons. The appropriate training is selected automatically. The idea is that the variable name will stay the same in the future, with a new training selected when appropriate.
  • For this reason the latest training (2018UL) is used as the default, but once new trainings become available for Run3 this should be updated accordingly.
  • There is no need to store working points as bools since these can easily be derived from the MVA value.
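
To illustrate the last point, a working-point flag can be recomputed at analysis level by comparing the stored MVA value with a threshold. The sketch below is plain Python and makes several assumptions: the two-by-two category split, the category names, and every threshold number are placeholders invented for the example. They are not the EGamma-approved cuts, which depend on the training and on the electron category.

```python
# Hypothetical thresholds keyed by (pt_category, eta_category).
# All numbers are placeholders for illustration, not approved cuts.
HYPOTHETICAL_WP_CUTS = {
    ("low_pt", "barrel"): 0.90,
    ("low_pt", "endcap"): 0.85,
    ("high_pt", "barrel"): 0.30,
    ("high_pt", "endcap"): -0.20,
}


def electron_category(pt, abs_sc_eta):
    """Return an illustrative (pt, eta) category for an electron."""
    pt_cat = "low_pt" if pt < 10.0 else "high_pt"
    eta_cat = "barrel" if abs_sc_eta < 1.479 else "endcap"
    return pt_cat, eta_cat


def passes_wp(mva_value, pt, abs_sc_eta, cuts=HYPOTHETICAL_WP_CUTS):
    """Recompute a working-point decision from the stored MVA value."""
    return mva_value > cuts[electron_category(pt, abs_sc_eta)]
```

With a table like this, only the float has to be stored per electron; the boolean is a one-line comparison in the analysis code.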

PR validation:

  • Tested in CMSSW_12_4_0_pre1; verified that the new variable is added correctly.
  • Adds 1 float per electron.

@namapane pls follow

@cmsbuild
Contributor

cmsbuild commented Apr 1, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37429/29122

  • This PR adds an extra 24KB to repository

@cmsbuild
Contributor

cmsbuild commented Apr 1, 2022

A new Pull Request was created by @asculac (Ana Sculac) for master.

It involves the following packages:

  • PhysicsTools/NanoAOD (xpog)

@cmsbuild, @mariadalfonso, @gouskos, @fgolf can you please review it and eventually sign? Thanks.
@gpetruc, @swertz this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mariadalfonso
Contributor

mariadalfonso commented Apr 1, 2022

please test

@cmsbuild
Contributor

cmsbuild commented Apr 1, 2022

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9394e6/23593/summary.html
COMMIT: b429233
CMSSW: CMSSW_12_4_X_2022-03-31-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37429/23593/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9394e6/23593/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9394e6/23593/git-merge-result

RelVals

----- Begin Fatal Exception 01-Apr-2022 12:10:18 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 2 stream: 0
   [1] Running path 'dqmofflineOnPAT_2_step'
   [2] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
   [3] Prefetching for module PATElectronSlimmer/'slimmedElectrons'
   [4] Prefetching for module PATElectronSelector/'selectedPatElectrons'
   [5] Calling method for module PATElectronProducer/'patElectrons'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::ValueMap<edm::Ptr<pat::UserData> >
Looking for module label: egmGsfElectronIDs
Looking for productInstanceName: mvaEleID-Summer16UL-ID-ISO-HZZ

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 01-Apr-2022 12:10:19 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 2 stream: 0
   [1] Running path 'dqmofflineOnPAT_2_step'
   [2] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
   [3] Prefetching for module PATElectronSlimmer/'slimmedElectrons'
   [4] Prefetching for module PATElectronSelector/'selectedPatElectrons'
   [5] Calling method for module PATElectronProducer/'patElectrons'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::ValueMap<edm::Ptr<pat::UserData> >
Looking for module label: egmGsfElectronIDs
Looking for productInstanceName: mvaEleID-Summer16UL-ID-ISO-HZZ

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------
----- Begin Fatal Exception 01-Apr-2022 12:10:21 CEST-----------------------
An exception of category 'ProductNotFound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 2 stream: 0
   [1] Running path 'dqmofflineOnPAT_2_step'
   [2] Prefetching for module SingleTopTChannelLeptonDQM_miniAOD/'singleTopElectronMediumDQM_miniAOD'
   [3] Prefetching for module PATElectronSlimmer/'slimmedElectrons'
   [4] Prefetching for module PATElectronSelector/'selectedPatElectrons'
   [5] Calling method for module PATElectronProducer/'patElectrons'
Exception Message:
Principal::getByToken: Found zero products matching all criteria
Looking for type: edm::ValueMap<edm::Ptr<pat::UserData> >
Looking for module label: egmGsfElectronIDs
Looking for productInstanceName: mvaEleID-Summer16UL-ID-ISO-HZZ

   Additional Info:
      [a] If you wish to continue processing events after a ProductNotFound exception,
add "SkipEvent = cms.untracked.vstring('ProductNotFound')" to the "options" PSet in the configuration.

----- End Fatal Exception -------------------------------------------------

RelVals-INPUT

  • 136.72412: 136.72412_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM/step2_RunJetHT2016B_reminiaodUL+RunJetHT2016B_reminiaodUL+REMININANO_data2016UL_HIPM+HARVESTDR2_REMININANO_data2016UL_HIPM.log

@mariadalfonso
Contributor

mariadalfonso commented Apr 4, 2022

@asculac

The failure is in the Run3 2021 scenarios 11634.914, 11634.0, 11634.7, 11634.911, 12434.0, 11834.0.
You need to check why your ID modules are not re-run when calling this:
https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/python/nano_cff.py#L224

Comment on lines 312 to 320
run2_egamma_2016.toModify(slimmedElectronsWithUserData.userFloats,
    mvaHZZIdIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer16ULIdIsoValues"),
)
run2_egamma_2017.toModify(slimmedElectronsWithUserData.userFloats,
    mvaHZZIdIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer17ULIdIsoValues"),
)
run2_egamma_2018.toModify(slimmedElectronsWithUserData.userFloats,
    mvaHZZIdIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Summer18ULIdIsoValues"),
)
Contributor

These modifiers also apply to the pre-UL case.

Comment on lines 200 to 201
mvaFall17V2Iso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Fall17IsoV2Values"),
mvaFall17V2noIso = cms.InputTag("electronMVAValueMapProducer:ElectronMVAEstimatorRun2Fall17NoIsoV2Values"),
Contributor

We should rename these IDs to be generic.
In Run3 we will soon move away from the "Fall17V2" training.
So in the nano we should have one variable with a generic name, which will be dynamically filled with the right ID for Run2 ("Fall17V2" training) and Run3 (the one to come):
mvaFall17V2Iso --> mvaIso
mvaFall17V2noIso --> mvanoIso
The docs will point to the right one.

At the analysis level this will help the combined analysis of the Run2 and Run3 datasets, which is our main goal for the next Run2 nano production.
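
For illustration only, the idea of one generic branch name filled from an era-dependent source can be sketched as a lookup table in plain Python. The Run2 labels are the ones quoted in this thread; the Run3 entries are left empty because that training does not exist yet, and `MVA_SOURCES` and `resolve_source` are hypothetical names, not part of NanoAOD.

```python
# Generic branch name -> value-map label, per era.
# The Run3 entries are placeholders until a training is available.
MVA_SOURCES = {
    "Run2": {
        "mvaIso": "electronMVAValueMapProducer:ElectronMVAEstimatorRun2Fall17IsoV2Values",
        "mvanoIso": "electronMVAValueMapProducer:ElectronMVAEstimatorRun2Fall17NoIsoV2Values",
    },
    "Run3": {
        "mvaIso": None,
        "mvanoIso": None,
    },
}


def resolve_source(era, branch, sources=MVA_SOURCES):
    """Return the value-map label that fills `branch` in the given era."""
    label = sources[era][branch]
    if label is None:
        raise LookupError(f"no training registered yet for {branch} in {era}")
    return label
```

The analysis code always reads `mvaIso`; only the table entry changes when a new training is adopted.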

@lfinco @swagata87 @rgoldouz

Contributor

Hi @mariadalfonso ,
yes I agree.
@asculac you can implement it in this PR. Or, if you feel this is out of scope for this PR, one of us can do it in a follow-up PR. But indeed, the naming of the MVA EGamma ID branches could be improved. (The cut-based ID branch names already look good.)

Contributor

It looks like a very good idea; I would then suggest agreeing on consistent naming:
mvaIso
mvanoIso (or better mvaNoIso)?
mvaHZZIso (instead of mvaHZZIdIsoValue)

Contributor Author

@swagata87 sure, I will add it to this PR, it is not a problem.
@mariadalfonso thank you for pointing everything out, I will check it and fix it.

@mariadalfonso
Contributor

@asculac
any update ?

@asculac
Contributor Author

asculac commented May 6, 2022

@asculac any update ?

Hi @mariadalfonso, I apologise for the delay. Unfortunately I have not had time to look in detail at why our modules are not run when calling the VID producer. I will try to solve the problem next week to avoid delaying this PR further.

@smuzaffar smuzaffar modified the milestones: CMSSW_12_4_X, CMSSW_12_5_X May 16, 2022
@mariadalfonso
Contributor

@asculac any update ?

@asculac
Contributor Author

asculac commented May 23, 2022

@asculac any update ?

Yes, our ID is not embedded in the PAT object and this is causing the error. I am currently working on the implementation.

@swagata87
Contributor

@asculac
it looks like some changes might be needed in PhysicsTools/PatAlgos/python/slimming/miniAOD_tools.py
could you add your IDs under electron_ids here?

electron_ids = ['RecoEgamma.ElectronIdentification.Identification.heepElectronID_HEEPV70_cff',
'RecoEgamma.ElectronIdentification.Identification.heepElectronID_HEEPV71_cff',
'RecoEgamma.ElectronIdentification.Identification.cutBasedElectronID_Fall17_94X_V1_cff',
'RecoEgamma.ElectronIdentification.Identification.cutBasedElectronID_Fall17_94X_V2_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Fall17_noIso_V1_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Fall17_iso_V1_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Fall17_noIso_V2_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Fall17_iso_V2_cff',
'RecoEgamma.ElectronIdentification.Identification.cutBasedElectronID_Summer16_80X_V1_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Spring16_GeneralPurpose_V1_cff',
'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Spring16_HZZ_V1_cff',
]

i.e., add the following:

                    'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Summer16UL_ID_ISO_cff',
                    'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Summer17UL_ID_ISO_cff',
                    'RecoEgamma.ElectronIdentification.Identification.mvaElectronID_Summer18UL_ID_ISO_cff'

After that we can test your PR again to check if the issue is solved
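
As a rough sketch of that change, the three module paths can be appended to the list while guarding against duplicates. This is plain Python, not the actual edit to miniAOD_tools.py; `add_ids` is a hypothetical helper and `existing` is a shortened stand-in for the full `electron_ids` list quoted above.

```python
PREFIX = "RecoEgamma.ElectronIdentification.Identification."


def add_ids(electron_ids, new_ids):
    """Return a new list: electron_ids followed by any new_ids not already present."""
    merged = list(electron_ids)
    for module in new_ids:
        if module not in merged:
            merged.append(module)
    return merged


# Shortened stand-in for the existing list.
existing = [
    PREFIX + "mvaElectronID_Fall17_iso_V2_cff",
    PREFIX + "mvaElectronID_Spring16_HZZ_V1_cff",
]

# The three UL trainings named in this comment.
hzz_ul = [
    PREFIX + "mvaElectronID_Summer16UL_ID_ISO_cff",
    PREFIX + "mvaElectronID_Summer17UL_ID_ISO_cff",
    PREFIX + "mvaElectronID_Summer18UL_ID_ISO_cff",
]

updated = add_ids(existing, hzz_ul)
```

Applying `add_ids` a second time leaves the list unchanged, so the step is safe to repeat.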

@swagata87
Contributor

The failure is in the Run3 2021 scenarios 11634.914, 11634.0, 11634.7, 11634.911, 12434.0, 11834.0. You need to check why your ID modules are not re-run when calling this: https://github.com/cms-sw/cmssw/blob/master/PhysicsTools/NanoAOD/python/nano_cff.py#L224

Run3-specific failures ring a bell..
Ana, you might need to also add your IDs in the following 2 places inside PhysicsTools/NanoAOD/python/electrons_cff.py;

here:

(run3_nanoAOD_devel).toModify(slimmedElectronsWithUserData.userFloats,
mvaFall17V1Iso = None,
mvaFall17V1noIso = None,
mvaFall17V2Iso = None,
mvaFall17V2noIso = None,

and just in the next block, i.e. here:

(run3_nanoAOD_devel).toModify(electronTable.variables,
mvaFall17V2Iso = None,
mvaFall17V2Iso_WP80 = None,
mvaFall17V2Iso_WP90 = None,
mvaFall17V2Iso_WPL = None,
mvaFall17V2noIso = None,
mvaFall17V2noIso_WP80 = None,
mvaFall17V2noIso_WP90 = None,
mvaFall17V2noIso_WPL = None,
vidNestedWPBitmapHEEP = None,
vidNestedWPBitmap = None,
cutBased = None,
cutBased_HEEP = None,
)

does this make sense?
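
For readers less familiar with the convention used in these blocks: in `toModify`, assigning `None` to a parameter removes it. Below is a plain-Python emulation of that behaviour on a dictionary. It does not use the CMSSW classes, `apply_modifications` is a hypothetical helper, and the dictionary keys and values are stand-ins. How CMSSW treats the removal of a parameter that is not present is not modelled here.

```python
def apply_modifications(params, **changes):
    """Return a copy of params where None removes a key and any other value sets it."""
    result = dict(params)
    for name, value in changes.items():
        if value is None:
            # Emulates "parameter = None" meaning "drop this parameter".
            result.pop(name, None)
        else:
            result[name] = value
    return result


# Stand-in for slimmedElectronsWithUserData.userFloats.
user_floats = {
    "mvaFall17V2Iso": "label_a",
    "mvaFall17V2noIso": "label_b",
    "mvaHZZIdIso": "label_c",
    "someOtherFloat": "label_d",
}

# Dropping the Run2-trained IDs, including the new HZZ one.
run3_floats = apply_modifications(
    user_floats,
    mvaFall17V2Iso=None,
    mvaFall17V2noIso=None,
    mvaHZZIdIso=None,
)
```

After the call only `someOtherFloat` remains in `run3_floats`, while `user_floats` is left untouched.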

@swagata87
Contributor

type egamma

@cmsbuild
Contributor

cmsbuild commented Jun 8, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37429/30437

@jpata
Contributor

jpata commented Jun 8, 2022

it looks like the rebase was not correct - a lot of unrelated commits were included.
just in case: http://cms-sw.github.io/tutorial-resolve-conflicts.html

@tvami
Contributor

tvami commented Jun 8, 2022

-1

  • needs to rebase

@asculac
Contributor Author

asculac commented Jun 9, 2022

@mariadalfonso Is it okay if I close this PR and open a clean one with the same implementation? That would avoid me messing up the rebase again.

@mariadalfonso
Contributor

@mariadalfonso Is it okay if I close this PR and open a clean one with the same implementation? That would avoid me messing up the rebase again.

ok, fine with me.
[In principle you can also create a new local branch and then push it to the remote asculac:mvaHZZ_unique]

@missirol
Contributor

@asculac

This PR is creating spurious warnings in the tests of unrelated PRs (example).

Please rebase it, or close it. Thanks!

@asculac
Contributor Author

asculac commented Jun 13, 2022

@asculac

This PR is creating spurious warnings in the tests of unrelated PRs (example).

Please rebase it, or close it. Thanks!

Closing this PR and following up with a clean one
