Introduce Unified Particle Transformer AK4 jet tagger #44641

Merged: 13 commits, Apr 16, 2024

AlexDeMoor
Contributor

This PR introduces UnifiedParticleTransformerAK4, a novel inclusive jet tagger. The network performs inclusive tagging, combining b/c/tau/lepton tagging with jet energy regression (both the regression and the resolution estimation via quantile regression). The model is trained with a dedicated robust training scheme that combines improved adversarial training with domain adaptation, to reduce the impact of data/MC disagreement on the final performance. The output nodes of the domains are also kept, so that efficiency mapping and its impact can be explored later.
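For readers unfamiliar with quantile regression, here is a minimal sketch of the pinball loss that is commonly used to train quantile outputs. It is not taken from this PR or from the model's training code: the function names are hypothetical, and the choice of the 16% and 84% quantiles for the resolution estimate is an assumption made only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pinball (quantile) loss for a single prediction.
// tau is the target quantile in (0, 1). Under-predictions are weighted by tau
// and over-predictions by (1 - tau), so the expected loss is minimised when
// `predicted` equals the tau-quantile of the target distribution.
double pinballLoss(double target, double predicted, double tau) {
  const double diff = target - predicted;
  return diff >= 0.0 ? tau * diff : (tau - 1.0) * diff;
}

// Mean pinball loss over a batch of (target, prediction) pairs.
double meanPinballLoss(const std::vector<double>& targets,
                       const std::vector<double>& predictions,
                       double tau) {
  assert(targets.size() == predictions.size() && !targets.empty());
  double sum = 0.0;
  for (std::size_t i = 0; i < targets.size(); ++i) {
    sum += pinballLoss(targets[i], predictions[i], tau);
  }
  return sum / static_cast<double>(targets.size());
}

// ASSUMPTION (illustrative only): resolution taken as half the width of the
// central interval between two predicted quantiles, e.g. 16% and 84%.
double resolutionFromQuantiles(double lowerQuantile, double upperQuantile) {
  return 0.5 * (upperQuantile - lowerQuantile);
}
```

With tau = 0.84, for example, under-predicting the target by one unit costs 0.84 while over-predicting by one unit costs only 0.16, which pushes the output towards the upper quantile.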

An overview of the method can be seen at: https://indico.cern.ch/event/1368069/contributions/5793148/
A focus on the novel adversarial training is described here: https://indico.cern.ch/event/1372038/#3-adversarial-training-for-par
The preliminary results of the model were shown in the following meeting: https://indico.cern.ch/event/1397392/#17-preliminary-results-of-part
The final results will be shared this Monday: https://indico.cern.ch/event/1403350/#3-part-2024-final-results-and

This PR requires the associated ONNX model, which has been submitted to the corresponding RecoBTag-Combined repo: cms-data/RecoBTag-Combined#57

For your information: a final training is ongoing to try to improve the current performance with an enriched dataset, so the final model may still change. This will only affect the RecoBTag-Combined PR, not this one.

This pull request targets the master branch but will be backported for the 2024 data-taking. If anyone can specify which release I should backport to, that information is welcome.

@cmsbuild
Contributor

cmsbuild commented Apr 6, 2024

cms-bot internal usage

@cmsbuild
Contributor

cmsbuild commented Apr 6, 2024

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44641/39837

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Apr 6, 2024

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44641/39839

@cmsbuild
Contributor

cmsbuild commented Apr 6, 2024

A new Pull Request was created by @AlexDeMoor for master.

It involves the following packages:

  • DataFormats/BTauReco (reconstruction)
  • PhysicsTools/NanoAOD (xpog)
  • PhysicsTools/PatAlgos (reconstruction, xpog)
  • RecoBTag/Configuration (reconstruction)
  • RecoBTag/FeatureTools (reconstruction)
  • RecoBTag/ONNXRuntime (reconstruction)

@hqucms, @vlimant, @cmsbuild, @jfernan2, @mandrenguyen can you please review it and eventually sign? Thanks.
@gpetruc, @mbluj, @andrzejnovak, @demuller, @AlexDeMoor, @emilbols, @mmarionncern, @hatakeyamak, @azotz, @jdamgov, @JyothsnaKomaragiri, @jdolen, @nhanvtran, @hqucms, @rappoccio, @rovere, @gouskos, @ahinzmann, @Ming-Yan, @mariadalfonso, @schoef, @AnnikaStein, @seemasharmafnal, @missirol, @gkasieczka, @Senphy this is something you requested to watch as well.
@rappoccio, @sextonkennedy, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@mandrenguyen
Contributor

please test with cms-data/RecoBTag-Combined#57

@cmsbuild
Contributor

cmsbuild commented Apr 6, 2024

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-157120/38649/summary.html
COMMIT: 8de2908
CMSSW: CMSSW_14_1_X_2024-04-05-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44641/38649/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test runtestPhysicsToolsPatAlgos had ERRORS

Comparison Summary

Summary:

  • You potentially removed 111 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 629 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3307717
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3307691
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

desc.add<edm::InputTag>("puppi_value_map", edm::InputTag("puppi"));
desc.add<edm::InputTag>("secondary_vertices", edm::InputTag("inclusiveCandidateSecondaryVertices"));
desc.add<edm::InputTag>("jets", edm::InputTag("ak4PFJetsCHS"));
desc.addUntracked<edm::InputTag>("unsubjet_map", {});
Contributor

Suggested change
- desc.addUntracked<edm::InputTag>("unsubjet_map", {});
+ desc.add<edm::InputTag>("unsubjet_map", {});

I think it should be "tracked", see #44591.

Contributor Author

Thanks I will adjust it 👍

@hqucms
Contributor

hqucms commented Apr 16, 2024

Again, the failed unit test runtestPhysicsToolsPatAlgos also fails in the plain IB and is not related to this PR.

@antoniovilela
Contributor

ignore tests-rejected with ib-failure

@antoniovilela
Contributor

+1

@cmsbuild cmsbuild merged commit a8c349a into cms-sw:master Apr 16, 2024
13 of 14 checks passed
@hqucms
Contributor

hqucms commented Apr 16, 2024

@antoniovilela I think we need to merge cms-data/RecoBTag-Combined#57 which is required by this PR.

@mandrenguyen
Contributor

> @antoniovilela I think we need to merge cms-data/RecoBTag-Combined#57 which is required by this PR.

The model was updated after this PR was tested. Shouldn't we relaunch the tests?

@hqucms
Contributor

hqucms commented Apr 16, 2024

> > @antoniovilela I think we need to merge cms-data/RecoBTag-Combined#57 which is required by this PR.
>
> The model was updated after this PR was tested. Shouldn't we relaunch the tests?

@mandrenguyen Nice catch -- indeed we should re-run the tests.

BTW we were informed by the developers that they would probably update the model with some minor fixes. Maybe @AlexDeMoor could just add some more info here to bring everyone up to date?

@hqucms
Contributor

hqucms commented Apr 16, 2024

please test

@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-157120/38868/summary.html
COMMIT: 26a35d1
CMSSW: CMSSW_14_1_X_2024-04-15-2300/el8_amd64_gcc12
Additional Tests: NANO
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44641/38868/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-157120/38868/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-157120/38868/git-merge-result

Unit Tests

I found 1 errors in the following unit tests:

---> test runtestPhysicsToolsPatAlgos had ERRORS

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 842 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3319227
  • DQMHistoTests: Total failures: 56
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3319151
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 15.244999999999997 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 11634.0,... ): 1.229 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 13234.0,... ): 0.738 KiB Physics/NanoAODDQM
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

NANO Comparison Summary

Summary:

  • You potentially added 52 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 59 differences found in the comparisons
  • DQMHistoTests: Total files compared: 15
  • DQMHistoTests: Total histograms compared: 16456
  • DQMHistoTests: Total failures: 35
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 16421
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 13.768999999999998 KiB( 14 files compared)
  • DQMHistoSizes: changed ( 2500.001,... ): 1.229 KiB Physics/NanoAODDQM
  • DQMHistoSizes: changed ( 2500.011,... ): 0.738 KiB Physics/NanoAODDQM
  • Checked 55 log files, 32 edm output root files, 15 DQM output files

Nano size comparison Summary:

| Sample | kb/ev | ref kb/ev | diff kb/ev | ev/s/thd | ref ev/s/thd | diff rate | mem/thd | ref mem/thd |
|---|---|---|---|---|---|---|---|---|
| 2500.0 | 2.774 | 2.695 | 0.079 (+2.9%) | 3.43 | 4.31 | -20.4% | 2.118 | 2.216 |
| 2500.001 | 2.887 | 2.805 | 0.082 (+2.9%) | 3.10 | 3.85 | -19.6% | 2.132 | 2.240 |
| 2500.002 | 2.834 | 2.753 | 0.081 (+2.9%) | 3.22 | 3.98 | -19.0% | 2.146 | 2.260 |
| 2500.01 | 1.440 | 1.386 | 0.054 (+3.9%) | 5.82 | 7.25 | -19.7% | 1.982 | 2.035 |
| 2500.011 | 1.898 | 1.823 | 0.074 (+4.1%) | 3.23 | 4.30 | -24.8% | 2.120 | 2.090 |
| 2500.012 | 1.754 | 1.687 | 0.067 (+3.9%) | 4.62 | 6.00 | -23.1% | 1.979 | 2.072 |
| 2500.1 | 2.345 | 2.339 | 0.006 (+0.3%) | 4.40 | 4.40 | +0.1% | 2.052 | 2.044 |
| 2500.2 | 2.449 | 2.443 | 0.005 (+0.2%) | 5.02 | 4.95 | +1.5% | 1.917 | 1.939 |
| 2500.21 | 1.279 | 1.274 | 0.005 (+0.4%) | 3.42 | 3.44 | -0.7% | 1.850 | 1.967 |
| 2500.211 | 1.659 | 1.653 | 0.006 (+0.3%) | 2.99 | 3.00 | -0.4% | 1.944 | 2.061 |
| 2500.3 | 2.219 | 2.214 | 0.005 (+0.2%) | 9.10 | 9.00 | +1.2% | 1.944 | 1.944 |
| 2500.301 | 2.822 | 2.815 | 0.008 (+0.3%) | 7.82 | 7.80 | +0.3% | 1.957 | 1.960 |
| 2500.31 | 7.164 | 7.164 | 0.000 (+0.0%) | 1.39 | 1.38 | +0.6% | 1.704 | 1.702 |
| 2500.311 | 1.568 | 1.568 | 0.000 (+0.0%) | 6.27 | 6.85 | -8.4% | 1.056 | 1.051 |
| 2500.312 | 540.457 | 540.457 | 0.000 (+0.0%) | 0.50 | 0.50 | +1.5% | 1.596 | 1.596 |
| 2500.313 | 817.694 | 817.694 | 0.000 (+0.0%) | 0.70 | 0.69 | +1.3% | 1.613 | 1.579 |
| 2500.32 | 1.341 | 1.354 | -0.013 (-0.9%) | 11.96 | 12.01 | -0.4% | 1.739 | 2.004 |
| 2500.321 | 1.748 | 1.761 | -0.013 (-0.7%) | 7.98 | 8.26 | -3.4% | 2.169 | 2.437 |
| 2500.322 | 1.240 | 1.240 | 0.000 (+0.0%) | 8.72 | 8.54 | +2.1% | 2.211 | 1.752 |
| 2500.323 | 7.772 | 7.772 | 0.000 (+0.0%) | 3.21 | 3.19 | +0.6% | 1.930 | 1.933 |
| 2500.324 | 1.870 | 1.882 | -0.013 (-0.7%) | 8.53 | 8.31 | +2.7% | 1.764 | 1.766 |
| 2500.325 | 4.156 | 4.291 | -0.135 (-3.1%) | 4.08 | 3.82 | +6.7% | 1.774 | 1.723 |
| 2500.326 | 3.309 | 3.205 | 0.104 (+3.2%) | 1.57 | 1.65 | -4.7% | 1.809 | 1.842 |
| 2500.327 | 1.804 | 1.816 | -0.013 (-0.7%) | 8.51 | 8.68 | -1.9% | 2.149 | 2.343 |
| 2500.4 | 2.363 | 2.388 | -0.025 (-1.0%) | 8.63 | 8.72 | -1.1% | 1.798 | 1.818 |
| 2500.401 | 1.891 | 1.891 | 0.000 (+0.0%) | 7.39 | 7.69 | -4.0% | 1.690 | 1.699 |
| 2500.402 | 2.937 | 2.962 | -0.025 (-0.8%) | 7.29 | 7.50 | -2.9% | 1.735 | 1.906 |
| 2500.403 | 8.687 | 8.918 | -0.231 (-2.6%) | 2.78 | 2.58 | +7.7% | 1.757 | 1.927 |
| 2500.404 | 5.438 | 5.272 | 0.167 (+3.2%) | 1.16 | 1.26 | -7.7% | 1.826 | 1.728 |
| 2500.405 | 2.847 | 2.872 | -0.025 (-0.9%) | 7.46 | 7.47 | -0.2% | 1.768 | 1.861 |
| 2500.5 | 5.194 | 5.194 | 0.000 (+0.0%) | 15.22 | 15.13 | +0.6% | 1.559 | 1.502 |
| 2500.51 | 9.120 | 9.120 | 0.000 (+0.0%) | 9.37 | 9.38 | -0.1% | 1.515 | 1.513 |
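
The percentage columns in the size comparison appear to be relative changes with respect to the reference values. The sketch below assumes that definition; `relativeChangePercent` is a hypothetical helper for illustration, not a cms-bot function. Since the table prints rounded inputs, recomputed values can differ from the printed ones in the last digit.

```cpp
#include <cassert>
#include <cmath>

// Relative change of `value` with respect to `reference`, in percent.
// ASSUMPTION: this is how the "diff" percentages are defined; the bot's
// actual computation is not shown in the report.
double relativeChangePercent(double value, double reference) {
  assert(reference != 0.0);
  return 100.0 * (value - reference) / reference;
}
```

For workflow 2500.0, `relativeChangePercent(3.43, 4.31)` gives about -20.4 for the event rate, and `relativeChangePercent(2.774, 2.695)` gives about +2.9 for the event size, matching the printed values.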

@antoniovilela
Contributor

Sorry, I lost track of the external.

@mandrenguyen
Contributor

mandrenguyen commented Apr 19, 2024

type btv, jetmet, tau
