Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random DQM differences in PR tests #31644

Closed
makortel opened this issue Oct 1, 2020 · 43 comments
Closed

Random DQM differences in PR tests #31644

makortel opened this issue Oct 1, 2020 · 43 comments

Comments

@makortel
Copy link
Contributor

makortel commented Oct 1, 2020

I've seen tiny changes in

  • workflow 158.01 ParticleFlow/JetResponse/slimmedJetsPuppi/JEC/preso_eta30
  • workflow 23234.0 SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • workflow 28234.0 SiOuterTrackerV/Tracks/FinalResolution/VtxZResolution

in the tests of PRs #31546 and #31642 that should not have any impact on how those histograms are filled.

@makortel
Copy link
Contributor Author

makortel commented Oct 1, 2020

assign dqm

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2020

New categories assigned: dqm

@jfernan2,@andrius-k,@fioriNTU,@kmaeshima,@ErnestaP you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2020

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@jfernan2
Copy link
Contributor

jfernan2 commented Oct 2, 2020

Hi @makortel I don't see the differences in the exact PR numbers you quote but I have seen them randomly around for other PRs I have reviewed, specially the SiOuterTracker differences.

I have ignored them since they seem to me our famous stat box inconsistency: the top most stat box is from ROOT, and shows a rounded value, the other ones are from the GUI and show the true value from the data. So, it looks to me like a precision problem, indeed if you plot on side instead of overlay, you get the same histo, e.g.:
https://tinyurl.com/yyyfbjfa
https://tinyurl.com/y55xw9ky

And also because RelMon does NOT spot the difference, just the Bin By Bin tool e.g.:
#31639 (comment)
#31309 (comment)

@makortel
Copy link
Contributor Author

makortel commented Oct 2, 2020

In the PRs I quoted these differences are shown in #31546 (comment) and #31642 (comment), and I do see them flagged by RelMon. I agree it looks some kind of precision issue.

So, it looks to me like a precision problem, indeed if you plot on side instead of overlay, you get the same histo, e.g.:
https://tinyurl.com/yyyfbjfa

The stat box indeed looks the same, but the error bars look different, so maybe something tiny is indeed different between the two?

Anyway, I consider this issue more of documenting "known" failures than something needing immediate fix (even if latter would be nice).

@makortel
Copy link
Contributor Author

makortel commented Oct 2, 2020

Here is one more #29553 (comment)

@jfernan2
Copy link
Contributor

jfernan2 commented Oct 2, 2020

In the PRs I quoted these differences are shown in #31546 (comment) and #31642 (comment),
OK, I had looked to the last Jenkins tests, where indeed they disappeared

@makortel
Copy link
Contributor Author

Here is one more (SiOuterTrackerV only) #31715 (comment)

@makortel
Copy link
Contributor Author

#31815 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}

@makortel
Copy link
Contributor Author

makortel commented Oct 19, 2020

#31858 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}

@kpedro88
Copy link
Contributor

The same occurs in #31861 (comment)

@mmusich
Copy link
Contributor

mmusich commented Oct 20, 2020

@emacdonald16, FYI

@emacdonald16
Copy link
Contributor

Weird. I'll look into it. It does look like a precision issue...

@makortel
Copy link
Contributor Author

#31930 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}

@makortel
Copy link
Contributor Author

makortel commented Nov 5, 2020

cms-sw/cms-bot#1409 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/VtxZResolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/PhiResolution

@makortel
Copy link
Contributor Author

makortel commented Nov 24, 2020

#32217 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32244 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32184 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,PhiResolution}
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

makortel commented Dec 1, 2020

#29553 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32487 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32499 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32510 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32622 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

makortel commented Feb 2, 2021

#32782 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution, VtxZResolution}
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

#32947 (comment) shows tiny differences in

  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution, VtxZResolution}
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/{d0Resolution,VtxZResolution}

@makortel
Copy link
Contributor Author

makortel commented Mar 2, 2021

#33038 (comment) shows tiny differences in

  • 312.0: L1TEMU/L1TdeStage2uGMT/data_vs_emulator_comparison/{mismatchRatio,errorSummaryNum}
  • 23234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution
  • 28234.0: SiOuterTrackerV/Tracks/FinalResolution/d0Resolution

@jfernan2
Copy link
Contributor

jfernan2 commented Mar 5, 2021

@benkrikler @dinyar could you lease have a look at the L1TEMU/L1TdeStage2uGMT/data_vs_emulator_comparison/{mismatchRatio,errorSummaryNum} diffeences? Thanks

@dinyar
Copy link
Contributor

dinyar commented Mar 5, 2021

Hi @jfernan2,

I'm not sure what is being compared (i.e. what the base comparison is), but my recent PR fixed the incorrect conversion from the hardware code of the unconstrained pT to the physical value, so the unconstrained pT is expected to have changed now.

Cheers,
Dinyar

@makortel
Copy link
Contributor Author

makortel commented Mar 5, 2021

@dinyar In #33038 (comment) the comparison was IB+PR vs IB in CMSSW_11_3_X_2021-03-02-1100. As far as I can tell no other PRs were included on top of the IB in that test.

@dinyar
Copy link
Contributor

dinyar commented Mar 5, 2021

@makortel ok, thanks for the information. I'm a bit mystified by this to be honest I'll try to reproduce locally.

@makortel
Copy link
Contributor Author

makortel commented Mar 9, 2021

#33106 (comment) shows tiny differences in

  • 11634.0: L1TEMU/L1TdeStage2uGMT/data_vs_emulator_comparison/{mismatchRatio,errorSummaryNum}

@dinyar
Copy link
Contributor

dinyar commented Mar 9, 2021

Hi again,

I ran locally but couldn't reproduce the difference between the commits in question. I did notice that the mismatches are between EMTF muons where in the unpacked uGMT output we see unconstrained pT > 0, however in the emulated uGMT output their upt == 0. I suspect the reason for the data-emu mismatches is that the RegionalMuon packer/unpacker doesn't expect the EMTF to have displaced quantities yet, in this case I have to fix that. @eyigitba should confirm though whether we're even supposed to see displaced quantities yet.

As to why we see a difference between base and base+PR in #33038 and #33106 I'm a bit lost. As mentioned above I can't reproduce it locally and it seems to go both ways: in #33038 the mismatches appeared after applying the patch, in #33106 they seem to have gone away again (but in a different workflow.. ). Can there be other moving pieces in the background like the PU sample or something like it?

Cheers,
Dinyar

@eyigitba
Copy link
Contributor

eyigitba commented Mar 9, 2021

Hi @dinyar , we don't have the displaced muons in the firmware right now. So you shouldn't see any unpacked displaced quantity from EMTF. I don't know why the unpacked uGMT output has non-zero unconstrained pT for EMTF muons.

@dinyar
Copy link
Contributor

dinyar commented Mar 9, 2021

Right, sorry I wasn't clear. Here "unpacked" still refers to MC, so if the EMTF emulator creates displaced information that would qualify if I'm not forgetting something.

@eyigitba
Copy link
Contributor

eyigitba commented Mar 9, 2021

I see ok, my bad. I thought on the emulator side you didn't need any changes to uGMT, but maybe on the unpacker side you still need to (although we don't have a packer for the emulator so I don't know if this applies here). Anyway I don't know what the difference would be between these commits. We didn't change anything on EMTF side.

@dinyar
Copy link
Contributor

dinyar commented Mar 9, 2021

Ok, yeah I thought so. I now noticed there was some update/change of TensorFlow a few commits ago which to my knowledge you use.. could this change the upt values in some way?

@eyigitba
Copy link
Contributor

eyigitba commented Mar 9, 2021

Maybe it can. I don't really know. I need to check on my end.

@eyigitba
Copy link
Contributor

eyigitba commented Mar 9, 2021

Actually can you point me to the commit about TensorFlow?

@dinyar
Copy link
Contributor

dinyar commented Mar 9, 2021

Sure, I saw mention of it in 502d254.

@eyigitba
Copy link
Contributor

eyigitba commented Mar 9, 2021

I don't think this should change anything. I didn't do any checks in the last month or so though. I'll confirm soon if everything works as expected.

@makortel
Copy link
Contributor Author

makortel commented Mar 9, 2021

@dinyar Because the tiny differences are visible in tests of unrelated PRs, they likely have some random origin. The cause could be e.g. an uninitialized variable somewhere. The tests are run such that all other conditions except chances in the PR should be same (some times already merged PRs are included). If anything in the event content would change, we should see many other differences as well.

I'm partly reporting any occurrence I see to get a picture how frequent these random differences are.

@jfernan2
Copy link
Contributor

+1
This issue looks to me is not present any longer

@cmsbuild
Copy link
Contributor

This issue is fully signed and ready to be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants