Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to instrument using 64-bit counts #51625

Merged
merged 8 commits into from
May 3, 2021

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Apr 21, 2021

Add COMPlus_JitCollect64BitCounts which makes the JIT instrument using
64-bit counts instead of 32-bit counts.

I also changed the printing of relocs to include their values when
diffable disassembly is off.

Add COMPlus_JitCollect64BitCounts which makes the JIT instrument using
64-bit counts instead of 32-bit counts. No support for consuming these
counts is added, only support for producing them.

I also changed the printing of relocs to include their values when
diffable disassembly is off.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 21, 2021
Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confess I have a slight preference for the old names. The new type could be BasicBlockLongCount for instance. But I can get used to these names.

I wonder if we should finish wiring this up in case we're seeing count saturation in our training scenarios (I have not seen evidence of this, but haven't really looked). Aside from the missing count consumption parts in the jit there is a counter for type histograms that will overflow if block counts are overflowing.

cc @davidwrighton

@jakobbotsch
Copy link
Member Author

No worries, I'll change the names back and use Long for the new one instead.

@jakobbotsch
Copy link
Member Author

I wonder if we should finish wiring this up in case we're seeing count saturation in our training scenarios (I have not seen evidence of this, but haven't really looked). Aside from the missing count consumption parts in the jit there is a counter for type histograms that will overflow if block counts are overflowing.

I can certainly spend some more time to get the consumption part working as well. However, given that we only see overflows with COMPlus_TC_CallCounting=0, it seems unlikely that overflows happen in practice. I suppose it could happen in the real world if a function is stuck in tier 0, however. In that case I'm not sure if you would want to enable 64-bit counts over OSR, though.

@davidwrighton
Copy link
Member

@jakobbotsch our training scenarios don't quite set COMPLUS_TC_CallCounting to 0, but they do something very similar to collect instrumentation data. So it very well might be interesting to turn that on for training data collection

@davidwrighton
Copy link
Member

All of this reminds me that I desperately need to build a set of tests which actually exercise the entire static pgo pipeline within our testbed, as I keep missing steps in validation.

@jakobbotsch
Copy link
Member Author

Makes sense, I'll work on threading the 64-bit counts through more thoroughly then.

break;
default:
continue;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done this so far, but shouldn't we ideally also be taking other PGO data into account here? @AndyAyersMS

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps? I left these out on purpose, though we might reconsider.

The class profile data ends up interspersed with the count data, so the hash below will typically incorporate those values; the schema records themselves don't contain that much interesting content.

@jakobbotsch
Copy link
Member Author

@davidwrighton can you advise on what needs to be done with crossgen/crossgen2? Do both need to be updated? The JIT side of this change should be done now.

@jakobbotsch
Copy link
Member Author

@AndyAyersMS this should be good for a review, I guess the runtime parts can come in a separate PR.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Dynamic PGO should be able to consume these counts, right? In that case the runtime doesn't do much with the data....

break;
default:
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps? I left these out on purpose, though we might reconsider.

The class profile data ends up interspersed with the count data, so the hash below will typically incorporate those values; the schema records themselves don't contain that much interesting content.

@jakobbotsch
Copy link
Member Author

Dynamic PGO should be able to consume these counts, right? In that case the runtime doesn't do much with the data....

Right, I meant crossgen/crossgen2. I checked and indeed with COMPlus_TieredPGO=1 and COMPlus_JitCollect64BitCounts=1 the counts are roundtripped correctly.

@jakobbotsch
Copy link
Member Author

@jakobbotsch jakobbotsch merged commit dfbe3ae into dotnet:main May 3, 2021
@jakobbotsch jakobbotsch deleted the instrument-64-bit branch May 3, 2021 12:39
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants