Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Overhaul lcov reports #1847

Closed
wants to merge 3 commits into from
Closed

Conversation

zackw
Copy link
Contributor

@zackw zackw commented Sep 6, 2024

Fixes #1846 (per my understanding of the lcov file format) and addresses several other issues. See individual commit logs for details.

Tested only with Python 3.11 and 3.12; I do not have convenient access to anything older.

I plan to come back to this branch next week and add a bunch more tests after comparing the reports generated on this branch to the troublesome reports I was getting that prompted #1846. Please do not merge yet. I'm filing this PR now so you know I'm working on it.

Zack Weinberg added 3 commits September 6, 2024 15:19
Implement skip_empty mode in LcovReporter.

Also, improve consistency with what you get by running ‘coverage xml’
and then ‘xml2lcov’ by:

- not emitting any vacuous TN: lines
- not emitting LF:0 LH:0 for empty files
- not emitting BRF:0 BRH:0 for files with no branches

En passant, better separation of concerns between report() and
get_lcov(), and change get_lcov’s name to lcov_file so that it
doesn’t sound like it returns something.
…efault

DA-line checksums bulk up the .lcov file and provide only a weak assurance
that the source file being processed for analysis matches the source file
that was used to generate coverage data.  Current versions of the LCOV suite
discourage their use, recommending VER: lines (carrying whole-file checksums)
and `--version-script` instead.

Add a configuration option, lcov.checksums, which can be set to “off”, “line”,
or “file”.  The default is “off”, which causes no checksums to be written at
all.  “line” gives the old behavior, and “file” emits one VER: line per file
with a sha256 hash of the entire file.  The documentation explains how to
tell the LCOV suite to process the VER: lines emitted by “file” mode.
…rts.

This fixes five serious bugs:

- The first field of a BRDA: line may not be zero (nedbat#1846)
- The first field of a BRDA: line is supposed to be the *source* line of a
  branch, not the destination line.
- The fourth field of a BRDA: line is supposed to be “-” when the branch
  was *never reached*, not when it was reached but never taken (which is
  what a branch’s presence in missing_arcs means).  As far as I can tell,
  coverage.py currently doesn’t know of the existence of branches that were
  never reached.

- The decision of whether to emit a DA: line at all is now taken according to
  what’s in analysis.statements, not what’s in analysis.executed and
  analysis.missing; this is important because some lines (e.g. the beginnings
  of docstrings) may appear in analysis.executed but *not* in
  analysis.statements.

- We no longer attempt to call branch-coverage-related Analysis methods
  when analysis.has_arcs is false.

And two minor annoyances:

- DA: and BRDA: lines are now emitted strictly in ascending order by (source)
  line number.
- Source file records are now sorted by *relative* pathname, not absolute
  pathname from the coverage database.
@zackw
Copy link
Contributor Author

zackw commented Sep 6, 2024

I will, of course, also fix all the test failures :-/

@nedbat
Copy link
Owner

nedbat commented Sep 6, 2024

Thanks for starting this. As some first comments after a quick spin through it:

  • Please try to follow the formatting already in the rest of the code: multi-line bracketed expressions should have a newline after the opening bracket, and don't indent to the opening bracket. Indent once.
  • There's a lot of discussion in the configuration page that should be moved the lcov command description.
  • Do we really need the different styles of checksumming? I haven't seen that in lcov before.

@zackw
Copy link
Contributor Author

zackw commented Sep 9, 2024

  • Do we really need the different styles of checksumming? I haven't seen that in lcov before.

I have the impression that the lcov team considers the per-line checksums that coverage lcov currently generates, to be deprecated in favor of the newer --version-script mechanism, which is how the lcov tools generate VER: lines. This is mostly going by what it says about per-line checksums in the geninfo manpage, particularly "checksum generation is disabled by default" and "this option is somewhat subsumed by the --version-script option".

There is a way to use the lcov tools to add VER: lines to a .lcov file that doesn't have them, but then you have to have both coverage.py and the lcov tools installed in the environment that's generating coverage data, which may be undesirable.

Instead of what I did, I'd be fine with either

  • only adding an option to turn off generation of per-line checksums
  • replacing the "file" mode with a mode in which coverage lcov runs an lcov-spec version script to generate VER: lines

Please let me know what you would prefer.

@nedbat
Copy link
Owner

nedbat commented Sep 9, 2024

Can we use the new mechanism with no choice? I'd rather not add options where people don't strongly need an option.

@zackw
Copy link
Contributor Author

zackw commented Sep 9, 2024

Can we use the new mechanism with no choice?

We would have to at least offer choice of version script. I read over the lcov docs some more and I'm now convinced that what I implemented - i.e. the hardcoded base64(sha256(file contents)) - was a mistake, the intention of the VER: feature is to allow people to pull in all sorts of contextual information (e.g. git describe) that they feel is relevant to whether the lcov file matches the source code.

For purpose of this pull request, I suggest that either turning off the per-line checksums unconditionally, or giving an option to turn them off, is the right thing, and generation of VER: lines can be a future thing if anyone wants them. I'd prefer the option because that gives a smoother backcompat story. I don't want to just leave them alone because (a) they appear to be semi-deprecated in lcov, and (b) they make it harder to compare the result of coverage lcov with the result of coverage xml followed by xml2lcov.

@zackw zackw marked this pull request as draft September 9, 2024 15:32
@zackw zackw mentioned this pull request Sep 9, 2024
@zackw
Copy link
Contributor Author

zackw commented Sep 9, 2024

Superseded by #1849.

@zackw zackw closed this Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lcov reports should not use line number zero in BRDA: records
2 participants