Skip other sections when reading metadata #826

Merged: 1 commit, Sep 17, 2024

Conversation

jtibshirani

Looking at heap profiles, the ReadMetadata function creates a ton of garbage objects. The main contributor is parsing the other sections in the TOC, specifically decoding compoundSection.offsets. However, to read metadata, we only really need to parse the metadata sections.

This PR introduces a skip method that skips over a section without reading it. This greatly reduces the bytes allocated by ReadMetadata (the allocation count itself barely changes):

BenchmarkReadMetadata
BenchmarkReadMetadata-10    	   20908	     57245 ns/op	  184963 B/op	     118 allocs/op (before)
BenchmarkReadMetadata-10    	   67215	     17937 ns/op	    9688 B/op	     111 allocs/op (after)

@cla-bot added the cla-signed label on Sep 14, 2024
@@ -126,11 +122,14 @@ func (r *reader) readTOC(toc *indexTOC) error {
return err
}

skipSection := len(tags) > 0 && !slices.Contains(tags, tag)

Instead of introducing the "skip" concept, I could have taken advantage of the fact that the metadata sections are always first in the TOC. However, our index reading code is structured around flexible "section tags", and I got the feeling that section ordering wasn't an invariant we wanted to rely on.

@@ -169,6 +174,27 @@ func (r *reader) readTOC(toc *indexTOC) error {
return nil
}

func (r *reader) readHeader() (simpleSection, uint32, error) {

I factored out the first part of readTOC (now readTOCSections) into readHeader. This wasn't critical for the change.

@@ -395,9 +421,9 @@ func (r *reader) readIndexData(toc *indexTOC) (*indexData, error) {
return &d, nil
}

-func (r *reader) readMetadata(toc *indexTOC) ([]*Repository, *IndexMetadata, error) {
+func (r *reader) parseMetadata(metaData simpleSection, repoMetaData simpleSection) ([]*Repository, *IndexMetadata, error) {

I also simplified this method to take the two simpleSection values directly, since copying a simpleSection is not a big deal. Not critical for the change.

@keegancsmith keegancsmith left a comment

nice find!!!

@jtibshirani requested a review from a team on September 16, 2024 20:47
Base automatically changed from jtibs/index-toc to main on September 17, 2024 01:34
@jtibshirani merged commit be438ef into main on Sep 17, 2024
9 checks passed
@jtibshirani deleted the jtibs/metadata branch on September 17, 2024 02:00
jtibshirani added a commit that referenced this pull request Sep 19, 2024
Tiny follow-up to #826. I resolved a conflict incorrectly and reverted a log line improvement.