Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deduplicate references lists in tsbuildinfo #43079

Closed

Conversation

sokra
Copy link
Contributor

@sokra sokra commented Mar 4, 2021

When referencedMap or exportedModulesMap contain duplicates, they are replaced with an index of the deduplicated value in referencedMapLists resp. exportedModulesMapLists.

This is actually quite common as .d.ts and .ts file often share references.

This saves a bit storage, makes restoring faster and due to using references to a shared list (since immutable) this also saves a bit of memory.

Finding duplicates during storing can be expensive, so we make use of the equal identity when this was restored from tsbuildinfo. Otherwise we create a Map by list length to avoid checking too many lists.

  • There is an associated issue in the Backlog milestone (required)
  • Code is up-to-date with the master branch
  • You've successfully run gulp runtests locally
  • There are new or updated unit tests validating the change

@typescript-bot typescript-bot added the For Uncommitted Bug PR for untriaged, rejected, closed or missing bug label Mar 4, 2021
@typescript-bot
Copy link
Collaborator

This PR doesn't have any linked issues. Please open an issue that references this PR. From there we can discuss and prioritise.

referencedMap?: MapLike<string[] | number>;
referencedMapLists?: string[][];
exportedModulesMap?: MapLike<string[] | number>;
exportedModulesMapLists?: string[][];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can simplify and gain even more if we have single list for referencedMapLists as well as exportedModulesMapLists. Also instead of sometimes deduplicating we can make referencedMap and exportedModulesMap as MapLike<number> and the list of reference set is separate which is deduplicated across the board.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that, and also deduplicated the files itself.

Sadly that makes all baselines unreadable, maybe we should decompress it for the baseline output to make it readable again...

@sheetalkamat sheetalkamat self-assigned this Mar 4, 2021
@sokra sokra force-pushed the performance/tsbuildinfo-deduplication branch from d29b92e to e4c7402 Compare March 5, 2021 16:07
@sokra
Copy link
Contributor Author

sokra commented Mar 5, 2021

Some statistics:

For my project this reduces the tsbuildinfo size from 14.9 MB to 2.2MB.

@sokra
Copy link
Contributor Author

sokra commented Mar 5, 2021

btw. is it necessary to emit a formatted json (JSON.stringify(data, null, 2)) instead of just pure json (JSON.stringify(data))? That would save nearly the half of the size...

sheetalkamat added a commit that referenced this pull request Mar 9, 2021
Different implementation of #43079 based on idea suggested by @sokra
@sheetalkamat
Copy link
Member

sheetalkamat commented Mar 9, 2021

@sokra Created PR #43155 based off of this one with simplification and additional optimization of fileinfos, semantic diagnostic and your suggestion of getting rid of spaces in tsbuildinfo.
Bot should post a build on that PR as per #43155 (comment). Can you check if this works for your scenario.
Thanks

sheetalkamat added a commit that referenced this pull request Mar 10, 2021
* Baseline readable buildinfo

* Use file names as index in file name list
This is extension of the idea given by @sokra to optimize size of tsbuildinfo

* Deduplicate reference map lists and use file name index to sort them
Different implementation of #43079 based on idea suggested by @sokra

* Minimal json.stringify for the tsbuildinfo
Again implementaion of suggestion by @sokra

* Update src/testRunner/unittests/tsbuild/helpers.ts

Co-authored-by: Tobias Koppers <[email protected]>

* Readable version of buildinfo all the time

* Some renames for readability as per feedback

Co-authored-by: Tobias Koppers <[email protected]>
@sheetalkamat
Copy link
Member

Closed with #43155

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
For Uncommitted Bug PR for untriaged, rejected, closed or missing bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants