This repository has been archived by the owner on Aug 31, 2023. It is now read-only.
refactor(rome_js_parser): Streamline parser events #2327
Merged
!bench_parser
MichaReiser commented on Mar 30, 2022
@@ -57,7 +52,7 @@ pub fn process(sink: &mut impl TreeSink, mut events: Vec<Event>, errors: Vec<Par
     let mut forward_parents = Vec::new();

     for i in 0..events.len() {
-        match mem::replace(&mut events[i], Event::tombstone(TextSize::default())) {
+        match &mut events[i] {
No need to replace the events with tombstone because the iterator is only moving forward from here
MichaReiser commented on Mar 30, 2022
                     // append `A`'s forward_parent `B`
-                    fp = match mem::replace(&mut events[idx], Event::tombstone(TextSize::default()))
-                    {
+                    fp = match mem::replace(&mut events[idx], Event::tombstone()) {
Replacing is necessary here because we don't want to visit the start node of a forwarded parent again.
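The two review comments above can be illustrated with a minimal, self-contained sketch of forward-parent resolution. It is not the actual `rome_js_parser` code: the `Kind` enum, the string tokens, and the `Vec<String>` output (standing in for the `TreeSink`) are assumptions made for illustration. The outer loop only borrows `events[i]` because it never looks back, while the inner walk swaps each forwarded parent for a tombstone so the outer loop skips it later.

```rust
use std::mem;
use std::num::NonZeroU32;

// Hypothetical syntax kinds; the real parser uses its own syntax kind type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Tombstone,
    Literal,
    BinaryExpr,
}

#[derive(Debug)]
enum Event {
    // `forward_parent` is the offset from this event to the parent's `Start` event.
    Start { kind: Kind, forward_parent: Option<NonZeroU32> },
    Finish,
    Token(&'static str),
}

impl Event {
    // Placeholder left behind once a forwarded parent has been consumed.
    fn tombstone() -> Event {
        Event::Start { kind: Kind::Tombstone, forward_parent: None }
    }
}

// Flattens the events into a list of strings (a stand-in for a tree sink).
fn process(mut events: Vec<Event>) -> Vec<String> {
    let mut out = Vec::new();
    let mut forward_parents = Vec::new();

    for i in 0..events.len() {
        // Borrowing is enough here: the loop only moves forward.
        match &mut events[i] {
            Event::Start { kind: Kind::Tombstone, .. } => {}
            Event::Start { kind, forward_parent } => {
                forward_parents.push(*kind);
                let mut idx = i;
                let mut fp = *forward_parent;
                while let Some(offset) = fp {
                    idx += offset.get() as usize;
                    // Replacing is required here: the parent's `Start` lies ahead
                    // of `i` and must not be emitted a second time.
                    fp = match mem::replace(&mut events[idx], Event::tombstone()) {
                        Event::Start { kind, forward_parent } => {
                            forward_parents.push(kind);
                            forward_parent
                        }
                        _ => unreachable!("a forward parent must be a Start event"),
                    };
                }
                // Outermost parent first.
                for kind in forward_parents.drain(..).rev() {
                    out.push(format!("start {kind:?}"));
                }
            }
            Event::Finish => out.push("finish".to_string()),
            Event::Token(text) => out.push(format!("token {text}")),
        }
    }
    out
}
```

Feeding this function the events for `1 + 2`, where the `Literal` node for `1` has a forward parent three events later, emits `start BinaryExpr` once, before `start Literal`, and the tombstone left at the parent's original position is skipped.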
MichaReiser changed the title from "refactor(rome_js_parser): Refactor Parser Events" to "refactor(rome_js_parser): Streamline parser events" on Mar 30, 2022
Reduce the size of a single parser event from 16 bytes to 8 bytes by:

* Using a `NonZeroU32` for the forward parent. The forward parent can never be 0 because it stores the offset from the current event to the start of the "forwarded" parent.
* Storing the `start` of a node in the `CompletedMarker` (it can't be computed because of forward parents).
* Removing `end` from the `Finish` event and instead retrieving the last token of the node when queried (mainly to produce diagnostics).
* Storing only the end offset for each `Token` instead of the full range. The end offset is sufficient to reconstruct the length in the tree sink.

This significantly reduces memory consumption during the parse phase:

* `jquery`:
  * Current bytes: 4.12 MB -> 2.12 MB
  * Max bytes: 5.82 MB -> 3.82 MB
  * Total bytes: 8.45 MB -> 4.37 MB
* `tex-chtml-full`:
  * Current bytes: 33.11 MB -> 17.11 MB
  * Max bytes: 46 MB -> 30 MB
  * Total bytes: 67.78 MB -> 34.92 MB

It also reduces the max bytes required during the tree sink phase.

The changes do improve parse times, but not as much as I had hoped:

```
group                                 event                                   main
-----                                 -----                                   ----
parser/checker.ts                     1.00     63.6±1.84ms    40.9 MB/sec     1.00     63.8±0.45ms    40.8 MB/sec
parser/compiler.js                    1.00     36.3±0.77ms    28.9 MB/sec     1.03     37.5±0.38ms    27.9 MB/sec
parser/d3.min.js                      1.00     24.3±0.25ms    10.8 MB/sec     1.03     25.1±2.39ms    10.4 MB/sec
parser/dojo.js                        1.00      2.2±0.00ms    30.9 MB/sec     1.03      2.3±0.02ms    30.0 MB/sec
parser/ios.d.ts                       1.00     52.7±0.55ms    35.4 MB/sec     1.19     62.6±0.58ms    29.8 MB/sec
parser/jquery.min.js                  1.00      6.6±0.13ms    12.6 MB/sec     1.05      6.9±0.26ms    12.0 MB/sec
parser/math.js                        1.00     45.4±0.90ms    14.3 MB/sec     1.02     46.3±0.59ms    14.0 MB/sec
parser/parser.ts                      1.00  1525.9±16.73µs    31.7 MB/sec     1.02  1556.6±21.54µs    31.0 MB/sec
parser/pixi.min.js                    1.00     28.9±0.67ms    15.2 MB/sec     1.01     29.3±0.14ms    15.0 MB/sec
parser/react-dom.production.min.js    1.00      9.0±0.01ms    12.7 MB/sec     1.02      9.2±0.05ms    12.5 MB/sec
parser/react.production.min.js        1.00   466.9±1.03µs     13.2 MB/sec     1.03   481.5±3.49µs     12.8 MB/sec
parser/router.ts                      1.00  1186.9±8.65µs     50.4 MB/sec     1.03  1222.2±10.20µs    48.9 MB/sec
parser/tex-chtml-full.js              1.00     60.5±0.68ms    15.1 MB/sec     1.10     66.4±1.53ms    13.7 MB/sec
parser/three.min.js                   1.00     32.1±0.24ms    18.3 MB/sec     1.03     33.0±0.43ms    17.8 MB/sec
parser/typescript.js                  1.00    279.9±4.87ms    33.9 MB/sec     1.04    292.2±2.93ms    32.5 MB/sec
parser/vue.global.prod.js             1.00     11.4±0.34ms    10.6 MB/sec     1.01     11.5±0.03ms    10.5 MB/sec
```

## Tests

`cargo test`
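The size reduction in the description can be sketched with a simplified event type. This is an illustration under assumptions, not the crate's actual definition: `SyntaxKind` is a hypothetical 16-bit stand-in for the parser's syntax kind, `u32` stands in for `TextSize`, and `token_lengths` assumes tokens are contiguous with no trivia between them. Rust guarantees that `Option<NonZeroU32>` is the same size as `u32`; the overall size of the enum is what current `rustc` is expected to produce, not a stable layout guarantee.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// Hypothetical 16-bit syntax kind.
#[derive(Clone, Copy, Debug, PartialEq)]
struct SyntaxKind(u16);

#[derive(Clone, Copy, Debug)]
enum Event {
    // `None` reuses the forbidden value 0, so the option needs no extra tag.
    Start { kind: SyntaxKind, forward_parent: Option<NonZeroU32> },
    // No `end` offset: it can be recovered from the last token of the node.
    Finish,
    // Only the end offset is stored, not the full range.
    Token { kind: SyntaxKind, end: u32 },
}

// Guaranteed by the standard library: the niche makes the option 4 bytes.
const _: () = assert!(size_of::<Option<NonZeroU32>>() == 4);

// Size of one event in bytes (expected to be 8 with the layout chosen by current rustc).
fn event_size() -> usize {
    size_of::<Event>()
}

// Rebuilds token lengths from consecutive end offsets, as a tree sink could do.
// Assumes the first token starts at offset 0 and tokens are contiguous.
fn token_lengths(events: &[Event]) -> Vec<u32> {
    let mut previous_end = 0;
    let mut lengths = Vec::new();
    for event in events {
        if let Event::Token { end, .. } = event {
            lengths.push(*end - previous_end);
            previous_end = *end;
        }
    }
    lengths
}
```

For tokens ending at offsets 3, 4 and 6, `token_lengths` yields lengths 3, 1 and 2, which is why the start offset does not need to be stored per token.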
MichaReiser force-pushed the refactor/parser-events branch from caabf9d to 350dfe5 on March 30, 2022 13:46
xunilrj approved these changes on Mar 31, 2022
MichaReiser added a commit that referenced this pull request on Apr 6, 2022
#2327 changed the parser to not store the last parsed token but instead retrieve it by getting the kind from the last `Token` event in the parser events. This didn't play well with skipping tokens because the parser didn't add any tokens to the `events` collection while in `skipping` mode.

This PR changes the `skipping` mode so that it only changes whether the parser calls `bump` or `skip_as_token_trivia`; the parser always adds the token to the `events` collection. This adds a few unnecessary writes, but the cost should be negligible because:

* Skipping tokens is rare and, even then, often limited to very few tokens.
* Moving the `bump` out of the condition may even help with branch prediction.

Fixes #2358
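A rough sketch of the behaviour this commit message describes, under stated assumptions: `Parser`, `TokenSource`, `Kind`, and `last_token_kind` are hypothetical simplifications, and only the names `bump`, `skip_as_token_trivia`, `skipping`, and `events` come from the message. The point is that the `skipping` flag only selects which token-source method is called, while the `Token` event is recorded unconditionally so the last token can still be looked up from the events.

```rust
// Hypothetical token kinds for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Ident,
    Unexpected,
}

#[derive(Debug)]
enum Event {
    Start,
    Finish,
    Token { kind: Kind, end: u32 },
}

// Hypothetical token source that only counts how each token was consumed.
#[derive(Default)]
struct TokenSource {
    bumped: usize,
    skipped_as_trivia: usize,
}

impl TokenSource {
    fn bump(&mut self) {
        self.bumped += 1;
    }

    fn skip_as_token_trivia(&mut self) {
        self.skipped_as_trivia += 1;
    }
}

#[derive(Default)]
struct Parser {
    source: TokenSource,
    events: Vec<Event>,
    skipping: bool,
}

impl Parser {
    fn bump(&mut self, kind: Kind, end: u32) {
        // `skipping` only decides how the token source consumes the token...
        if self.skipping {
            self.source.skip_as_token_trivia();
        } else {
            self.source.bump();
        }
        // ...while the event is always recorded.
        self.events.push(Event::Token { kind, end });
    }

    // The last token is derived from the events instead of a separate field.
    fn last_token_kind(&self) -> Option<Kind> {
        self.events.iter().rev().find_map(|event| match event {
            Event::Token { kind, .. } => Some(*kind),
            _ => None,
        })
    }
}
```

With this shape, a token consumed while `skipping` is set still shows up as the last `Token` event, which is the lookup that broke when skipped tokens were not recorded.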