This repository has been archived by the owner on Aug 27, 2022. It is now read-only.

Add saxes parser to benchmark #12

Open
tsibahatau opened this issue Dec 23, 2020 · 2 comments

Comments

@tsibahatau

tsibahatau commented Dec 23, 2020

Could you please consider adding https://github.com/lddubeau/saxes to the benchmarks?

It shows good overall performance. It doesn't support piping streams directly or passing a stream as an argument, but it can work in streaming mode in the general sense.
It won't work with the current xml benchmark approach (parser.parse('<r>'): this line is not valid XML), so I had to remove that line. That change broke node-expat, so I removed it too :) Not sure why this line is a dealbreaker.

My results:
sax x 11,466 ops/sec ±1.11% (88 runs sampled)
@tuananh/sax-parser x 24,570 ops/sec ±0.58% (89 runs sampled)
node-xml x 3,669 ops/sec ±0.86% (87 runs sampled)
ltx x 64,341 ops/sec ±0.29% (92 runs sampled)
saxes x 57,826 ops/sec ±0.33% (89 runs sampled)

@tuananh
Owner

tuananh commented Dec 23, 2020

@tsibahatau i think i should archive this. it was just a fun side project.

great fork btw. how come saxes is so much faster than the original?

@tsibahatau
Author

@tuananh Honestly I don't know.
The saxes README says nothing specific about this:
Saxes is much much faster than sax, mostly because of a substantial redesign of the internal parsing logic. The speed improvement is not merely due to removing features that were supported by sax. That helped a bit, but saxes adds some expensive checks in its aim for conformance with the XML specification. Redesigning the parsing logic is what accounts for most of the performance improvement.

I took a quick look at the write() method implementation in both projects: sax seems to use switch(parser.state) with a lot of cases, while saxes uses lookup tables. That can be significantly faster, especially with a huge number of case clauses: https://stackoverflow.com/a/18830724/774620. You can test it for yourself with any number of case clauses at https://jsbench.me/k4k7iqxeui.
It probably wasn't the main improvement, but at least it's what can be seen at first glance.
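
To make the switch vs. lookup table difference concrete, here is a minimal sketch of the two dispatch styles. It is a toy tokenizer written only for illustration: the states (TEXT, TAG), the handlers table, and the function names are all hypothetical and are not taken from the sax or saxes sources. It only counts `<...>` runs and makes no claim about which style is faster.

```javascript
// Hypothetical states for a toy tokenizer; NOT the state sets used by sax or saxes.
const TEXT = 0;
const TAG = 1;

// Style 1: pick the behaviour for the current state with a switch statement.
function countTagsSwitch(input) {
  let state = TEXT;
  let tags = 0;
  for (const ch of input) {
    switch (state) {
      case TEXT:
        if (ch === "<") state = TAG;
        break;
      case TAG:
        if (ch === ">") {
          tags += 1;
          state = TEXT;
        }
        break;
    }
  }
  return tags;
}

// Style 2: pick the behaviour through a table of handlers indexed by state.
const handlers = [];
handlers[TEXT] = (ctx, ch) => {
  if (ch === "<") ctx.state = TAG;
};
handlers[TAG] = (ctx, ch) => {
  if (ch === ">") {
    ctx.tags += 1;
    ctx.state = TEXT;
  }
};

function countTagsTable(input) {
  const ctx = { state: TEXT, tags: 0 };
  for (const ch of input) {
    handlers[ctx.state](ctx, ch);
  }
  return ctx.tags;
}

// Both styles should agree on the same input (opening, closing and
// self-closing tags are each counted once).
const sample = "<r><a>text</a><b/></r>";
console.log(countTagsSwitch(sample), countTagsTable(sample));
```

With two states the difference is negligible; the point of the linked benchmark is that the gap may grow as the number of case clauses grows.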

My overall impression: pure JavaScript parsers perform better in streaming mode than parsers with a native backend, because they don't have to switch from native code to JavaScript on every XML tag. Native parsers probably shine when you have to transform an existing XML file into a JSON/JS representation. While this transformation is certainly a real use case for me, the requirement to have the XML file fully downloaded cancels out the performance gain.
