Use new BE API from intx #131

Merged
merged 3 commits into from
Aug 27, 2019

Conversation

chfast
Member

@chfast chfast commented Aug 17, 2019

This is to see the new intx API in use. The intx PR: chfast/intx#107.

It also uses intx::addmod() and intx::mulmod(). Closes #110.

I didn't expect it to perform any better, but it does.

Skylake CPU:

Comparing bin/evmone-bench-master to bin/evmone-bench
Benchmark                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------
blake2b_huff/analysis             -0.0006         -0.0003            75            75            75            75
blake2b_huff/empty                -0.0003         -0.0001           111           111           111           111
blake2b_huff/abc                  +0.0013         +0.0015           111           111           111           111
blake2b_huff/2805nulls            -0.0008         -0.0006           800           800           800           799
blake2b_huff/2805aa               +0.0024         +0.0027           800           802           799           802
blake2b_huff/5610nulls            -0.0002         -0.0000          1488          1487          1487          1487
blake2b_huff/8415nulls            +0.0062         +0.0064          2134          2148          2133          2147
blake2b_huff/65536nulls           +0.0097         +0.0100         16031         16187         16024         16183
sha1_divs/analysis                +0.0026         +0.0028             8             8             8             8
sha1_divs/empty                   -0.2350         -0.2349           176           135           176           135
sha1_divs/1351                    -0.2606         -0.2604          3465          2562          3463          2561
sha1_divs/2737                    -0.2610         -0.2609          6739          4980          6736          4979
sha1_divs/5311                    -0.2624         -0.2623         13136          9689         13131          9687
sha1_divs/65536                   -0.2643         -0.2642        159813        117571        159747        117543
sha1_shifts/analysis              +0.0024         +0.0026             8             8             8             8
sha1_shifts/empty                 -0.0297         -0.0296            84            81            84            81
sha1_shifts/1351                  -0.0392         -0.0390          1554          1493          1553          1492
sha1_shifts/2737                  -0.0372         -0.0371          3014          2902          3013          2901
sha1_shifts/5311                  -0.0412         -0.0410          5870          5628          5868          5627
sha1_shifts/65536                 -0.0408         -0.0407         71357         68443         71332         68429
blake2b_shifts/analysis           -0.0005         -0.0004            39            39            39            39
blake2b_shifts/empty              +0.0000         +0.0000             0             0             0             0
blake2b_shifts/2805nulls          -0.0776         -0.0774          8530          7869          8527          7867
blake2b_shifts/5610nulls          -0.0766         -0.0765         17431         16096         17425         16093
blake2b_shifts/8415nulls          -0.0763         -0.0763         26139         24144         26130         24135
blake2b_shifts/65536nulls         -0.0726         -0.0724        205607        190683        205528        190644
stop/analysis                     -0.0050         -0.0049             0             0             0             0
stop                              +0.0200         +0.0201             2             2             2             2
weierstrudel/analysis             +0.0001         +0.0003            91            92            91            91
weierstrudel/0                    -0.0015         -0.0013           625           624           625           624
weierstrudel/1                    -0.0319         -0.0318          1215          1177          1215          1176
weierstrudel/2                    -0.0297         -0.0296          1530          1484          1529          1484
weierstrudel/3                    -0.0279         -0.0278          1841          1789          1840          1789
weierstrudel/8                    -0.0275         -0.0274          3377          3285          3376          3284
weierstrudel/9                    -0.0280         -0.0279          3689          3586          3688          3586
weierstrudel/14                   -0.0274         -0.0273          5232          5089          5231          5088

Haswell CPU:

Comparing bin/evmone-bench-master to bin/evmone-bench
Benchmark                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------
sha1_shifts/analysis              -0.0004         -0.0003             4             4             4             4
sha1_shifts/empty                 -0.0693         -0.0692            38            36            38            36
sha1_shifts/1351                  -0.0814         -0.0814           703           646           703           646
sha1_shifts/2737                  -0.0764         -0.0764          1364          1260          1364          1260
sha1_shifts/5311                  -0.0787         -0.0787          2663          2454          2663          2453
sha1_shifts/65536                 -0.0825         -0.0825         32287         29624         32286         29623
stop/analysis                     -0.0163         -0.0163             0             0             0             0
stop                              +0.0031         +0.0031             1             1             1             1
blake2b_huff/analysis             -0.0022         -0.0022            36            36            36            36
blake2b_huff/empty                -0.0038         -0.0038            52            52            52            52
blake2b_huff/abc                  +0.0027         +0.0027            52            53            52            53
blake2b_huff/2805nulls            -0.0037         -0.0037           365           364           365           364
blake2b_huff/2805aa               +0.0119         +0.0119           362           367           362           367
blake2b_huff/5610nulls            +0.0058         +0.0058           676           680           676           680
blake2b_huff/8415nulls            +0.0033         +0.0033           976           979           976           979
blake2b_huff/65536nulls           -0.0108         -0.0108          7372          7292          7371          7292
sha1_divs/analysis                -0.0094         -0.0094             4             4             4             4
sha1_divs/empty                   -0.2483         -0.2483            83            62            83            62
sha1_divs/1351                    -0.2809         -0.2809          1650          1187          1650          1187
sha1_divs/2737                    -0.2809         -0.2809          3199          2300          3199          2300
sha1_divs/5311                    -0.2814         -0.2814          6257          4496          6257          4496
sha1_divs/65536                   -0.2850         -0.2850         76471         54678         76469         54677
weierstrudel/analysis             -0.0276         -0.0276            44            43            44            43
weierstrudel/0                    -0.0024         -0.0024           290           290           290           290
weierstrudel/1                    -0.0539         -0.0539           567           536           567           536
weierstrudel/2                    -0.0475         -0.0475           710           676           710           676
weierstrudel/3                    -0.0622         -0.0622           863           809           863           809
weierstrudel/8                    -0.0596         -0.0596          1575          1481          1575          1481
weierstrudel/9                    -0.0560         -0.0559          1719          1623          1719          1623
weierstrudel/14                   -0.0613         -0.0613          2435          2286          2435          2286
blake2b_shifts/analysis           +0.0031         +0.0031            19            19            19            19
blake2b_shifts/empty              +0.0000         +0.0000             0             0             0             0
blake2b_shifts/2805nulls          -0.0537         -0.0536          3963          3750          3963          3750
blake2b_shifts/5610nulls          -0.0695         -0.0695          8150          7584          8150          7583
blake2b_shifts/8415nulls          -0.0748         -0.0748         12359         11435         12358         11434
blake2b_shifts/65536nulls         -0.0805         -0.0805         97883         89999         97878         89996
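
For readers of the tables above: the Time and CPU columns appear to be relative changes, (new - old) / old, so -0.2850 reads as roughly 28.5% less time. The sketch below shows that arithmetic; `relative_change` is a hypothetical helper written for this note, not part of evmone or of the benchmark tooling, and the zero handling is an assumption made only to keep the sketch well defined.

```cpp
#include <cmath>

// Relative change of a timing, expressed as a fraction of the old value.
// Negative results mean the new build is faster.
// Hypothetical helper: the name and the zero handling are assumptions.
inline double relative_change(double old_time, double new_time)
{
    if (old_time == 0.0)
        return 0.0;  // Rows such as blake2b_shifts/empty report 0 for both builds.
    return (new_time - old_time) / std::fabs(old_time);
}
```

Applied to the Haswell sha1_divs/65536 row, `relative_change(76471, 54678)` gives about -0.285. The Old and New columns are rounded, so values recomputed from them can differ from the printed ratio in the last digit.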

@codecov-io

codecov-io commented Aug 17, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@197e560).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master     #131   +/-   ##
=========================================
  Coverage          ?   87.87%           
=========================================
  Files             ?       19           
  Lines             ?     2054           
  Branches          ?      218           
=========================================
  Hits              ?     1805           
  Misses            ?      224           
  Partials          ?       25

@chfast
Member Author

chfast commented Aug 21, 2019

Added benchmark results

@chfast
Member Author

chfast commented Aug 22, 2019

Does anyone want to review this?

@@ -103,7 +106,7 @@ void op_div(execution_state& state, instr_argument) noexcept
 void op_sdiv(execution_state& state, instr_argument) noexcept
 {
     auto& v = state.stack[1];
-    v = v != 0 ? intx::sdivrem(state.stack[0], v).quot : 0;
+    v = v != 0 ? sdivrem(state.stack[0], v).quot : 0;
Member

I actually like the explicit namespace because it improves readability in many places.

Member

Me too, especially for things like be::store(..., ...), where the prefix makes it clearer that some conversion is going on.

Member Author

I brought back the be::... but intx::be::load<intx::uint256> seems a bit too much...

Member Author

The intx:: qualifier on sdivrem() is never needed, because of ADL (argument-dependent lookup). I can bring it back in similar cases, but I cannot promise it will stay consistent over time.
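
For context on the ADL point: an unqualified call is also looked up in the namespaces of its argument types, which is why the call resolves without the namespace prefix. The sketch below is a minimal illustration only; the `bigint` namespace, `u64`, `div_result`, `divrem`, and `quotient` are hypothetical stand-ins, not the intx API.

```cpp
#include <cstdint>

namespace bigint  // Hypothetical stand-in for a library namespace such as intx.
{
struct u64
{
    std::uint64_t value;
};

struct div_result
{
    u64 quot;
    u64 rem;
};

// A free function declared in the same namespace as the type it operates on.
inline div_result divrem(u64 a, u64 b)
{
    return {{a.value / b.value}, {a.value % b.value}};
}
}  // namespace bigint

// No using-declaration and no bigint:: prefix on the call: the arguments have
// type bigint::u64, so the compiler also searches namespace bigint for divrem.
inline std::uint64_t quotient(std::uint64_t a, std::uint64_t b)
{
    return divrem(bigint::u64{a}, bigint::u64{b}).quot.value;
}
```

With this, `quotient(17, 5)` compiles and returns 3. Writing `bigint::divrem(...)` would behave the same, so the qualifier is a readability choice rather than a requirement.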

@@ -139,7 +139,7 @@ void op_mulmod(execution_state& state, instr_argument) noexcept
     const auto y = state.stack.pop();
     auto& m = state.stack.top();

-    m = m != 0 ? ((uint512{x} * uint512{y}) % uint512{m}).lo : 0;
+    m = m != 0 ? mulmod(x, y, m) : 0;
Member

Is there any speed change with these?

Member Author

Looks like there is.
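
To illustrate what the replaced line computes, here is a scaled-down sketch of the same idea: widen the operands to twice the width, multiply or add, then reduce. It uses 64-bit operands and the `unsigned __int128` GCC/Clang extension instead of uint256 and uint512. The names `mulmod64` and `addmod64` are hypothetical, and this is not how intx implements `mulmod()`. The zero-modulus check sits inside the helpers here, whereas the diff above keeps it in the caller.

```cpp
#include <cstdint>

// (x * y) % m computed on a double-width intermediate, so the high half of
// the product is not lost. Returns 0 for m == 0, matching the caller above.
// unsigned __int128 is a compiler extension (GCC, Clang), not standard C++.
inline std::uint64_t mulmod64(std::uint64_t x, std::uint64_t y, std::uint64_t m)
{
    if (m == 0)
        return 0;
    const auto wide = static_cast<unsigned __int128>(x) * y;
    return static_cast<std::uint64_t>(wide % m);
}

// (x + y) % m with the same widening, so the carry out of 64 bits is kept.
inline std::uint64_t addmod64(std::uint64_t x, std::uint64_t y, std::uint64_t m)
{
    if (m == 0)
        return 0;
    const auto wide = static_cast<unsigned __int128>(x) + y;
    return static_cast<std::uint64_t>(wide % m);
}
```

For example, `mulmod64(1ULL << 32, 1ULL << 32, UINT64_MAX)` returns 1, while a plain 64-bit multiplication would wrap to 0 before the reduction.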

@@ -629,17 +599,15 @@ void op_blockhash(execution_state& state, instr_argument) noexcept
     auto upper_bound = state.host.get_tx_context().block_number;
     auto lower_bound = std::max(upper_bound - 256, decltype(upper_bound){0});
     auto n = static_cast<int64_t>(number);
-    auto header = evmc_bytes32{};
+    auto header = evmc::bytes32{};
Member

Seems to be a somewhat irrelevant change?

Member Author

Removed.

@@ -1066,16 +1018,12 @@ void op_create(execution_state& state, instr_argument arg) noexcept

     msg.sender = state.msg->destination;
     msg.depth = state.msg->depth + 1;
-    intx::be::store(msg.value.bytes, endowment);
+    msg.value = store<evmc::uint256be>(endowment);
Member

In other places you do it like store(msg.value.bytes, value);
I like this assignment form better, maybe change other places for consistency?

Member Author

Fixed.
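
The two call styles discussed in this thread can be sketched as follows, scaled down to 64 bits. Everything here is a hypothetical stand-in: `bytes8` plays the role of a big-endian byte container such as evmc::uint256be, and these `store` and `load` functions only mimic the shape of the calls in the diff, not the actual intx signatures.

```cpp
#include <cstdint>

// Hypothetical 8-byte big-endian container.
struct bytes8
{
    std::uint8_t bytes[8];
};

// Out-parameter form: write x into dst, most significant byte first.
inline void store(std::uint8_t (&dst)[8], std::uint64_t x)
{
    for (int i = 0; i < 8; ++i)
        dst[i] = static_cast<std::uint8_t>(x >> (8 * (7 - i)));
}

// Returning form: build the container and return it, which allows plain
// assignment at the call site, e.g. msg_value = store<bytes8>(x).
template <typename T>
inline T store(std::uint64_t x)
{
    T result{};
    store(result.bytes, x);
    return result;
}

// Inverse conversion: read a big-endian byte array back into an integer.
inline std::uint64_t load(const std::uint8_t (&src)[8])
{
    std::uint64_t x = 0;
    for (int i = 0; i < 8; ++i)
        x = (x << 8) | src[i];
    return x;
}
```

The returning form names the destination type at the call site, so the conversion is visible without repeating the `.bytes` member in every statement.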

@chfast
Member Author

chfast commented Aug 26, 2019

OK, I pushed a version that is much more verbose, but still fine in my opinion. We keep using the short uint256 name (it was there before, but not used everywhere). Let me know what you think.

@chfast chfast requested review from axic and gumb0 August 27, 2019 08:39
@chfast chfast merged commit b6da2b3 into master Aug 27, 2019
@chfast chfast deleted the intx branch August 27, 2019 09:44

Successfully merging this pull request may close these issues.

Update to intx 0.3 and use addmod/mulmod?
4 participants