Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get piece unchecked in write #83302

Merged
merged 6 commits into from
Aug 24, 2021
Merged

Conversation

camsteffen
Copy link
Contributor

@camsteffen camsteffen commented Mar 19, 2021

We already use specialized zip, but it seems like we can do a little better by not checking pieces length at all.

Arguments constructors are now unsafe. So the format_args! expansion now includes an unsafe block.

Local Bench Diff
 name                        before ns/iter  after ns/iter  diff ns/iter   diff %  speedup 
 fmt::write_str_macro1       22,967          19,718               -3,249  -14.15%   x 1.16 
 fmt::write_str_macro2       35,527          32,654               -2,873   -8.09%   x 1.09 
 fmt::write_str_macro_debug  571,953         575,973               4,020    0.70%   x 0.99 
 fmt::write_str_ref          9,579           9,459                  -120   -1.25%   x 1.01 
 fmt::write_str_value        9,573           9,572                    -1   -0.01%   x 1.00 
 fmt::write_u128_max         176             173                      -3   -1.70%   x 1.02 
 fmt::write_u128_min         138             134                      -4   -2.90%   x 1.03 
 fmt::write_u64_max          139             136                      -3   -2.16%   x 1.02 
 fmt::write_u64_min          129             135                       6    4.65%   x 0.96 
 fmt::write_vec_macro1       24,401          22,273               -2,128   -8.72%   x 1.10 
 fmt::write_vec_macro2       37,096          35,602               -1,494   -4.03%   x 1.04 
 fmt::write_vec_macro_debug  588,291         589,575               1,284    0.22%   x 1.00 
 fmt::write_vec_ref          9,568           9,732                   164    1.71%   x 0.98 
 fmt::write_vec_value        9,516           9,625                   109    1.15%   x 0.99 

@rust-highfive
Copy link
Collaborator

r? @sfackler

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 19, 2021
@joshtriplett
Copy link
Member

This seems a little surprising. Iterators like args.pieces.iter() should not need to do bounds checks, in general; they know their own bounds.

@camsteffen
Copy link
Contributor Author

This does not remove a bounds check within the loop. Only a length check of pieces when initializing ZipImpl.

@sfackler
Copy link
Member

Arguments can be safely (if unstably) constructed, so this is unsound as-is: https://doc.rust-lang.org/src/core/fmt/mod.rs.html#317-332

@camsteffen
Copy link
Contributor Author

I added assertions. I think that shouldn't be an issue, unless someone is using fmt_internals erroneously. Another local bench run was all the same or better.

@bors
Copy link
Contributor

bors commented Mar 27, 2021

☔ The latest upstream changes (presumably #83580) made this pull request unmergeable. Please resolve the merge conflicts.

@camsteffen camsteffen force-pushed the write-piece-unchecked branch 3 times, most recently from 5eab7b6 to e8ec0bb Compare April 6, 2021 14:42
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@camsteffen
Copy link
Contributor Author

camsteffen commented Apr 6, 2021

Now any unsafe block like format_args!("..", unsafe { .. }) is redundant which is causing test failures. To fix this, I think I would have to change format_args! to expand to, for example,

// format_args!("hello {}", world)
match match (&world,) {
    (arg0,) => [::core::fmt::ArgumentV1::new(
        arg0,
        ::core::fmt::Display::fmt,
    )],
} {
    ref args => unsafe { ::core::fmt::Arguments::new_v1(&["hello "], args) },
};

Thoughts?

@crlf0710
Copy link
Member

@camsteffen Ping from triage, CI is still red here. Maybe you can use any change necessary to get CI green, so we can pass this on to the reviewer?

@crlf0710 crlf0710 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 23, 2021
@rust-log-analyzer

This comment has been minimized.

@camsteffen

This comment has been minimized.

@camsteffen

This comment has been minimized.

@camsteffen

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@camsteffen camsteffen requested a review from sfackler May 1, 2021 02:29
@JohnCSimon
Copy link
Member

camsteffen requested a review from sfackler 16 days ago

@rustbot label: +S-waiting-on-review -S-waiting-on-author

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 24, 2021
@bors bors merged commit de42550 into rust-lang:master Aug 24, 2021
@rustbot rustbot added this to the 1.56.0 milestone Aug 24, 2021
@camsteffen camsteffen deleted the write-piece-unchecked branch August 24, 2021 01:44
@rylev
Copy link
Member

rylev commented Sep 1, 2021

This caused a large regression in instruction counts when testing the compiler. This is a largely a change in std lib code, and we don't really have a good process for dealing with how std lib changes effect performance.

This seems to be primarily affecting debug and check builds, but there doesn't seem to be a query that is clearly to blame here. Given the motivation of this PR is primarily performance, I think it deserves a closer look.

As part of the performance triage process, I'm marking this as a performance regression. If you believe this performance regression to be justifiable or once you have an issue or PR that addresses this regression, please mark this PR with the label perf-regression-triaged.

@rustbot label +perf-regression

@rustbot rustbot added the perf-regression Performance regression. label Sep 1, 2021
flip1995 pushed a commit to flip1995/rust that referenced this pull request Sep 3, 2021
…tolnay

Get piece unchecked in `write`

We already use specialized `zip`, but it seems like we can do a little better by not checking `pieces` length at all.

`Arguments` constructors are now unsafe. So the `format_args!` expansion now includes an `unsafe` block.

<details>
<summary>Local Bench Diff</summary>

```text
 name                        before ns/iter  after ns/iter  diff ns/iter   diff %  speedup
 fmt::write_str_macro1       22,967          19,718               -3,249  -14.15%   x 1.16
 fmt::write_str_macro2       35,527          32,654               -2,873   -8.09%   x 1.09
 fmt::write_str_macro_debug  571,953         575,973               4,020    0.70%   x 0.99
 fmt::write_str_ref          9,579           9,459                  -120   -1.25%   x 1.01
 fmt::write_str_value        9,573           9,572                    -1   -0.01%   x 1.00
 fmt::write_u128_max         176             173                      -3   -1.70%   x 1.02
 fmt::write_u128_min         138             134                      -4   -2.90%   x 1.03
 fmt::write_u64_max          139             136                      -3   -2.16%   x 1.02
 fmt::write_u64_min          129             135                       6    4.65%   x 0.96
 fmt::write_vec_macro1       24,401          22,273               -2,128   -8.72%   x 1.10
 fmt::write_vec_macro2       37,096          35,602               -1,494   -4.03%   x 1.04
 fmt::write_vec_macro_debug  588,291         589,575               1,284    0.22%   x 1.00
 fmt::write_vec_ref          9,568           9,732                   164    1.71%   x 0.98
 fmt::write_vec_value        9,516           9,625                   109    1.15%   x 0.99
```
</details>
@camsteffen
Copy link
Contributor Author

camsteffen commented Sep 3, 2021

The regression seems to be caused by the change in the format_args! expansion which is now more complex in order to have an unsafe block that surrounds Arguments::new_v1 but does not surround the format_args! macro input. I reverted that particular change as an experiment in #88571 and that seemed to fix the perf regression. I don't know how to get a closer look at what is going on with style-servo-debug and would be interested in any pointers.

@Mark-Simulacrum
Copy link
Member

Hm. So #88571 does look to fix the majority of the regression here, but I suppose we cannot land it as-is due to the unsafe nesting causing some problems with spurious lints.

I would lean towards reverting this PR; the small wins in runtime performance do not seem worth it given that fmt isn't optimized for performance anyway (indirect calls, can't get inlined, etc.). The compile time regressions are not insignificant.

@camsteffen
Copy link
Contributor Author

My only reservation with reverting is that the internal fmt API was already unsafe before this PR added the unsafe modifiers.

@Mark-Simulacrum
Copy link
Member

I.. sort of doubt that the get_unchecked there matters that much either, so I'd at least be fine with dropping it too (in favor of safety) as part of a pseudo-revert.

@Mark-Simulacrum
Copy link
Member

Looks like the justification is noted in the commit (so nice!) -- d80f127. I think that makes sense, but we could likely get away with something "not wrong" as a result without invoking the full panic machinery. e.g., get().unwrap_or("") may work OK in some cases.

We could also move the unsafety as an assertion into a ZST that the caller must construct via an unsafe method, which would get compiled out but also allow us to put the unsafe block in a more local place, rather than wrapping the whole function call.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 23, 2021
flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request Sep 28, 2021
@camsteffen
Copy link
Contributor Author

@Mark-Simulacrum I'm thinking the next step here for perf triage is to do a perf run reverting this and the follow-up PRs?

@Mark-Simulacrum
Copy link
Member

I think #89139 definitely helped with some of the regressions caused by this PR. I think a trial revert would still be good, though, to estimate how much of a performance hit this is with the mitigations in place (like ZST-based unsafe).

@camsteffen
Copy link
Contributor Author

The revert perf run looks good! #89521 (comment)

@Mark-Simulacrum
Copy link
Member

Hm, the revert results feel a little unexpected. Unless there's been unrelated performance optimization that mitigates much of the regression seen in this PR, we would expect the revert to look equivalent, right? But instead we see barely any improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.