Change opt-level from 2 back to 3 #67878

Merged
merged 1 commit into from
Jan 31, 2020

Others
Contributor

@Others Others commented Jan 4, 2020

In Cargo.toml, the opt-level for release and bench was overridden to be 2. This was to work around a problem with LLVM 7. However, rust no longer uses LLVM 7, so this is hopefully no longer needed?

I tried a little bit to replicate the original problem, and could not. I think running this through CI is the best way to smoke test this :) Even if things break dramatically, the comment should be updated to reflect that things are still broken with LLVM 9.

I'm just getting started playing with the compiler, so apologies if I've missed an obvious problem here.

fixes #52378

(possibly relevant is the current update to LLVM 10, #67759)

@rust-highfive
Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 4, 2020
@Mark-Simulacrum
Member

I suspect things are still broken; that comment has been around since I can remember. However, we might as well let this inch its way through the queue. @bors rollup=never r+

I also believe we've not really seen appreciable performance gains when doing this, so it may just generally not be worth it. (I am unsure of the compile-time cost for rustc itself, but would not be surprised if it's non-marginal.) I suspect this will fail anyway.

r? @Mark-Simulacrum

@bors
Contributor

bors commented Jan 5, 2020

📌 Commit c7f44b9352f04ec36109d074c5eacf7f6c42b22a has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 5, 2020
@mati865
Contributor

mati865 commented Jan 5, 2020

Previously, an identical change passed CI but made subsequent PRs fail spuriously. That makes this PR quite high risk.

@Mark-Simulacrum
Member

I think enough time has passed that it may be worth trying again. However, one thing I did want to do but forgot - @bors r- try @rust-timer queue

Let's gather perf stats before landing.

@rust-timer
Collaborator

Awaiting bors try build completion

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 5, 2020
@bors
Contributor

bors commented Jan 5, 2020

⌛ Trying commit c7f44b9352f04ec36109d074c5eacf7f6c42b22a with merge c8c22433c5b3e436a1837ce84de8eacc5586de6d...

@bors
Contributor

bors commented Jan 5, 2020

☀️ Try build successful - checks-azure
Build commit: c8c22433c5b3e436a1837ce84de8eacc5586de6d (c8c22433c5b3e436a1837ce84de8eacc5586de6d)

@rust-timer
Collaborator

Queued c8c22433c5b3e436a1837ce84de8eacc5586de6d with parent 7785834, future comparison URL.

@rust-timer
Collaborator

Finished benchmarking try commit c8c22433c5b3e436a1837ce84de8eacc5586de6d, comparison URL.

@bjorn3
Member

bjorn3 commented Jan 5, 2020

const_eval_raw is about 16% slower.

@Others
Contributor Author

Others commented Jan 5, 2020

@bjorn3 I haven't used rust-timer before. Where are you pulling that number from?
I do see that the ctfe tests have some unfortunate high variance

@bjorn3
Member

bjorn3 commented Jan 5, 2020

First click on the benchmark group (ctfe-stress-4-check), then click on the percentage corresponding to a specific benchmark (10.7% for clean). You will then go to a page which shows info for every query.

@Others
Contributor Author

Others commented Jan 5, 2020

@bjorn3 ah, thanks!

Digging into the results, it seems that this change improves incremental speed by a few percent on quite a few of the test cases. This seems like a good initial sign! const_eval_raw is the only query I can find that significantly regressed.

I think this PR probably is worth landing, assuming the const_eval_raw regression can be mitigated. (And we try and ensure it's not going to cause spurious failures in the future.)

@Mark-Simulacrum How do you recommend I proceed here?

@Aaron1011
Member

Aaron1011 commented Jan 5, 2020

I ran perf on rustc 1.42.0-nightly (119307a83 2019-12-31) and the try build for this PR (c8c22433c5b3e436a1837ce84de8eacc5586de6d) against ctfe-stress-4.

The slowdown appears to be coming from the body of const_eval_raw_provider. Here are the relevant snippets of annotated assembly:

Nightly:

  7.37 │      → callq  rustc_mir::interpret::step::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::step
 11.35 │        movzbl 0x20(%rsp),%edx
  6.80 │        movzbl 0x21(%rsp),%ecx
  1.14 │        movzbl 0x28(%rsp),%eax
  1.70 │        movups (%rbx),%xmm0
 11.94 │        movups 0x10(%rbx),%xmm1
  6.81 │        movups 0x20(%rbx),%xmm2
  6.85 │        movups 0x30(%rbx),%xmm3
       │        movaps %xmm0,0xa0(%rsp)
  9.11 │        movaps %xmm1,0xb0(%rsp)
  7.34 │        movaps %xmm2,0xc0(%rsp)
  5.11 │        movaps %xmm3,0xd0(%rsp)
       │        mov    0x3f(%rbx),%rsi
  9.66 │        mov    %rsi,0xdf(%rsp)
  5.15 │        cmp    $0x1,%dl

This PR:

  2.55 │      → callq  rustc_mir::interpret::step::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::step
  2.35 │        movups 0x130(%rsp),%xmm0
  0.10 │        movaps %xmm0,0x60(%rsp)
 67.80 │        movups 0xf0(%rsp),%xmm0
       │        movups 0x100(%rsp),%xmm1
  0.10 │        movups 0x110(%rsp),%xmm2
       │        movups 0x120(%rsp),%xmm3
  3.53 │        movaps %xmm3,0x50(%rsp)
       │        movaps %xmm2,0x40(%rsp)
       │        movaps %xmm1,0x30(%rsp)
  3.22 │        movaps %xmm0,0x20(%rsp)
 12.67 │        cmpb   $0x1,0x20(%rsp)

With this branch, LLVM appears to be generating more SIMD instructions. However, the total percentage for the various mov instructions seems to be about the same. The real difference seems to be the cmp instruction - on Nightly, it's comparing against a register. On this branch, it's performing a load from memory.

I think this corresponds to checking the discriminant of the Result returned by InterpCx::step.

I'm not 100% sure that this analysis is correct. However, it appears that part of this regression may be due to LLVM codegenning InterpCx::step such that the discriminant of the result is stored on the stack, rather than in a register.

EDIT: I didn't realize that %dl is part of %edx. The discriminant isn't actually stored directly in a register - it's loaded via movzbl 0x20(%rsp),%edx
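
For readers following along, here is a minimal sketch of why the discriminant would live on the stack at all. It assumes a hypothetical `BigError` payload and a hypothetical `step` function (neither is the real rustc type or method). A `Result` this large does not fit in a pair of registers, so on common x86-64 ABIs the caller typically reads the returned value, discriminant included, back from memory.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large error payload; NOT the real rustc error type.
#[allow(dead_code)]
struct BigError {
    details: [u64; 8],
}

// Hypothetical step function: Ok(true) means "more work remains".
fn step(counter: &mut u32) -> Result<bool, BigError> {
    if *counter == u32::MAX {
        return Err(BigError { details: [0; 8] });
    }
    *counter += 1;
    Ok(*counter < 3)
}

// Driver loop: every iteration inspects the discriminant of the returned Result.
fn drive() -> u32 {
    let mut counter = 0;
    loop {
        match step(&mut counter) {
            Ok(true) => continue,
            Ok(false) => return counter,
            Err(_) => panic!("unexpected error in sketch"),
        }
    }
}

fn main() {
    // The payload alone is 64 bytes, so the Result is far larger than 16 bytes.
    println!(
        "size_of::<Result<bool, BigError>>() = {}",
        size_of::<Result<bool, BigError>>()
    );
    println!("steps taken: {}", drive());
}
```

The `match` in `drive` is the source-level counterpart of the `cmp` against `1` in the listings above; whether the compiler keeps that byte in a register or re-reads it from the stack is a codegen decision.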

@Others
Contributor Author

Others commented Jan 5, 2020

@Aaron1011

Very interesting! I'm assuming that the expensive call chain is:

const_eval_raw_provider => eval_body_using_ecx => InterpCx::run => InterpCx::step

My reading of the assembly is that on nightly LLVM is loading some part of the return value at the start (from 0x20(%rsp)), then doing stuff, then branching on that part of the return value:

  7.37 │      → callq  rustc_mir::interpret::step::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::step
 11.35 │        movzbl 0x20(%rsp),%edx  # ret value loaded from 0x20(%rsp)
[SNIP: instructions don't touch %edx]
  5.15 │        cmp    $0x1,%dl         # ret value used for comparison

(Note: 77.81 on movs, 5.15 on cmps)

After this patch, LLVM doesn't bother loading that part of the ret value till the end:

  2.55 │      → callq  rustc_mir::interpret::step::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::step
  2.35 │        movups 0x130(%rsp),%xmm0
[SNIP: instructions don't touch 0x20(%rsp)]
  3.22 │        movaps %xmm0,0x20(%rsp)   # ret value loaded from 0x20(%rsp)
 12.67 │        cmpb   $0x1,0x20(%rsp)    # ret value loaded _AGAIN_ from 0x20(%rsp)

(Note: 77.1 on movs, 12.67 on cmps)

I wonder if this double load is hurting pipelining or instruction parallelism in some subtle way. This just seems like unlucky codegen from LLVM.

Perhaps one idea is to mark InterpCx::step as inline or inline(always). step is only ever called from the tiny function run, but the direct call seems to indicate that it's not being inlined (perhaps because of the length of the implementation of step). That might churn the IR enough to give us better performing assembly. This would be an easy thing to try, and isn't even particularly hacky -- it seems step might want to be inline anyway. What do you think of that idea?
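
To make the proposal concrete, here is a minimal sketch of the pattern, using a hypothetical miniature interpreter. The `Interp` and `InterpError` types and their fields are invented for illustration and are not the rustc implementation; only the shape (a tiny `run` loop whose sole callee `step` is marked `#[inline(always)]`) mirrors the idea above.

```rust
// Hypothetical miniature interpreter; NOT rustc's InterpCx.
struct Interp {
    pc: usize,
    program: Vec<u64>,
    acc: u64,
}

#[derive(Debug)]
struct InterpError(String);

impl Interp {
    /// Executes one instruction. Returns Ok(true) if an instruction was
    /// executed and Ok(false) once the program is exhausted.
    /// `#[inline(always)]` asks the compiler to inline this body into its
    /// caller regardless of its size.
    #[inline(always)]
    fn step(&mut self) -> Result<bool, InterpError> {
        let pc = self.pc;
        match self.program.get(pc) {
            None => Ok(false),
            Some(&value) => {
                self.acc = self
                    .acc
                    .checked_add(value)
                    .ok_or_else(|| InterpError(format!("overflow at pc {}", pc)))?;
                self.pc += 1;
                Ok(true)
            }
        }
    }

    /// Tiny driver loop; the only caller of `step`.
    fn run(&mut self) -> Result<(), InterpError> {
        while self.step()? {}
        Ok(())
    }
}

fn main() {
    let mut interp = Interp { pc: 0, program: vec![1, 2, 3], acc: 0 };
    interp.run().expect("program should not overflow");
    println!("acc = {}", interp.acc);
}
```

With the callee inlined, there is no call boundary across which the `Result` has to be materialized, which gives the optimizer more freedom in how it checks the discriminant. Whether that helps in practice has to be measured.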

@Aaron1011
Member

> After this patch, LLVM doesn't bother loading that part of the ret value till the end:

I'm not sure that that's correct. I think LLVM is loading the return value first on both nightly and this branch. However, due to opt-level=3, it ended up using additional SIMD instructions on this branch. Note how the mov instruction totals are almost identical.

Marking step as inline seems like it could be worth a shot. However, I think it would be a good idea to try to determine why LLVM is no longer storing (what I think is) the discriminant in a register. It might just be a deliberate (but unfortunate) decision by LLVM, but it seems surprising to me that a higher optimization level should cause LLVM to avoid using a register.

@nagisa
Member

nagisa commented Jan 5, 2020

I’m pretty confident there’s no real point in attempting to operate on a register here. The memory location in question ought to be in the L1 cache – it has been just written to the memory address in question after all.

@Others
Contributor Author

Others commented Jan 6, 2020

Here’s my hypothesis:

Register is still faster than cache. The original code does the load from 0x20(%rsp) to %edx first. There isn’t a store to that address, so that can be totally pipelined with the vectorized movs. In the new code, the load from 0x20(%rsp) is only at the end — it cannot be pipelined properly.

@Others
Contributor Author

Others commented Jan 6, 2020

Okay, I added an #[inline(always)]. Comparing naively versus nightly on my local machine, this seems to have fixed the regression. (Fingers crossed.)

@Mark-Simulacrum (or someone else with bors privileges), can we get another try + rust-timer run?

@nikic
Contributor

nikic commented Jan 14, 2020

Does the opt-level here only affect the compiler itself or also libcore/libstd builds? If the latter, then that could result in a program size increase.

@Aaron1011
Member

That's a good point - I think it affects every crate. That's much less concerning - hopefully the size impact isn't too high.

@Mark-Simulacrum
Member

This is an increase of 2,299 bytes which feels pretty significant to me, but I don't have a lot of context here. Certainly @jamesmunns has indicated to me I believe that this is pretty significant for embedded.

Could you try to investigate where the size increase is coming from? That is, is it spread out across all of libcore/std or a few functions getting significantly larger?

@pnkfelix
Member

pnkfelix commented Jan 16, 2020

> This is an increase of 2,299 bytes which feels pretty significant to me, but I don't have a lot of context here. Certainly @jamesmunns has indicated to me I believe that this is pretty significant for embedded.
>
> Could you try to investigate where the size increase is coming from? That is, is it spread out across all of libcore/std or a few functions getting significantly larger?

(also, we might want to consider the possibility that the failures due to lack of disk and OOMs in the earlier crater run are somehow connected to the code size increase, no?)

@Mark-Simulacrum
Member

I suspect no -- historically we've seen sporadic OOM and disk-full errors on entirely clean runs too.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed I-nominated S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 19, 2020
@Mark-Simulacrum
Member

I would like to see us track down the size diff in that particular test before moving forward. twiggy or a similar tool run on the output from master and this branch should be helpful.

@JohnCSimon
Member

Ping from triage: @Others this PR has sat idle for a week, can you please post your status?

@Others
Contributor Author

Others commented Jan 26, 2020

Sorry, have had a couple deadlines outside of this work. Will get back to this ASAP

@Others
Contributor Author

Others commented Jan 30, 2020

After investigating, I believe the difference is almost entirely in the size of the dlmalloc::dlmalloc::Dlmalloc::malloc function. I'm not sure why, but I can confirm that this is the only thing that's getting massively blown up.

Twiggy output pre this PR:

 Shallow Bytes │ Shallow % │ Item                                                                                                                                                                           
───────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────                                                                                                 
          3518 ┊    17.26% ┊ foo                                                                                                                                                                            
          3436 ┊    16.86% ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::h5191837f74aa063b                                                                                                                        
          2786 ┊    13.67% ┊ "function names" subsection                                                                                                                                                    
          1171 ┊     5.75% ┊ core::fmt::write::h0ce3b15765a50a38                                                                                                                                            
          1091 ┊     5.35% ┊ <&T as core::fmt::Display>::fmt::h8a37c3d9d7b52c10                                                                                                                             
          1017 ┊     4.99% ┊ dlmalloc::dlmalloc::Dlmalloc::free::he974a440dac0e703                                                                                                                          
           835 ┊     4.10% ┊ data[0]                                                                                                                                                                        
           791 ┊     3.88% ┊ core::fmt::Formatter::pad_integral::hda797e1e16145143                                                                                                                          
           785 ┊     3.85% ┊ __rdl_realloc                                                                                                                                                                  
           706 ┊     3.46% ┊ dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h2ab4f6a050146e3d                                                                                                                 
           481 ┊     2.36% ┊ data[1]                                                                                                                                                                        
           396 ┊     1.94% ┊ __rdl_alloc                                                                                                                                                                    
           365 ┊     1.79% ┊ <&mut W as core::fmt::Write>::write_char::he9f72744d97cef6e                                                                                                                    
           364 ┊     1.79% ┊ core::fmt::num::imp::fmt_u64::h6b8bd0148369b5f2                                                                                                                                
           332 ┊     1.63% ┊ dlmalloc::dlmalloc::Dlmalloc::insert_large_chunk::h9c2c0eddd01a2805                                                                                                            
           322 ┊     1.58% ┊ dlmalloc::dlmalloc::Dlmalloc::unlink_large_chunk::h971e5e4ddf7fb69e                                                                                                            
           205 ┊     1.01% ┊ alloc::raw_vec::RawVec<T,A>::reserve::h8b68b4c50816a33e                                                                                                                        
           162 ┊     0.79% ┊ core::result::unwrap_failed::hf27cc13e099ddfcf                                                                                                                                 
           132 ┊     0.65% ┊ core::panicking::panic_bounds_check::h281ad44a05ec4aee                                                                                                                         
           123 ┊     0.60% ┊ std::panicking::rust_panic_with_hook::h0cb0cfbfcae4d325                                                                                                                        
           118 ┊     0.58% ┊ <&mut W as core::fmt::Write>::write_fmt::h9257c36ec995818d                                                                                                                     
<SNIP>

After applying this PR:

ubuntu@ip-172-31-23-149:~/rust$ twiggy top /home/ubuntu/rust/build/x86_64-unknown-linux-gnu/test/run-make/wasm-stringify-ints-small/wasm-stringify-ints-small/foo.wasm                                      
 Shallow Bytes │ Shallow % │ Item                                                                                                                                                                           
───────────────┼───────────┼───────────────────────────────────────────────────────────────────────────────
          6256 ┊    25.41% ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::h41b2f4cb262befbb
          3518 ┊    14.29% ┊ foo
          2642 ┊    10.73% ┊ "function names" subsection
          1974 ┊     8.02% ┊ dlmalloc::dlmalloc::Dlmalloc::free::h74082b2fc321a532
          1660 ┊     6.74% ┊ dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h194d309ddaa42708
          1158 ┊     4.70% ┊ core::fmt::write::h7a56f8be98585755
          1105 ┊     4.49% ┊ <&T as core::fmt::Display>::fmt::ha36b0b1ae11f94a9
          1094 ┊     4.44% ┊ __rdl_realloc
           831 ┊     3.37% ┊ data[0]
           791 ┊     3.21% ┊ core::fmt::Formatter::pad_integral::h2884a891601c317e
           481 ┊     1.95% ┊ data[1]
           402 ┊     1.63% ┊ __rdl_alloc
           365 ┊     1.48% ┊ <&mut W as core::fmt::Write>::write_char::h7c7a8ec042aeecbd
           364 ┊     1.48% ┊ core::fmt::num::imp::fmt_u64::h20314ec5886669d6
           205 ┊     0.83% ┊ alloc::raw_vec::RawVec<T,A>::reserve::h6b68995b555c2ec7
           162 ┊     0.66% ┊ core::result::unwrap_failed::haa7e7a77a7937368
           132 ┊     0.54% ┊ core::panicking::panic_bounds_check::ha6acab54c4f53c2d
           123 ┊     0.50% ┊ std::panicking::rust_panic_with_hook::h9c90ba24f7d72c9c
           118 ┊     0.48% ┊ <&mut W as core::fmt::Write>::write_fmt::h3f4eaeca77ac8b69
<SNIP>

Twiggy diff output:

 Delta Bytes │ Item
─────────────┼────────────────────────────────────────────────────────────────────
       +6256 ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::h41b2f4cb262befbb
       -3436 ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::h5191837f74aa063b
       +1974 ┊ dlmalloc::dlmalloc::Dlmalloc::free::h74082b2fc321a532
       +1660 ┊ dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h194d309ddaa42708
       -1171 ┊ core::fmt::write::h0ce3b15765a50a38
       +1158 ┊ core::fmt::write::h7a56f8be98585755
       +1105 ┊ <&T as core::fmt::Display>::fmt::ha36b0b1ae11f94a9
       -1091 ┊ <&T as core::fmt::Display>::fmt::h8a37c3d9d7b52c10
       -1017 ┊ dlmalloc::dlmalloc::Dlmalloc::free::he974a440dac0e703
        +791 ┊ core::fmt::Formatter::pad_integral::h2884a891601c317e
        -791 ┊ core::fmt::Formatter::pad_integral::hda797e1e16145143
        -706 ┊ dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h2ab4f6a050146e3d
        +365 ┊ <&mut W as core::fmt::Write>::write_char::h7c7a8ec042aeecbd
        -365 ┊ <&mut W as core::fmt::Write>::write_char::he9f72744d97cef6e
        +364 ┊ core::fmt::num::imp::fmt_u64::h20314ec5886669d6
        -364 ┊ core::fmt::num::imp::fmt_u64::h6b8bd0148369b5f2
        -332 ┊ dlmalloc::dlmalloc::Dlmalloc::insert_large_chunk::h9c2c0eddd01a2805
        -322 ┊ dlmalloc::dlmalloc::Dlmalloc::unlink_large_chunk::h971e5e4ddf7fb69e
        +309 ┊ __rdl_realloc
        +205 ┊ alloc::raw_vec::RawVec<T,A>::reserve::h6b68995b555c2ec7
        -347 ┊ ... and 61 more.
       +4245 ┊ Σ [81 Total Rows]

I kinda doubt this should be a blocker for the PR? Would love some more review feedback.
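
As a rough cross-check of the diff above, the before and after rows for each function can be paired by stripping the trailing `::h<hash>` suffix and summing the deltas. The sketch below does that for the dlmalloc rows only; the helper names `strip_hash` and `net_deltas` are made up for this illustration, and the byte counts are copied from the twiggy diff output.

```rust
use std::collections::BTreeMap;

// dlmalloc rows copied from the twiggy diff output above.
const DLMALLOC_ROWS: &[(i64, &str)] = &[
    (6256, "dlmalloc::dlmalloc::Dlmalloc::malloc::h41b2f4cb262befbb"),
    (-3436, "dlmalloc::dlmalloc::Dlmalloc::malloc::h5191837f74aa063b"),
    (1974, "dlmalloc::dlmalloc::Dlmalloc::free::h74082b2fc321a532"),
    (-1017, "dlmalloc::dlmalloc::Dlmalloc::free::he974a440dac0e703"),
    (1660, "dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h194d309ddaa42708"),
    (-706, "dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h2ab4f6a050146e3d"),
    (-332, "dlmalloc::dlmalloc::Dlmalloc::insert_large_chunk::h9c2c0eddd01a2805"),
    (-322, "dlmalloc::dlmalloc::Dlmalloc::unlink_large_chunk::h971e5e4ddf7fb69e"),
];

/// Removes a trailing `::h` + 16 hex digits, if present (hypothetical helper).
fn strip_hash(symbol: &str) -> &str {
    if let Some(idx) = symbol.rfind("::h") {
        let suffix = &symbol[idx + 3..];
        if suffix.len() == 16 && suffix.chars().all(|c| c.is_ascii_hexdigit()) {
            return &symbol[..idx];
        }
    }
    symbol
}

/// Sums byte deltas per function after removing hash suffixes (hypothetical helper).
fn net_deltas<'a>(rows: &[(i64, &'a str)]) -> BTreeMap<&'a str, i64> {
    let mut totals = BTreeMap::new();
    for &(delta, symbol) in rows {
        *totals.entry(strip_hash(symbol)).or_insert(0) += delta;
    }
    totals
}

fn main() {
    let totals = net_deltas(DLMALLOC_ROWS);
    for (name, delta) in &totals {
        println!("{:+6} {}", delta, name);
    }
    let dlmalloc_net: i64 = totals.values().sum();
    println!("{:+6} net change across dlmalloc rows", dlmalloc_net);
}
```

By this arithmetic the listed dlmalloc rows net out to +4077 bytes of the +4245 byte total, with `malloc` alone contributing +2820. The two functions that only appear with a negative delta are no longer separate items after the change; one plausible explanation (an assumption, not something the output confirms) is that they were inlined into their callers.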

@Mark-Simulacrum
Member

Interesting! Looks like the dlmalloc functions are somewhat big. Given that it's just the default allocator, and not actually mandatory (i.e. most wasm executables that care about size will likely switch to some alternative), I imagine that we can land this.

Looks like there's still a comment left to be added about the #[inline(always)] we've added and once that's in I believe we can merge.

In Cargo.toml, the opt-level for `release` and `bench` was
overridden to be 2. This was to work around a problem with LLVM
7. However, rust no longer uses LLVM 7, so this is no longer
needed.

This creates a small compile time regression in MIR constant eval,
so I've added a #[inline(always)] on the `step` function used in
const eval

Also creates a binary size increase in wasm-stringify-ints-small,
so I've bumped the limit there.
@Others
Contributor Author

Others commented Jan 30, 2020

@Mark-Simulacrum I've added the comment :)

@Mark-Simulacrum
Member

@bors r+ rollup=never

@bors
Contributor

bors commented Jan 30, 2020

📌 Commit 0d52c56 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 30, 2020
bors added a commit that referenced this pull request Jan 31, 2020
@bors
Contributor

bors commented Jan 31, 2020

⌛ Testing commit 0d52c56 with merge 138c50f...

@bors
Contributor

bors commented Jan 31, 2020

☀️ Test successful - checks-azure
Approved by: Mark-Simulacrum
Pushing 138c50f to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 31, 2020
@bors bors merged commit 0d52c56 into rust-lang:master Jan 31, 2020
@rust-highfive
Collaborator

📣 Toolstate changed by #67878!

Tested on commit 138c50f.
Direct link to PR: #67878

💔 rustc-guide on linux: test-pass → test-fail (cc @JohnTitor @amanjeev @spastorino @mark-i-m, @rust-lang/infra).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Jan 31, 2020
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request May 16, 2020
Pkgsrc changes:
 * Bump rust bootstrap version to 1.42.0, except for Darwin/i686 where the
   bootstrap is not (yet?) available.

Upstream changes:

Version 1.43.0 (2020-04-23)
==========================

Language
--------
- [Fixed using binary operations with `&{number}` (e.g. `&1.0`) not having
  the type inferred correctly.][68129]
- [Attributes such as `#[cfg()]` can now be used on `if` expressions.][69201]

**Syntax only changes**
- [Allow `type Foo: Ord` syntactically.][69361]
- [Fuse associated and extern items up to defaultness.][69194]
- [Syntactically allow `self` in all `fn` contexts.][68764]
- [Merge `fn` syntax + cleanup item parsing.][68728]
- [`item` macro fragments can be interpolated into `trait`s, `impl`s,
  and `extern` blocks.][69366]
  For example, you may now write:
  ```rust
  macro_rules! mac_trait {
      ($i:item) => {
          trait T { $i }
      }
  }
  mac_trait! {
      fn foo() {}
  }
  ```
These forms are still rejected *semantically*, so using them will likely
produce an error, but they can now be seen and parsed by macros and
conditional compilation.


Compiler
--------
- [You can now pass multiple lint flags to rustc to override the previous
  flags.][67885] For example, `rustc -D unused -A unused-variables` denies
  everything in the `unused` lint group except `unused-variables`, which
  is explicitly allowed. However, passing `rustc -A unused-variables -D unused` denies
  everything in the `unused` lint group **including** `unused-variables`, since
  the allow flag is specified before the deny flag (and is therefore overridden).
- [rustc will now prefer your system MinGW libraries over its bundled libraries
  if they are available on `windows-gnu`.][67429]
- [rustc now buffers errors/warnings printed in JSON.][69227]

Libraries
---------
- [`Arc<[T; N]>`, `Box<[T; N]>`, and `Rc<[T; N]>` now implement
  `TryFrom<Arc<[T]>>`, `TryFrom<Box<[T]>>`, and `TryFrom<Rc<[T]>>`
  respectively.][69538] **Note:** These conversions are only available when `N`
  is `0..=32`.
- [You can now use associated constants on floats and integers directly, rather
  than having to import the module.][68952] For example, you can now write
  `u32::MAX` or `f32::NAN` with no imports.
- [`u8::is_ascii` is now `const`.][68984]
- [`String` now implements `AsMut<str>`.][68742]
- [Added the `primitive` module to `std` and `core`.][67637] This module
  reexports Rust's primitive types. This is mainly useful in macros
  where you want to avoid these types being shadowed.
- [Relaxed some of the trait bounds on `HashMap` and `HashSet`.][67642]
- [`string::FromUtf8Error` now implements `Clone + Eq`.][68738]
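The library changes above can be sketched in a few lines. The helper names `first_three` and `shout` are hypothetical and exist only for this illustration; the array length 3 is an arbitrary choice within the `0..=32` range mentioned above.

```rust
use std::convert::TryFrom;

/// Convert a boxed slice into a boxed array of exactly three bytes.
/// Returns `None` when the slice length is not 3.
fn first_three(values: Box<[u8]>) -> Option<Box<[u8; 3]>> {
    // Uses `TryFrom<Box<[T]>> for Box<[T; N]>`.
    Box::<[u8; 3]>::try_from(values).ok()
}

/// Upper-case a string in place through `AsMut<str>`.
fn shout(text: &mut String) {
    let s: &mut str = text.as_mut();
    s.make_ascii_uppercase();
}

fn main() {
    let boxed = vec![1u8, 2, 3].into_boxed_slice();
    println!("{:?}", first_three(boxed));

    let mut greeting = String::from("hello");
    shout(&mut greeting);
    println!("{}", greeting);

    // Associated constants are reachable directly on the primitive types.
    println!("{} {}", u32::MAX, f32::NAN.is_nan());

    // `std::primitive` names the built-in type even if `u8` were shadowed.
    let byte: std::primitive::u8 = 255;
    println!("{}", byte);
}
```

On failure, the `TryFrom` conversion hands the original boxed slice back as the error value; the sketch discards it with `.ok()` for brevity.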

Stabilized APIs
---------------
- [`Once::is_completed`]
- [`f32::LOG10_2`]
- [`f32::LOG2_10`]
- [`f64::LOG10_2`]
- [`f64::LOG2_10`]
- [`iter::once_with`]
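A short sketch exercising the newly stabilized APIs. The function names and the static `INIT` are hypothetical. Note that the linked documentation places the logarithm constants in the `consts` submodule, so the sketch imports them from `std::f64::consts`.

```rust
use std::f64::consts::{LOG10_2, LOG2_10};
use std::iter;
use std::sync::Once;

static INIT: Once = Once::new();

/// Run one-time setup and report whether it has completed.
fn run_setup() -> bool {
    INIT.call_once(|| {
        // One-time initialization would go here.
    });
    INIT.is_completed()
}

/// Build a single-element iterator whose value is computed lazily.
fn lazy_single() -> Vec<u32> {
    iter::once_with(|| 6 * 7).collect()
}

/// log10(2) * log2(10) should be 1 up to rounding error.
fn log_identity() -> f64 {
    LOG10_2 * LOG2_10
}

fn main() {
    println!("setup done: {}", run_setup());
    println!("{:?}", lazy_single());
    println!("{}", log_identity());
}
```

`iter::once_with` only calls its closure when the iterator is advanced, which is the difference from `iter::once`.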

Cargo
-----
- [You can now set config `[profile]`s in your `.cargo/config`, or through
  your environment.][cargo/7823]
- [Cargo will now set `CARGO_BIN_EXE_<name>` pointing to a binary's
  executable path when running integration tests or benchmarks.][cargo/7697]
  `<name>` is the name of your binary as-is. For example, if you wanted the
  executable path for a binary named `my-program`, you would use
  `env!("CARGO_BIN_EXE_my-program")`.

Misc
----
- [Certain checks in the `const_err` lint were deemed unrelated to const
  evaluation][69185], and have been moved to the `unconditional_panic` and
  `arithmetic_overflow` lints.

Compatibility Notes
-------------------

- [Having trailing syntax in the `assert!` macro is now a hard error.][69548]
  This has been a warning since 1.36.0.
- [Fixed `Self` not having the correctly inferred type.][69340] This incorrectly
  led to some instances being accepted, and now correctly emits a hard error.

[69340]: rust-lang/rust#69340

Internal Only
-------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of `rustc` and
related tools.

- [All components are now built with `opt-level=3` instead of `2`.][67878]
- [Improved how rustc generates drop code.][67332]
- [Improved performance from `#[inline]`-ing certain hot functions.][69256]
- [traits: preallocate 2 Vecs of known initial size][69022]
- [Avoid exponential behaviour when relating types][68772]
- [Skip `Drop` terminators for enum variants without drop glue][68943]
- [Improve performance of coherence checks][68966]
- [Deduplicate types in the generator witness][68672]
- [Invert control in struct_lint_level.][68725]

[67332]: rust-lang/rust#67332
[67429]: rust-lang/rust#67429
[67637]: rust-lang/rust#67637
[67642]: rust-lang/rust#67642
[67878]: rust-lang/rust#67878
[67885]: rust-lang/rust#67885
[68129]: rust-lang/rust#68129
[68672]: rust-lang/rust#68672
[68725]: rust-lang/rust#68725
[68728]: rust-lang/rust#68728
[68738]: rust-lang/rust#68738
[68742]: rust-lang/rust#68742
[68764]: rust-lang/rust#68764
[68772]: rust-lang/rust#68772
[68943]: rust-lang/rust#68943
[68952]: rust-lang/rust#68952
[68966]: rust-lang/rust#68966
[68984]: rust-lang/rust#68984
[69022]: rust-lang/rust#69022
[69185]: rust-lang/rust#69185
[69194]: rust-lang/rust#69194
[69201]: rust-lang/rust#69201
[69227]: rust-lang/rust#69227
[69548]: rust-lang/rust#69548
[69256]: rust-lang/rust#69256
[69361]: rust-lang/rust#69361
[69366]: rust-lang/rust#69366
[69538]: rust-lang/rust#69538
[cargo/7823]: rust-lang/cargo#7823
[cargo/7697]: rust-lang/cargo#7697
[`Once::is_completed`]: https://doc.rust-lang.org/std/sync/struct.Once.html#method.is_completed
[`f32::LOG10_2`]: https://doc.rust-lang.org/std/f32/consts/constant.LOG10_2.html
[`f32::LOG2_10`]: https://doc.rust-lang.org/std/f32/consts/constant.LOG2_10.html
[`f64::LOG10_2`]: https://doc.rust-lang.org/std/f64/consts/constant.LOG10_2.html
[`f64::LOG2_10`]: https://doc.rust-lang.org/std/f64/consts/constant.LOG2_10.html
[`iter::once_with`]: https://doc.rust-lang.org/std/iter/fn.once_with.html
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

Setting opt-level = 3 will cause stage1-rustc to segfault on Linux