Rollup of 7 pull requests #88881

Manishearth · 2021-09-12T10:45:00Z

Successful merges:

#88336 (Detect stricter constraints on gats where clauses in impls vs trait)
#88677 (rustc: Remove local variable IDs from `Export`s)
#88699 (Remove extra unshallow from cherry-pick checker)
#88709 (generic_const_exprs: use thir for abstract consts instead of mir)
#88711 (Rework DepthFirstSearch API)
#88810 (rustdoc: Cleanup `clean` part 1)
#88813 (explicitly link to external `ena` docs)

Failed merges:

r? @ghost
@rustbot modify labels: rollup

backport-of: none



The order of the `where` bounds on auto trait impls changed because rustdoc currently sorts auto trait `where` bounds based on the `Debug` output for the bound. Now that the bounds have an actual `Res`, they are being unintentionally sorted by their `DefId` rather than their path. So, I had to update a test for the change in ordering of the rendered bounds.


…estebank
Detect stricter constraints on gats where clauses in impls vs trait
I might try to see if I can do a bit more to improve these diagnostics, but any initial feedback is appreciated. I can also do any additional work in a followup PR.
r? `@estebank`

rustc: Remove local variable IDs from `Export`s
Local variables can never be exported.

…, r=pietroalbini
Remove extra unshallow from cherry-pick checker
This is already done by https://github.com/rust-lang/rust/blob/13db8440bbbe42870bc828d4ec3e965b38670277/src/ci/init_repo.sh#L32-L36 on the beta channel, and git throws an error if you attempt to unshallow an already non-shallow repository.
r? `@pietroalbini`

generic_const_exprs: use thir for abstract consts instead of mir
Changes `AbstractConst` building to use `thir` instead of `mir` so that there's less chance of consts unifying when they shouldn't because lowering to mir dropped information (see the `abstract-consts-as-cast-5.rs` test).
r? `@lcnr`

…h726
Rework DepthFirstSearch API
This expands the API to be more flexible, allowing for more visitation patterns on graphs. This will be useful to avoid extra datasets (and allocations) in cases where the expanded DFS API is sufficient. This also fixes a bug with the previous DFS constructor, which left the start node not marked as visited (even though it was immediately returned).
Commit written by `@nikomatsakis` originally, cherry picked from several commits in work on never type stabilization, but stands alone.
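To make the start-node bug described above concrete, here is a minimal sketch of a depth-first iterator that marks a start node as visited at the moment it is pushed. It is not the code from #88711: the `Graph` and `Dfs` types and the `push_start_node` / `with_start_node` method names are assumptions made for illustration only. On a graph with an edge back to the start node, marking at push time is what stops the start node from being yielded a second time.

```rust
/// Hypothetical adjacency-list graph, used only for this illustration.
struct Graph {
    successors: Vec<Vec<usize>>,
}

/// Illustrative DFS iterator; not the rustc_data_structures type.
struct Dfs<'g> {
    graph: &'g Graph,
    stack: Vec<usize>,
    visited: Vec<bool>,
}

impl<'g> Dfs<'g> {
    /// Creates a search with no start nodes yet.
    fn new(graph: &'g Graph) -> Self {
        let visited = vec![false; graph.successors.len()];
        Dfs { graph, stack: Vec::new(), visited }
    }

    /// Pushes a start node and marks it visited immediately.
    /// Returns false if the node had already been visited.
    fn push_start_node(&mut self, node: usize) -> bool {
        if self.visited[node] {
            return false;
        }
        self.visited[node] = true;
        self.stack.push(node);
        true
    }

    /// Builder-style convenience wrapper around `push_start_node`.
    fn with_start_node(mut self, node: usize) -> Self {
        self.push_start_node(node);
        self
    }
}

impl Iterator for Dfs<'_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        let node = self.stack.pop()?;
        let graph = self.graph;
        for &succ in &graph.successors[node] {
            // Nodes are marked when pushed, so each is yielded at most once.
            if !self.visited[succ] {
                self.visited[succ] = true;
                self.stack.push(succ);
            }
        }
        Some(node)
    }
}

fn main() {
    // 0 -> 1, 1 -> 0 (back edge to the start node), 1 -> 2
    let graph = Graph { successors: vec![vec![1], vec![0, 2], vec![]] };
    let order: Vec<usize> = Dfs::new(&graph).with_start_node(0).collect();
    println!("{:?}", order);
}
```

Separating construction from pushing start nodes also allows several start nodes to share one visited set, which is one way to read the "more visitation patterns" mentioned in the description.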

rustdoc: Cleanup `clean` part 1
Split out from rust-lang#88379. These commits are completely independent of each other, and each is a fairly small change (the last few are new commits; they are not from rust-lang#88379):
- Remove unnecessary `Cache.*_did` fields
- rustdoc: Get symbol for `TyParam` directly
- Create a valid `Res` in `external_path()`
- Remove unused `hir_id` parameter from `resolve_type`
- Fix redundant arguments in `external_path()`
- Remove unnecessary `is_trait` argument
- rustdoc: Cleanup a pattern match in `external_generic_args()`
r? `@jyn514`

explicitly link to external `ena` docs
we currently do not link to the docs of `ena`: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtInner.html#method.const_unification_table

Manishearth · 2021-09-12T10:45:10Z

@bors r+ p=1

bors · 2021-09-12T10:45:11Z

📌 Commit 146aee6 has been approved by Manishearth

bors · 2021-09-12T13:29:59Z

⌛ Testing commit 146aee6 with merge c7dbe7a...

bors · 2021-09-12T16:23:36Z

☀️ Test successful - checks-actions
Approved by: Manishearth
Pushing c7dbe7a to master...

rust-timer · 2021-09-12T18:04:24Z

Finished benchmarking commit (c7dbe7a): comparison url.

Summary: This change led to large relevant regressions 😿 in compiler performance.

Large regression in instruction counts (up to 2.1% on full builds of inflate)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

Manishearth · 2021-09-12T18:51:14Z

It feels like #88709 is the only likely culprit but it's also feature gated. Unsure what's going on here

Manishearth · 2021-09-12T22:45:20Z

It could also be #88677 or #88711, but everyone seems to find all of these candidates unlikely.

Mark-Simulacrum · 2021-10-03T18:26:31Z

Focusing in on the top regression, inflate check full, cachegrind shows the regression centering on process_obligations:

$ ./target/release/collector diff_local cachegrind +9ef27bf7dc50a8b51435579b4f2e86f7ee3f7a94 +c7dbe7a830100c70d59994fd940bf75bb6e39b39 --include inflate --builds Check --runs Full
$ less results/cgann-9ef27bf7dc50a8b51435579b4f2e86f7ee3f7a94-c7dbe7a830100c70d59994fd940bf75bb6e39b39-inflate-Check-Full
72,907,279  ???:rustc_data_structures::obligation_forest::ObligationForest<O>::process_obligations
23,931,248  ???:ena::unify::UnificationTable<S>::uninlined_get_root_key

From there, we want to check if the function is called more often or each call became more expensive. For this, callgrind is useful. Rerun the diff_local with callgrind:

$ ./target/release/collector diff_local callgrind +9ef27bf7dc50a8b51435579b4f2e86f7ee3f7a94 +c7dbe7a830100c70d59994fd940bf75bb6e39b39 --include inflate --builds Check --runs Full

Then, the default view doesn't show call counts, but you can pass --tree=caller to callgrind_annotate to get this:

$ callgrind_annotate --tree=caller results/clgout-9ef27bf7dc50a8b51435579b4f2e86f7ee3f7a94-inflate-Check-Full
2,960,080,594 (64.18%)  < ???:_ZN21rustc_trait_selection6traits7fulfill18FulfillmentContext6select17h80d1450895e486aeE.llvm.13260143977094445196 (34,057x)
2,220,443,052 (48.14%)  *  ???:rustc_data_structures::obligation_forest::ObligationForest<O>::process_obligations

  404,736,181 ( 8.78%)  < ???:rustc_data_structures::obligation_forest::ObligationForest<O>::process_obligations (8,826x)
  201,820,150 ( 4.38%)  *  ???:rustc_data_structures::obligation_forest::ObligationForest<O>::compress

$ callgrind_annotate --tree=caller results/clgout-c7dbe7a830100c70d59994fd940bf75bb6e39b39-inflate-Check-Full | less
3,056,791,553 (64.92%)  < ???:_ZN21rustc_trait_selection6traits7fulfill18FulfillmentContext6select17h80d1450895e486aeE.llvm.9518623118464054305 (34,057x)
2,293,350,331 (48.71%)  *  ???:rustc_data_structures::obligation_forest::ObligationForest<O>::process_obligations

  404,659,331 ( 8.59%)  < ???:rustc_data_structures::obligation_forest::ObligationForest<O>::process_obligations (8,826x)
  201,740,042 ( 4.28%)  *  ???:rustc_data_structures::obligation_forest::ObligationForest<O>::compress

We look at the call counts in (...) after the callers (34,057x and 8,826x). In this case, they're the same in both runs, so the function isn't called more often; each call got a little more expensive. With thousands of calls, even a slight difference in instruction counts per call can blow up to a big change. No direct changes were made to this function in the source code though.

There is an average of 2,840 more instructions per call (inclusive of functions called by `process_obligations`), but it's not really clear what the cause is. A cursory inspection suggests that the new code has a larger stack allocation (up by 32 bytes), so presumably there are more loads/spills... but the reason for the larger stack is unclear. It seems plausible that this is due to shifts in optimization choices made through PGO that in this case were "bad". Inflate is one of the benchmarks run in CI for PGO, but it's still possible for that to produce negative results. Reading the assembly to figure out what is causing the loads and spills without debuginfo doesn't seem worth investing in right now (and reproducing the build with debuginfo is probably hard).
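For reference, the per-call figure follows from the inclusive counts and the call count in the callgrind output above. The sketch below only redoes that arithmetic; the function name `extra_instructions_per_call` is made up for this example and is not part of the collector or of callgrind.

```rust
/// Average extra inclusive instructions per call between two profiles.
/// Returns None when the call counts differ (a per-call comparison would
/// be misleading) or when there are no calls at all.
fn extra_instructions_per_call(
    before: u64,
    after: u64,
    calls_before: u64,
    calls_after: u64,
) -> Option<f64> {
    if calls_before != calls_after || calls_before == 0 {
        return None;
    }
    Some((after as f64 - before as f64) / calls_before as f64)
}

fn main() {
    // Inclusive instruction counts attributed to the caller line of
    // process_obligations in the two callgrind_annotate outputs.
    let before = 2_960_080_594_u64; // 9ef27bf
    let after = 3_056_791_553_u64; // c7dbe7a
    let calls = 34_057_u64; // identical in both runs

    match extra_instructions_per_call(before, after, calls, calls) {
        Some(delta) => println!("about {:.0} extra instructions per call", delta),
        None => println!("call counts differ; compare totals instead"),
    }
}
```

The difference in the self cost of `process_obligations` between the two runs (2,293,350,331 versus 2,220,443,052) is 72,907,279 instructions, which matches the cachegrind diff reported earlier.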

My guess is that shifts elsewhere in the program slightly shifted the profile for this function or something along those lines and that caused this shift. It seems quite possible that no particular PR is actually fully responsible for this change.

Mark-Simulacrum · 2021-10-03T18:27:40Z

FWIW while looking into this I also filed #89495 which may be of some help, but it's not a mitigation for this regression, just a sideline optimization.

Mark-Simulacrum and others added 30 commits September 6, 2021 13:17

Do not unshallow -- already done by other code

76e09eb

backport-of: none

Detect stricter constraints on gats where clauses in impls vs trait

af9de99

Fix duplicate error

890de33

WIP state

2987f4b

as casts and block exprs

9b29138

dont support blocks

4483c2b

bless stderr

47b16f4

tidy

c170dcf

move thir visitor to rustc_middle

08e8644

dont build abstract const for monomorphic consts

fc63e9a

handle ExprKind::NeverToAny

4cbcb09

remove WorkNode

1f57f8b

remove debug stmts

15101c8

rename mir -> thir around abstract consts

406d2ab

remove comment

79be080

nits

955e2b2

add a CastKind to Node::Cast

8c7954d

resolve from_hir_call FIXME

3212734

fmt

cd2915e

CI please

fd9bb30

add test for builtin types N + N unifying with fn call

8295e4a

Remove unnecessary Cache.*_did fields

44e6f2e

They can be obtained by accessing the `TyCtxt` where they are needed.

rustc: Remove local variable IDs from Exports

294510e

Local variables can never be exported.

explicitly link to external ena docs

03f9fe2

Only take tcx when it's all that's needed

df281ee

rustdoc: Get symbol for TyParam directly

0bb1c28

Remove unused hir_id parameter from resolve_type

c2207f5

Fix redundant arguments in external_path()

5321b35

If the path is for a trait, it is always true that `trait_did == Some(did)`, so instead, `external_path()` now takes an `is_trait` boolean.

Manishearth added 7 commits September 12, 2021 03:44

Rollup merge of rust-lang#88677 - petrochenkov:exportid, r=davidtwco

bb5ca58

rustc: Remove local variable IDs from `Export`s Local variables can never be exported.

Rollup merge of rust-lang#88813 - lcnr:ena-docs, r=jyn514

146aee6

explicitly link to external `ena` docs we currently do not link to the docs of `ena`: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_infer/infer/struct.InferCtxtInner.html#method.const_unification_table

rustbot added the rollup A PR which is a rollup label Sep 12, 2021

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 12, 2021

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 12, 2021

bors merged commit c7dbe7a into rust-lang:master Sep 12, 2021

rustbot added this to the 1.57.0 milestone Sep 12, 2021

This was referenced Sep 12, 2021

extend simplify_type #86986

Merged

Prefer suggestion paths which are not doc-hidden #87349

Closed

rustbot added the perf-regression Performance regression. label Sep 12, 2021

Manishearth deleted the rollup-alohfwx branch September 12, 2021 18:45

Manishearth mentioned this pull request Sep 12, 2021

generic_const_exprs: use thir for abstract consts instead of mir #88709

Merged

This was referenced Sep 12, 2021

rustc: Remove local variable IDs from Exports #88677

Merged

Rework DepthFirstSearch API #88711

Merged

Mark-Simulacrum added the perf-regression-triaged The performance regression has been triaged. label Oct 3, 2021

Mark-Simulacrum mentioned this pull request Oct 5, 2021

Introduce let...else #87688

Merged
