Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

O(log n) lookup of associated items by name #69072

Merged
merged 5 commits into from
Feb 21, 2020

Conversation

ecstatic-morse
Copy link
Contributor

@ecstatic-morse ecstatic-morse commented Feb 11, 2020

Resolves #68957, in which compile time is quadratic in the number of associated items. This PR makes name lookup use binary search instead of a linear scan to improve its asymptotic performance. As a result, the pathological case from that issue now runs in 8 seconds on my local machine, as opposed to many minutes on the current stable.

Currently, method resolution must do a linear scan through all associated items of a type to find one with a certain name. This PR changes the result of the associated_items query to a data structure that preserves the definition order of associated items (which is used, e.g., for the layout of trait object vtables) while adding an index of those items sorted by (unhygienic) name. When doing name lookup, we first find all items with the same Symbol using binary search, then run hygienic comparison to find the one we are looking for. Ideally, this would be implemented using an insertion-order preserving, hash-based multi-map, but one is not readily available.

Someone who is more familiar with identifier hygiene could probably make this better by auditing the uses of the AssociatedItems interface. My goal was to preserve the current behavior exactly, even if it seemed strange (I left at least one FIXME to this effect). For example, some places use comparison with ident.modern() and some places use tcx.hygienic_eq which requires the DefId of the containing impl. I don't know whether those approaches are equivalent or which one should be preferred.

@ecstatic-morse

This comment has been minimized.

@rust-timer

This comment has been minimized.

@bors

This comment has been minimized.

@ecstatic-morse
Copy link
Contributor Author

Hopefully the up-front cost of sorting the associated items by name is outweighed by more efficient lookup. If this is not the case, I'll need to switch to an adaptive data structure or consider an out-of-band index that is interned separately.

@petrochenkov petrochenkov self-assigned this Feb 11, 2020
@petrochenkov petrochenkov added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 11, 2020
@rust-highfive

This comment has been minimized.

@ecstatic-morse

This comment has been minimized.

@ecstatic-morse
Copy link
Contributor Author

@bors try
@rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Feb 11, 2020

⌛ Trying commit eb27aaca6450605ed78c29224c4351157357ffd8 with merge 8dc7de89edae6b4f434280089e1a14a36a1ab405...

@bors
Copy link
Contributor

bors commented Feb 12, 2020

☀️ Try build successful - checks-azure
Build commit: 8dc7de89edae6b4f434280089e1a14a36a1ab405 (8dc7de89edae6b4f434280089e1a14a36a1ab405)

@rust-timer
Copy link
Collaborator

Queued 8dc7de89edae6b4f434280089e1a14a36a1ab405 with parent 3f32e30, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 8dc7de89edae6b4f434280089e1a14a36a1ab405, comparison URL.

@mati865
Copy link
Contributor

mati865 commented Feb 12, 2020

FYI syn-opt is nondeterministic right now: #69060

@eddyb
Copy link
Member

eddyb commented Feb 12, 2020

Ideally, this would be implemented using an insertion-order preserving, hash-based multi-map, but one is not readily available.

It's not a multi-map, but IndexMap<K, V> is insertion-order preserving and I think we should use it more (see #60608).

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Feb 12, 2020

Looking at check builds, this PR is a slight regression. However, I think it is an acceptable one. I think we should merge this as-is to fix #68957.

Those perf results suggest that name lookup is not a significant component of compile time for most crates, so I probably won't spend much time optimizing this PR. There are many improvements to be made if others are interested. Since the purpose of in_definition_order was to serve as a greppable marker of code that relied on associated item definition order and I don't plan to try to eliminate that code anytime soon, maybe I should replace in_definition_order and its documentation with an iter method?

@Mark-Simulacrum
Copy link
Member

I would not replace in_definition_order (an incredibly useful name!) with iter at this point.

@estebank
Copy link
Contributor

I'm +1 on the approach: there's trivial perf degradation but the behavior in pathological cases is much more sane than it is now. I'll review more in depth soon.

cc @rust-lang/compiler for visibility.

Comment on lines 294 to 322
// FIXME(ecstaticmorse): Audit uses of this method to see if the definition order is actually
// significant in that context (perhaps we could replace them with calls to a newly defined
// `in_arbitrary_order`?). Once all uses have been removed, `AssociatedItems` won't have to
// keep a separate sorted list of indexes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be accomplished by actually randomizing the order of items_in_def_order and see what breaks?

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Feb 13, 2020

edited for clarity

I added an in_arbitrary_order method. For now, it's only used in a few places where it cannot possibly affect error message order.

I would like to only use in_definition_order for code that relies on the order for correctness, not just good diagnostics. Many of these places are looking up associated items by index on various lang items. IMO, we should make these methods and associated types lang items as well. AFAICT, the current machinery does not support this.

To migrate away from in_definition_order, we could have in_arbitrary_order return things in non-definition order only during testing. That way, users would continue to get errors in definition order, but we could find out if code was implicitly depending on the definition order of associated items. UI tests complicate this, however.

@ecstatic-morse ecstatic-morse changed the title O(log n) lookup of associated items by name O(log n) lookup of associated items by name Feb 13, 2020
@Mark-Simulacrum
Copy link
Member

We do have -Zui-testing I believe for the ability to change to "deterministic" mode.

Most places that would actually break if the definition order changed are looking up associated items on lang items. IMO, these associated items should themselves be lang items, but AFAICT we don't support that yet.

cc @Centril -- this seems like something we could O(n) process in lang item collection and store HirIds or indices (whatever identified an associated item, I forget) like that for.

@Centril
Copy link
Contributor

Centril commented Feb 13, 2020

cc @Centril -- this seems like something we could O(n) process in lang item collection and store HirIds or indices (whatever identified an associated item, I forget) like that for.

cc @matthewjasper who has taken over my work on lang items in lowering and might have this more in cache.

@ecstatic-morse
Copy link
Contributor Author

ecstatic-morse commented Feb 14, 2020

Some unrelated changes have dramatically sped up the pathological case in #68957 on recent nightlies. A local build of a recent master now takes 30 seconds, down from several minutes on 1.41.0. This PR is now just a 2x speedup instead of a 10x one.

src/librustc_ty/ty.rs Outdated Show resolved Hide resolved
@petrochenkov
Copy link
Contributor

petrochenkov commented Feb 19, 2020

FWIW, I prefer in_definition_order, since it's self-documenting.

Ok, it's still a single method whether it's called iter or in_definition_order.

r=me with the nit (#69072 (comment)) addressed and green CI.

@petrochenkov petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 19, 2020
@petrochenkov
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Feb 20, 2020

📌 Commit a0212ba has been approved by petrochenkov

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 20, 2020
@bors
Copy link
Contributor

bors commented Feb 20, 2020

⌛ Testing commit a0212ba with merge d3ebd59...

@bors
Copy link
Contributor

bors commented Feb 21, 2020

☀️ Test successful - checks-azure
Approved by: petrochenkov
Pushing d3ebd59 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 21, 2020
@bors bors merged commit d3ebd59 into rust-lang:master Feb 21, 2020
@rust-highfive
Copy link
Collaborator

📣 Toolstate changed by #69072!

Tested on commit d3ebd59.
Direct link to PR: #69072

💔 clippy-driver on windows: test-fail → build-fail (cc @mcarton @oli-obk @Manishearth @flip1995 @yaahc @phansch @llogiq, @rust-lang/infra).
💔 clippy-driver on linux: test-fail → build-fail (cc @mcarton @oli-obk @Manishearth @flip1995 @yaahc @phansch @llogiq, @rust-lang/infra).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Feb 21, 2020
Tested on commit rust-lang/rust@d3ebd59.
Direct link to PR: <rust-lang/rust#69072>

💔 clippy-driver on windows: test-fail → build-fail (cc @mcarton @oli-obk @Manishearth @flip1995 @yaahc @phansch @llogiq, @rust-lang/infra).
💔 clippy-driver on linux: test-fail → build-fail (cc @mcarton @oli-obk @Manishearth @flip1995 @yaahc @phansch @llogiq, @rust-lang/infra).
krishna-veerareddy added a commit to krishna-veerareddy/rust-clippy that referenced this pull request Feb 21, 2020
krishna-veerareddy added a commit to krishna-veerareddy/rust-clippy that referenced this pull request Feb 21, 2020
bors added a commit to rust-lang/rust-clippy that referenced this pull request Feb 21, 2020
bors added a commit that referenced this pull request Feb 21, 2020
Update Clippy from 2855b21 to 8fbb23f

```
Fix ICE in `missing_errors_doc`
Update License
Migrate Clippy to GitHub Actions
redundant_clone: Migrate to new dataflow framework
Move unneeded_field_pattern to pedantic group
Rustup to #69325
Rustup to #69072
```

Fixes #69338

r? @Manishearth
@ecstatic-morse ecstatic-morse deleted the associated-items branch October 6, 2020 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
10 participants