Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hides default fns inside Fuse impl to avoid exposing it to any crate #70910

Merged
merged 3 commits into from
Apr 17, 2020

Conversation

rakshith-ravi
Copy link
Contributor

Fixes #70796

@cuviper I've added some default, private traits to do the job for us. If required, I can expose them to a specific visibility if you want to call these functions for #70332

r? @cuviper

…IteratorImpl to avoid exposing default functions outside of the current crate.
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 8, 2020
@rakshith-ravi
Copy link
Contributor Author

Oh also, I just submitted this to understand if my logic is correct. In case you think it's good, don't approve / merge yet. I'll add a few comments for others stumbling across this in the future and then it can be merged. Didn't want to do that yet in case my implementation logic was wrong

@Dylan-DPC-zz Dylan-DPC-zz changed the title Hides default fns inside Fuse impl to avoid exposing it to any crate [WIP] Hides default fns inside Fuse impl to avoid exposing it to any crate Apr 8, 2020
@Dylan-DPC-zz
Copy link

Added WIP to the title so it doesn't get merged, when you are ready you can remove it

@rakshith-ravi
Copy link
Contributor Author

Cool. Does the WIP mean it won't get reviewed? Cuz I want it to get reviewed before I can make further changes.

Copy link
Member

@cuviper cuviper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. I just have a couple of notes below, but let's go ahead and get a perf run to see how the extra layer affects things. Go ahead and add your additional comments too.

Don't worry about #70332, as we merged #70896 instead which is independent of Fuse.

src/libcore/iter/adapters/fuse.rs Outdated Show resolved Hide resolved
src/libcore/iter/adapters/fuse.rs Outdated Show resolved Hide resolved
@cuviper
Copy link
Member

cuviper commented Apr 9, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Apr 9, 2020

⌛ Trying commit 0a54a94 with merge 99ca7e89f078ee57814d4b9c5575f84f2a2d2329...

@bors
Copy link
Contributor

bors commented Apr 9, 2020

☀️ Try build successful - checks-azure
Build commit: 99ca7e89f078ee57814d4b9c5575f84f2a2d2329 (99ca7e89f078ee57814d4b9c5575f84f2a2d2329)

@rust-timer
Copy link
Collaborator

Queued 99ca7e89f078ee57814d4b9c5575f84f2a2d2329 with parent 93dc97a, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 99ca7e89f078ee57814d4b9c5575f84f2a2d2329, comparison URL.

Removed unnecessarry empty impls.
Moved code to organise it better
@rakshith-ravi rakshith-ravi changed the title [WIP] Hides default fns inside Fuse impl to avoid exposing it to any crate Hides default fns inside Fuse impl to avoid exposing it to any crate Apr 10, 2020
@rakshith-ravi
Copy link
Contributor Author

rakshith-ravi commented Apr 10, 2020

@cuviper a couple of questions:

  • Now that we have removed those empty impls, how do we know for sure that the specialization is being called appropriately and not the default implementation? Is there any way to test that?
  • Since we have an internal trait that hides the implementation of the specialization, will that be optimized away by the optimizer?
    • If it will be optimized away and removed after compile, how do we know that? Is there any way to see what gets optimized away and see what the final binary will look like (other than staring at bare-bone assembly, of course)?
    • If it's not gonna be optimized away, how much of a performance impact is it gonna be? Can we measure that somehow? Or at least estimate the additional CPU cycles it will take? Never mind. Just saw the comparison URL. Not sure if I understand exactly how it works, but I'm trying to figure out what it means.

PS, changes from my side are done. You can merge it away if you think this is good to go

@rakshith-ravi
Copy link
Contributor Author

@cuviper sorry to bother you again, but did you get a chance to review this PR?

@cuviper
Copy link
Member

cuviper commented Apr 14, 2020

  • Now that we have removed those empty impls, how do we know for sure that the specialization is being called appropriately and not the default implementation? Is there any way to test that?

You could make an iterator that counts how many times next() is called, e.g.

struct Foo<'a>(&'a mut usize);
impl Iterator for Foo<'_> {
    type Item = ();
    fn next(&mut self) -> Option<()> {
        *self.0 += 1;
        None
    }
}

fn main() {
    let mut counter = 0;
    let mut foo = Foo(&mut counter).fuse();
    for _ in 0..10 {
        foo.next();
    }
    dbg!(counter);
}

This will print 1, since the default Fuse intercepts further calls. But if we add:

impl std::iter::FusedIterator for Foo<'_> {}

Then it will print 10, as the specialized Fuse forwards every call.

playground

If you'd like to turn that into explicit tests, with and without FusedIterator, that would be great!

  • Since we have an internal trait that hides the implementation of the specialization, will that be optimized away by the optimizer?

It should be trivially inlined, yes, but that is more work for compilation time -- so I'm not surprised that the perf results show a little bit of a slowdown.

If it will be optimized away and removed after compile, how do we know that? Is there any way to see what gets optimized away and see what the final binary will look like (other than staring at bare-bone assembly, of course)?

Well, before assembly there's also LLVM IR, so you could stare at that instead. 🙂 But without even bothering at the instruction level, you could just look at what functions are emitted. If there's no explicit function for the new impl layers, then those must have been inlined.

Beyond that, I think you'd have to have a concrete example to compare assembly before and after. The optimizer can do surprising things -- in the above playground, after optimization it's flattened entirely into main. It doesn't even iterate anything, just goes right into Argument setup for the dbg!, and this is regardless of whether it has FusedIterator! (I did confirm that the unoptimized LLVM IR for Fuse::next was indeed simpler with FusedIterator.)

@cuviper
Copy link
Member

cuviper commented Apr 14, 2020

@Centril, since it was your request to internalize this, how do you feel about the slightly worse perf results?

@Centril
Copy link
Contributor

Centril commented Apr 14, 2020

@cuviper Hmm; do you know where the regressions are coming from (I haven't checked the diff)? They don't look egregious, so imo correctness comes first in this case, but I wonder if we could have our cake and eat it too?

@cuviper
Copy link
Member

cuviper commented Apr 14, 2020

Nothing stands out to me -- but just intuitively, there's an extra layer of function calls that have to be inlined away. Even in debug mode, it's simply a little bit more code to compile.

I wonder if we could have our cake and eat it too?

Well, we had chatted about a new language feature like default(crate) -- that would avoid needing nested calls to hide the specialization impls.

@Centril
Copy link
Contributor

Centril commented Apr 14, 2020

Ah OK. :) default(crate) is probably a long way from being implemented.

My personal view is that, for now, we should accept the slight regression here, merge the PR, and try to recover it when we can do so in a correct way.

@rakshith-ravi
Copy link
Contributor Author

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Apr 15, 2020

@rakshith-ravi: 🔑 Insufficient privileges: not in try users

@rust-timer
Copy link
Collaborator

Insufficient permissions to issue commands to rust-timer.

@rakshith-ravi
Copy link
Contributor Author

Gah!

@cuviper
Copy link
Member

cuviper commented Apr 15, 2020

I don't expect performance will be much different than before, but we can try it.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Apr 15, 2020

⌛ Trying commit abe5973 with merge 26002ac6b8bee06e3c810c453f5771d8b523b1da...

@bors
Copy link
Contributor

bors commented Apr 15, 2020

☀️ Try build successful - checks-azure
Build commit: 26002ac6b8bee06e3c810c453f5771d8b523b1da (26002ac6b8bee06e3c810c453f5771d8b523b1da)

@rust-timer
Copy link
Collaborator

Queued 26002ac6b8bee06e3c810c453f5771d8b523b1da with parent 835428c, future comparison URL.

@Mark-Simulacrum
Copy link
Member

FYI: We just deployed a PR to the perf collector that in theory reduces noise, but may also introduce other unexpected effects. If the results are odd or otherwise unexpected, it may be worth re-running the try build and the perf run. (Indeed, I would likely recommend just doing so regardless, but up to you).

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 26002ac6b8bee06e3c810c453f5771d8b523b1da, comparison URL.

@rakshith-ravi
Copy link
Contributor Author

Wait, the CPU cycles has reduced by 10%?!? And the number of instructions have reduced by 20%?!?

Is this because of the new PR @Mark-Simulacrum?

I mean, I had a hunch that the perf would increase with my new changes, but not this much xD
10% seems like a lot.

@Mark-Simulacrum
Copy link
Member

We've seen improvements elsewhere, so it's most likely spurious, but let's try it again: @bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Apr 16, 2020

⌛ Trying commit abe5973 with merge 6524bc32a6a2b550d6fbbb229227b9f5a3cfd876...

@bors
Copy link
Contributor

bors commented Apr 16, 2020

☀️ Try build successful - checks-azure
Build commit: 6524bc32a6a2b550d6fbbb229227b9f5a3cfd876 (6524bc32a6a2b550d6fbbb229227b9f5a3cfd876)

@rust-timer
Copy link
Collaborator

Queued 6524bc32a6a2b550d6fbbb229227b9f5a3cfd876 with parent 534a41a, future comparison URL.

@cuviper
Copy link
Member

cuviper commented Apr 16, 2020

The new performance results look pretty tightly neutral -- which is a good thing!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Apr 16, 2020

📌 Commit abe5973 has been approved by cuviper

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 16, 2020
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 17, 2020
Rollup of 7 pull requests

Successful merges:

 - rust-lang#70578 (Add long error explanation for E0657)
 - rust-lang#70910 (Hides default fns inside Fuse impl to avoid exposing it to any crate)
 - rust-lang#71164 (reword Miri validity errors: undefined -> uninitialized)
 - rust-lang#71182 (Add some regression tests)
 - rust-lang#71206 (Miri error messages: avoid try terminology)
 - rust-lang#71220 (Dogfood or_patterns in the standard library)
 - rust-lang#71225 (Fix typo in Default trait docs: Provides -> Provide)

Failed merges:

r? @ghost
@bors bors merged commit d194587 into rust-lang:master Apr 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fuse impls expose use of specialization (default fn)
8 participants