Add `Iterator::array_chunks()` #92393

rossmacarthur · 2021-12-29T08:33:27Z

This has been similarly implemented as .tuples() in itertools as Itertools::tuples(). But it makes more sense with arrays since all elements are the same type.

I will update stability attributes with a tracking issue if accepted.

See also #92394

rust-highfive · 2021-12-29T08:33:30Z

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

library/core/src/iter/traits/iterator.rs

library/core/src/iter/adapters/array_chunks.rs

the8472 · 2021-12-29T17:35:29Z

See #87776 for a previous PR attempting that, some of the discussion might be relevant.

rossmacarthur · 2021-12-30T09:26:05Z

@the8472 Thanks, I will check it out. Compared the benchmarks and that implementation is definitely a lot more performant 😅. I will look into improving this one using some ideas from that one.

rossmacarthur · 2022-01-02T16:47:05Z

Okay so I have basically taken #87776 and improved it, most notably the following:

Handle remainder correctly in DoubleEndedIterator implementation by requiring ExactSizeIterator implementation to remove the correct amount elements. It behaves like this now which I believe should be the correct behavior.
```
let mut it = (0..5).array_chunks();
assert_eq!(it.next_back(), Some([2, 3])):
assert_eq!(it.next_back(), Some([0, 1])):
assert_eq!(it.remainder(), &[5)):
```
Fill array in reverse in the DoubleEndedIterator implementation, this improves performance over the previous PR.

Using the benchmark functions the performance is now the following

test iter::bench_iter_array_chunks_fold                         ... bench:     749,479 ns/iter (+/- 91,429)
test iter::bench_iter_array_chunks_loop                         ... bench:     751,425 ns/iter (+/- 74,197)
test iter::bench_iter_array_chunks_rev_loop                     ... bench:     751,228 ns/iter (+/- 74,905)
test iter::bench_iter_array_chunks_rfold                        ... bench:     697,941 ns/iter (+/- 61,837)

Compared to previously

test iter::bench_iter_array_chunks_fold                         ... bench:     688,546 ns/iter (+/- 50,318)
test iter::bench_iter_array_chunks_loop                         ... bench:   1,039,287 ns/iter (+/- 40,049)
test iter::bench_iter_array_chunks_rev_loop                     ... bench:   1,202,417 ns/iter (+/- 79,646)
test iter::bench_iter_array_chunks_rfold                        ... bench:     850,933 ns/iter (+/- 20,082)

fbstj · 2022-01-03T19:11:28Z

library/core/src/iter/adapters/array_chunks.rs

+
+    /// Returns a reference to the remaining elements of the original iterator
+    /// that are not going to be returned by this iterator. The returned slice
+    /// has at most `N-1` elements.


does this need to mention that it is only useful after next() is None ?

Edit: Nevermind, my comment was nonsense, I forgot that this doesn't have access to the underlying source.

does this need to mention that it is only useful after next() is None ?

Yeah, this is a good point. I was thinking it might also be useful for this to return an Option so that it easy to tell the difference between an empty remainder and a not yet known remainder. What do you think?

I'd probably suggest a single fn into_remainder(self) -> Option<array::IntoIter> method, and that'll also let us get rid of the custom remainder struct entirely I think? (We can directly store the Option<array::IntoIter> as the second field).

library/core/src/iter/traits/iterator.rs

Mark-Simulacrum · 2022-01-10T20:52:19Z

library/core/src/iter/adapters/array_chunks.rs

+
+    /// Returns a reference to the remaining elements of the original iterator
+    /// that are not going to be returned by this iterator. The returned slice
+    /// has at most `N-1` elements.


I'd probably suggest a single fn into_remainder(self) -> Option<array::IntoIter> method, and that'll also let us get rid of the custom remainder struct entirely I think? (We can directly store the Option<array::IntoIter> as the second field).

Mark-Simulacrum · 2022-01-10T21:21:34Z

library/core/src/iter/adapters/array_chunks.rs

+struct FrontGuard<T, const N: usize> {
+    /// A pointer to the array that is being filled. We need to use a raw
+    /// pointer here because of the lifetime issues in the fold implementations.
+    ptr: *mut T,


It seems like we could store the array directly inside the guard?

Mark-Simulacrum · 2022-01-10T21:50:02Z

library/core/src/iter/adapters/array_chunks.rs

+}
+
+#[unstable(feature = "iter_array_chunks", reason = "recently added", issue = "none")]
+impl<I, const N: usize> DoubleEndedIterator for ArrayChunks<I, N>


I think it may make sense to not have double ended iterators? It feels a little unclear that the semantics of cutting tail elements are the ones users will expect. Do we have a use case for backwards chunk iteration?

It seems plausible that we could expose a dedicated .skip_remainder() -> ArrayChunksRev<I, N> or similar, perhaps, to more explicitly skip it...

I do think because .rev().array_chunks() is different to .array_chunks().rev() it is worth having. It is also worth noting that the following APIs have a DoubleEndedIterator implementation that behave in this way and cut the tail elements.

slice::chunks_exact()

slice::array_chunks()

However I do realize since iterators are often used in a consuming way this might be strange. Also it does require ExactSizeIterator bound. Additionally, I haven't personally found a use case for this so I am happy to remove this implementation.

scottmcm · 2022-01-31T20:32:08Z

library/core/src/iter/adapters/array_chunks.rs

+        // SAFETY: `array` will still be valid if `guard` is dropped.
+        let mut guard = unsafe { FrontGuard::new(&mut array) };
+
+        for slot in array.iter_mut() {


suggestion: for any next method where there's a loop, consider just using self.try_for_each(ControlFlow::Break).break_value() instead. That's simpler code, less unsafe, gives better coverage of your try_fold override, and should be just as fast -- or even better for inner iterators that have their own try_fold overrides.

scottmcm · 2022-01-31T20:37:55Z

library/core/src/iter/adapters/array_chunks.rs

+        let (lower, upper) = self.iter.size_hint();
+        // Keep infinite iterator size hint lower bound as `usize::MAX`. This
+        // is required to implement `TrustedLen`.
+        if lower == usize::MAX {


This is incorrect. The lower bound can be usize::MAX without being infinite -- take .repeat(x).take(usize::MAX), for example, or just [(); usize::MAX].into_iter(). Nor is (usize::MAX, None) certain to be infinite -- especially on 16-bit platforms.

Adapters that make things shorter can propagate ExactSizeIterator, but not TrustedLen. (And, conversely but not relevant here, adapters that makes things longer, like Chain, can propagate TrustedLen but not ExactSizeIterator.)

Okay thanks, so you are saying we can't implement TrustedLen for this at all because it makes the iterator shorter? I think I got a little confused by this explanation in the TrustedLen documentation. It might need to be extended a little.

/// The iterator reports a size hint where it is either exact /// (lower bound is equal to upper bound), or the upper bound is [`None`]. /// The upper bound must only be [`None`] if the actual iterator length is /// larger than [`usize::MAX`]. In that case, the lower bound must be /// [`usize::MAX`], resulting in an [`Iterator::size_hint()`] of /// `(usize::MAX, None)`. /// /// The iterator must produce exactly the number of elements it reported /// or diverge before reaching the end.

Okay thanks, so you are saying we can't implement TrustedLen for this at all because it makes the iterator shorter?

That's correct.

For a specific example, if the actual length is usize::MAX + 3 or usize::MAX + 5, the size_hint will be (usize::MAX, None).

But array_chunks::<2> would need to return a length of exactly usize::MAX/2 + 2 or usize::MAX/2 + 3 respectively for them, which it can't, so thus it can't be TrustedLen.

(Similar logic explains why https://doc.rust-lang.org/std/iter/struct.Skip.html doesn't implement TrustedLen.)

scottmcm · 2022-01-31T20:47:31Z

library/core/src/iter/adapters/array_chunks.rs

+        // SAFETY: `array` will still be valid if `guard` is dropped.
+        let mut guard = unsafe { FrontGuard::new(&mut array) };
+
+        let result = self.iter.try_fold(init, |mut acc, item| {


This is really useful functionality, so consider whether there's a way to expose it on Iterator directly without the adapter, and then make the adapter really simple. (Vague analogy: https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.array_chunks is a simple wrapper around https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_chunks )

For example, imagine you had

pub trait Iterator { fn next_chunk<const N: usize>(&mut self) -> Result<[T; N], array::IntoIter<T, N>>; }

That's usable independently, overridable for array and slice iterators that can do it more efficiently, and would be exactly the piece you need to make the ArrayChunks iterator easy to write in safe code.

(It'd be nice if it were -> Result<[T; N], array::IntoIter<T, N-1>>, but that's probably not writable right now.)

That could be useful to manually unroll an iterator. E.g. we know that a vec goes from 0 to 8 capacity with its default allocation behavior (for small T) and then grows in multiples from there. So we could try unrolling the loop in batches of N <= 8. But this might still be difficult to optimize for Chain adapters if a middle part straddles the chain boundary.

A try_fold_chunked might be easier to optimize because it would split the loop into 3 parts. Going from try_fold_chunked to chunked_next is easier than the other way around.

scottmcm · 2022-01-31T20:52:41Z

library/core/src/iter/adapters/array_chunks.rs

+
+        // SAFETY: The array will still be valid if `guard` is dropped and
+        // it is forgotten otherwise.
+        let mut guard = unsafe { FrontGuard::new(&mut array) };


There's lots of unsafe here, as has often been the case in PRs for new methods using const generic arrays.

Please re-use the stuff from core::array for this (see

rust/library/core/src/array/mod.rs

Line 792 in 498eeb7

fn try_collect_into_array<I, T, R, const N: usize>(iter: &mut I) -> Option<R::TryType>

for example). That might mean adding a new crate-local or unstable method to the array module, or something, but that's better than making more versions of these Guard types in more places.

Agree it makes sense to have this stuff in core::array 👍

JohnCSimon · 2022-03-06T03:47:05Z

ping from triage:
@rossmacarthur Is this PR ready for review or are you still working in it?

FYI: when a PR is ready for review, send a message containing
@rustbot ready to switch to S-waiting-on-review so the PR appears in the reviewer's backlog.

bors · 2022-03-10T05:21:21Z

☔ The latest upstream changes (presumably #94787) made this pull request unmergeable. Please resolve the merge conflicts.

Dylan-DPC · 2022-04-08T00:43:46Z

closing this as inactive

…-ou-se Add `Iterator::next_chunk` See also rust-lang#92393 ### Prior art - [`Itertools::next_tuple()`](https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.next_tuple) ### Unresolved questions - Should we also add `next_chunk_back` to `DoubleEndedIterator`? - Should we rather call this `next_array()` or `next_array_chunk`?

Add `Iterator::array_chunks` (take N+1) A revival of rust-lang#92393. r? `@Mark-Simulacrum` cc `@rossmacarthur` `@scottmcm` `@the8472` I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is `try_fold` implementation, I've left the "custom" one for now, not sure what exactly should it use.

Add `Iterator::next_chunk` See also rust-lang/rust#92393 ### Prior art - [`Itertools::next_tuple()`](https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.next_tuple) ### Unresolved questions - Should we also add `next_chunk_back` to `DoubleEndedIterator`? - Should we rather call this `next_array()` or `next_array_chunk`?

Add `Iterator::array_chunks` (take N+1) A revival of rust-lang/rust#92393. r? `@Mark-Simulacrum` cc `@rossmacarthur` `@scottmcm` `@the8472` I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is `try_fold` implementation, I've left the "custom" one for now, not sure what exactly should it use.

rust-highfive assigned Mark-Simulacrum Dec 29, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 29, 2021

rossmacarthur mentioned this pull request Dec 29, 2021

Add Iterator::array_windows() #92394

Closed

mominul reviewed Dec 29, 2021

View reviewed changes

library/core/src/iter/traits/iterator.rs Show resolved Hide resolved

lukaslueg reviewed Dec 29, 2021

View reviewed changes

library/core/src/iter/adapters/array_chunks.rs Outdated Show resolved Hide resolved

rossmacarthur force-pushed the ft/array-chunks branch from 002a875 to 66d8831 Compare December 29, 2021 16:11

rossmacarthur force-pushed the ft/array-chunks branch from 66d8831 to 3ceb0cd Compare January 2, 2022 16:34

fbstj reviewed Jan 3, 2022

View reviewed changes

Mark-Simulacrum reviewed Jan 10, 2022

View reviewed changes

Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 10, 2022

scottmcm reviewed Jan 31, 2022

View reviewed changes

rossmacarthur added 2 commits February 4, 2022 16:01

Add Iterator::array_chunks()

c115d5a

Use array::IntoIter for the ArrayChunks remainder

6adcad8

rossmacarthur force-pushed the ft/array-chunks branch from 3ceb0cd to 6adcad8 Compare February 4, 2022 14:02

rossmacarthur mentioned this pull request Feb 6, 2022

Add Iterator::next_chunk #93700

Merged

SkiFire13 mentioned this pull request Feb 23, 2022

[ER] Iterator::array_chunks_copy ? #94257

Closed

Dylan-DPC closed this Apr 8, 2022

Dylan-DPC removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Apr 8, 2022

Dylan-DPC added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Apr 8, 2022

the8472 mentioned this pull request Jun 25, 2022

Tracking Issue for Iterator::next_chunk #98326

Open

3 tasks

WaffleLapkin mentioned this pull request Aug 1, 2022

Add Iterator::array_chunks (take N+1) #100026

Merged

rossmacarthur deleted the ft/array-chunks branch November 27, 2022 07:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `Iterator::array_chunks()` #92393

Add `Iterator::array_chunks()` #92393

rossmacarthur commented Dec 29, 2021 •

edited

Loading

rust-highfive commented Dec 29, 2021

the8472 commented Dec 29, 2021

rossmacarthur commented Dec 30, 2021

rossmacarthur commented Jan 2, 2022 •

edited

Loading

fbstj Jan 3, 2022

the8472 Jan 3, 2022 •

edited

Loading

rossmacarthur Jan 4, 2022 •

edited

Loading

Mark-Simulacrum Jan 10, 2022

Mark-Simulacrum Jan 10, 2022

Mark-Simulacrum Jan 10, 2022

Mark-Simulacrum Jan 10, 2022

rossmacarthur Jan 15, 2022

scottmcm Jan 31, 2022

scottmcm Jan 31, 2022

rossmacarthur Feb 4, 2022

scottmcm Feb 4, 2022

scottmcm Jan 31, 2022

the8472 Jan 31, 2022 •

edited

Loading

scottmcm Jan 31, 2022

rossmacarthur Feb 4, 2022

JohnCSimon commented Mar 6, 2022

bors commented Mar 10, 2022

Dylan-DPC commented Apr 8, 2022

Add Iterator::array_chunks() #92393

Add Iterator::array_chunks() #92393

Conversation

rossmacarthur commented Dec 29, 2021 • edited Loading

rust-highfive commented Dec 29, 2021

the8472 commented Dec 29, 2021

rossmacarthur commented Dec 30, 2021

rossmacarthur commented Jan 2, 2022 • edited Loading

Choose a reason for hiding this comment

the8472 Jan 3, 2022 • edited Loading

Choose a reason for hiding this comment

rossmacarthur Jan 4, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

the8472 Jan 31, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JohnCSimon commented Mar 6, 2022

bors commented Mar 10, 2022

Dylan-DPC commented Apr 8, 2022

Add `Iterator::array_chunks()` #92393

Add `Iterator::array_chunks()` #92393

rossmacarthur commented Dec 29, 2021 •

edited

Loading

rossmacarthur commented Jan 2, 2022 •

edited

Loading

the8472 Jan 3, 2022 •

edited

Loading

rossmacarthur Jan 4, 2022 •

edited

Loading

the8472 Jan 31, 2022 •

edited

Loading