Make BufferedChannelIterator#hasNext idempotent #4065
The rationale in favor of the change includes:

* in the `1.6.x` series, `Channel().iterator().hasNext()` was idempotent
* although `Iterator#hasNext` implementations are not required to be idempotent, it is generally safer to make them so. The same applies to `ChannelIterator#hasNext`

Here is an example where it matters. Consider the following code snippet:

```
val chunk = mutableListOf<Int>()
while (iter.hasNext()) {
    val v = iter.next()
    chunk.add(v)
    if (chunk.size >= MAX_BATCH_SIZE || !iter.hasNext()) {
        yield(chunk.toList())
        chunk.clear()
    }
}
```

We address two concerns with one block of code: the if-block runs either when the chunk reaches the size limit or when we are on the last chunk.

If `#hasNext` isn't idempotent, I see a couple of options. The first is to execute the content of the if-block in two places, in the loop and after the loop:

```
val chunk = mutableListOf<Int>()
while (iter.hasNext()) {
    val v = iter.next()
    chunk.add(v)
    if (chunk.size >= MAX_BATCH_SIZE) {
        yield(chunk.toList())
        chunk.clear()
    }
}
if (chunk.isNotEmpty()) {
    yield(chunk.toList())
}
```

Alternatively, we can memoize `#hasNext` ourselves, e.g.:

```
val chunk = mutableListOf<Int>()
var hasNextMemo = iter.hasNext()
while (hasNextMemo) {
    val v = iter.next()
    chunk.add(v)
    hasNextMemo = iter.hasNext()
    if (chunk.size >= MAX_BATCH_SIZE || !hasNextMemo) {
        yield(chunk.toList())
        chunk.clear()
    }
}
```

Either way, the lack of idempotency of `#hasNext` spills the concern of knowing the iterator's state over into the caller's code.
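To make the memoization option concrete, here is a minimal sketch that moves the cached `hasNext` answer into a reusable wrapper for a plain `Iterator`. It uses only the Kotlin standard library and is not how the channel iterator is implemented. `MemoizingIterator`, `FragileIterator`, and `batches` are hypothetical names introduced for illustration, and `FragileIterator` is a deliberately misbehaving iterator whose `hasNext` advances on every call.

```kotlin
// Hypothetical iterator whose hasNext() is NOT idempotent:
// every call advances the cursor, so calling it twice skips an element.
class FragileIterator(private val items: List<Int>) : Iterator<Int> {
    private var index = -1
    override fun hasNext(): Boolean {
        index++
        return index < items.size
    }
    override fun next(): Int = items[index]
}

// Hypothetical wrapper: caches the delegate's answer until next() consumes the element.
class MemoizingIterator<T>(private val delegate: Iterator<T>) : Iterator<T> {
    // null means "not asked since the last next()"
    private var cached: Boolean? = null

    override fun hasNext(): Boolean {
        val known = cached
        if (known != null) return known
        return delegate.hasNext().also { cached = it }
    }

    override fun next(): T {
        if (!hasNext()) throw NoSuchElementException()
        cached = null // the element is consumed, so the answer must be recomputed
        return delegate.next()
    }
}

// The single-if-block batching loop from the description, now safe for any iterator.
fun <T> batches(source: Iterator<T>, maxBatchSize: Int): Sequence<List<T>> = sequence {
    val iter = MemoizingIterator(source)
    val chunk = mutableListOf<T>()
    while (iter.hasNext()) {
        chunk.add(iter.next())
        if (chunk.size >= maxBatchSize || !iter.hasNext()) {
            yield(chunk.toList())
            chunk.clear()
        }
    }
}

fun main() {
    println(batches(FragileIterator(listOf(1, 2, 3, 4, 5)), 2).toList())
    // prints [[1, 2], [3, 4], [5]]
}
```

The wrapper keeps the caller's loop unchanged, but every caller still has to remember to apply it, which is the same spill-over of iterator state into calling code that the description argues against.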
Yep, that's the best one. This way, you aren't wasting time checking "is this the end already?" on every iteration, which also looks awkward. Also, this version can be adapted to the more idiomatic `for` loop form:

```
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.flow

const val MAX_BATCH_SIZE = 100

suspend fun main() {
    val channel = Channel<Int>()
    val chunk = mutableListOf<Int>()
    flow {
        for (v in channel) {
            chunk.add(v)
            if (chunk.size >= MAX_BATCH_SIZE) {
                emit(chunk.toList())
                chunk.clear()
            }
        }
        // Flush the final, possibly partial chunk once the channel is closed.
        if (chunk.isNotEmpty()) {
            emit(chunk.toList())
        }
    }
}
```
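The snippet above builds the flow but never sends to the channel or collects the result. As a rough sketch of how the same loop could be exercised end to end, the version below wraps it in an extension function and drives it with a small producer. It assumes `kotlinx-coroutines-core` is on the classpath; `batched` and the batch size of 2 are hypothetical choices made for illustration.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper: same loop as above, with the chunk buffer kept inside the flow
// so that each collection starts with an empty buffer.
fun ReceiveChannel<Int>.batched(maxBatchSize: Int): Flow<List<Int>> = flow {
    val chunk = mutableListOf<Int>()
    for (v in this@batched) {
        chunk.add(v)
        if (chunk.size >= maxBatchSize) {
            emit(chunk.toList())
            chunk.clear()
        }
    }
    // The loop ends when the channel is closed; flush whatever is left.
    if (chunk.isNotEmpty()) {
        emit(chunk.toList())
    }
}

fun main() = runBlocking {
    val channel = Channel<Int>()
    // Producer: send five values, then close so the for loop terminates.
    launch {
        for (i in 1..5) channel.send(i)
        channel.close()
    }
    println(channel.batched(2).toList())
    // prints [[1, 2], [3, 4], [5]]
}
```

Closing the channel is what ends the `for` loop, so the trailing flush only runs once the producer is done.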
@dkhalanskyjb In my opinion, neither solution is great, because we are working around a concern that the iterator itself could easily have addressed. I guess that if one writes generic code that never knows what iterator it receives, one should guard against the possibility of an iterator not having I think the code that uses
Thanks!