Implement Event Stream message frame ser/de in smithy-eventstream #609

jdisanti · 2021-07-27T00:52:49Z

This implements the Event Stream message frame format for #121.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

rcoh

Overall LGTM. Super clean implementation! Some notes inline. Lets discuss ByteStr and whether it's worth the complexity especially since it becomes part of the public interface

rcoh · 2021-07-27T01:27:30Z

rust-runtime/smithy-eventstream/src/buf/count.rs

+/// A [`Buf`] implementation that counts bytes read.
+pub struct CountBuf<'a> {
+    buffer: &'a mut dyn Buf,
+    count: AtomicUsize,


why does this need to be atomic usize instead of just usize? since advance has &mut self it seems like it could just be a usize?

Originally, I was increasing the count in chunk, which meant I needed interior mutability since chunk takes &self. But then I realized that was a bug since chunk can be called multiple times before advancing, so I fixed that and forgot to reduce the AtomicUsize back down. Will fix.

rcoh · 2021-07-27T14:31:34Z

rust-runtime/smithy-eventstream/src/str_bytes.rs

+/// UTF-8 string byte buffer representation with validation amortization.
+#[non_exhaustive]
+#[derive(Debug)]
+pub struct StrBytes {


I think I would probably simplify this to do what the Http crate does: maintain an invariant that StrBytes can only contain valid UTF-8.

https://github.com/hyperium/http/blob/master/src/byte_str.rs

Assuming this is solely to be used in header names (which we know are always valid utf8?)

Simplified to do validation up front.

rcoh · 2021-07-27T14:34:16Z

rust-runtime/smithy-eventstream/src/frame.rs

+        Int32(i32),
+        Int64(i64),
+        ByteArray(Bytes),
+        String(StrBytes),


if StrBytes pre-validates (as it probably should, then users don't need another layer of checking here)

rcoh · 2021-07-27T14:34:52Z

rust-runtime/smithy-eventstream/src/frame.rs

+                        }
+                        let bytes = buffer.copy_to_bytes(len);
+                        if value_type == TYPE_STRING {
+                            Ok(HeaderValue::String(bytes.into()))


I'd probably actually validate that this is UTF-8 at this point

rcoh · 2021-07-27T14:36:06Z

rust-runtime/smithy-eventstream/src/frame.rs

+            }
+        }
+
+        pub(super) fn write_to(&self, buffer: &mut dyn BufMut) -> Result<(), Error> {


I think I'd probably use impl BufMut here? although I'm generally pro dyn, these methods are going to be extremely hot so getting it to be fully optimized is maybe worth it?

discussion of the tradeoffs here: https://users.rust-lang.org/t/should-i-prefer-impl-impl-mut-impl-or-box-dyn-dyn-mut-dyn-by-default/41778/9

rcoh · 2021-07-27T14:42:32Z

rust-runtime/smithy-eventstream/fuzz/fuzz_targets/roundtrip.rs

+
+fuzz_target!(|message: Message| {
+    let mut buffer = Vec::new();
+    if message.write_to(&mut buffer).is_ok() {


this fails only when the message is too long, right? would be good to have a little bit of sanity checking that the error case only happens when the buffer is too long

rcoh · 2021-07-27T14:44:42Z

rust-runtime/smithy-eventstream/src/frame.rs

+
+    /// Reads a message from the given `buffer`. If the buffer doesn't have the whole
+    /// message in it yet (as is the case when streaming), then it returns `Ok(None)`.
+    pub fn read_from(buffer: &mut dyn Buf) -> Result<Option<Message>, Error> {


I have a slight preference for explicit enums instead of Option here:

enum ParsedMessage { Complete(Message), Incomplete }

naming bikeshedding welcome

That was my instinct, but then I saw Tokio's Decoder use Result<Option<...>>: https://docs.rs/tokio-codec/0.2.0-alpha.5/tokio_codec/trait.Decoder.html#tymethod.decode

I can diverge though if it makes the code clearer.

we can follow tokio, although Tokio also needs special docs to clarify what it means...since None can imply end of stream OR partial message it might be worth a distinction

rcoh · 2021-07-27T14:48:25Z

rust-runtime/smithy-eventstream/src/frame.rs

+
+    #[test]
+    fn message_not_fully_available_yet() {
+        let message = include_bytes!("../test_data/valid_with_all_headers_and_payload");


how are callers supposed to deal with the partially mutated buffer that results from a call to read_from with a partial buffer? It seems like it actually reads through the prelude right? So you couldn't re-use the same bytes later.

would be good to formalize this behavior and document how it should be used in the context of something that is impl Stream<Bytes>

I imagine the caller must have to do something to preserve the initial position in the event it returns None. There's no peeking with Buf, so there is no other way to do that. I should double check this though instead of assuming.

rcoh

LGTM! a few possible improvements inline but this looks awesome.

rcoh · 2021-07-28T19:01:16Z

rust-runtime/smithy-eventstream/fuzz/fuzz_targets/mutated_read.rs

+use smithy_eventstream::frame::{Header, HeaderValue, Message};
+use smithy_types::Instant;
+
+fn mutate(data: &mut [u8], size: usize, max_size: usize) -> usize {


a comment about this fuzz mutation strategy would be good

Added a comment.

rcoh · 2021-07-28T19:02:20Z

rust-runtime/smithy-eventstream/fuzz/fuzz_targets/mutated_read.rs

+    let max_header_len = max_size - non_header_len;
+    let mut headers = (&bytes[12..(12 + headers_len as usize)]).to_vec();
+    headers.resize(max_header_len, 0);
+    let new_header_len =


so do we only mutate the headers? does the crc not apply to the payload? I don't really follow what's going on here overall I guess

There's not a lot to test with the payload itself except that the message length and header length were accurate. I think there should probably be one more fuzz test that just plays with the prelude while keeping the CRCs valid to explore that validation logic.

rcoh · 2021-07-28T19:07:45Z

rust-runtime/smithy-eventstream/src/str_bytes.rs

-    }
-}
-
-/// UTF-8 string byte buffer representation with validation amortization.
 #[non_exhaustive]
 #[derive(Debug)]
 pub struct StrBytes {


lost the doc comment on this structure. A usage example might be helpful

rcoh · 2021-07-28T19:08:15Z

rust-runtime/smithy-eventstream/src/str_bytes.rs

@@ -88,7 +45,6 @@ impl Clone for StrBytes {
    fn clone(&self) -> Self {


this can be derived

Good catch! Fixed.

rcoh · 2021-07-28T19:10:30Z

rust-runtime/smithy-eventstream/src/str_bytes.rs

-        StrBytes {
-            bytes: Bytes::from(value),
-            valid_utf8: AtomicU8::new(State::Unknown.as_u8()),
+impl TryFrom<&'static [u8]> for StrBytes {


usually I've seen this as an explicit from_static inherent method because if we ever want to add a TryFrom<&'a [u8]> these would conflict

I'm following bytes's lead here. Added try_copy_from_slice and copy_from_str functions for the non-static use-case.

rcoh · 2021-07-28T19:12:10Z

rust-runtime/smithy-eventstream/src/str_bytes.rs

+impl TryFrom<Bytes> for StrBytes {
+    type Error = Utf8Error;
+
+    fn try_from(bytes: Bytes) -> Result<Self, Self::Error> {


can you delegate these to all go through the same place? Either TryFrom<Bytes> or a new function?

That way we can audit that there is only one location that actually instantiates StrBytes { bytes } with the private constructor

rcoh · 2021-07-28T19:17:17Z

rust-runtime/smithy-eventstream/src/frame.rs

+        let mut decoder = MessageFrameDecoder::new();
+        let mut segmented = SegmentedBuf::new();
+        let mut decoded = Vec::new();
+        for window in repeated.chunks(3) {


slight preference for one byte at a time to make sure that it works with only a single byte

could also just repeat this test with a range of window sizes

rcoh · 2021-07-28T19:22:40Z

rust-runtime/smithy-eventstream/src/frame.rs

+    /// Returns `Ok(None)` if there's not enough data, or `Some(remaining)` where
+    /// `remaining` is the number of bytes after the prelude that belong to the
+    /// message that's in the buffer.
+    fn remaining_ready<B: Buf>(&self, buffer: &B) -> Result<Option<usize>, Error> {


found this function name a little hard to grok. maybe try_next_end_of_frame? don't really love that either

Yeah, this one was really hard to come up with a good name for. I'll go for remaining_bytes_if_frame_available. It's slightly better at least:

if let Some(remaining_len) = self.remaining_bytes_if_frame_available(&buffer)? {

rcoh · 2021-07-28T19:24:43Z

rust-runtime/smithy-eventstream/fuzz/fuzz_targets/mutated_headers.rs

-        } else {
-            result.unwrap().unwrap()
-        }
+    let message = if let Ok(message) = Message::read_from(input) {


slight preference for using the FrameDecoder api in all the fuzz tests mostly just for the coverage. Could add a #[cfg(test)] helper to make it easy to use?

I think this is adequately covered in the unit tests since there's only one u32 worth of data that is actually relevant to the MessageFrameDecoder. Fuzzing that u32 wouldn't give us much benefit as the result is either Complete or Incomplete without a whole lot of logic in between.

yeah that's fair. Maybe more relevant for the round trip test.

jdisanti added 2 commits July 26, 2021 17:49

Implement Event Stream message frame ser/de in smithy-eventstream

19d971d

Add roundtrip fuzz target to smithy-eventstream

e997267

jdisanti requested a review from rcoh July 27, 2021 00:52

rcoh reviewed Jul 27, 2021

View reviewed changes

jdisanti added 5 commits July 27, 2021 15:13

Add more fuzz testing targets to smithy-eventstream

d5d22bd

Incorporate feedback

9a5f591

Rename fuzz tests

767c8fd

Refactor streaming use-case into MessageFrameDecoder

e5a3106

Merge remote-tracking branch 'upstream/main' into event-stream-framing

afb2954

jdisanti requested a review from rcoh July 28, 2021 01:44

rcoh approved these changes Jul 28, 2021

View reviewed changes

jdisanti added 3 commits July 28, 2021 13:42

Add fuzz target for fuzzing the message prelude

0d5b3ac

Incorporate feedback

f63818f

Merge remote-tracking branch 'upstream/main' into event-stream-framing

84ce7fd

jdisanti merged commit 9fef09a into smithy-lang:main Jul 29, 2021

jdisanti deleted the event-stream-framing branch August 6, 2021 20:02

jdisanti mentioned this pull request Aug 7, 2021

Event stream protocol implementation #121

Open

27 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement Event Stream message frame ser/de in smithy-eventstream #609

Implement Event Stream message frame ser/de in smithy-eventstream #609

jdisanti commented Jul 27, 2021

rcoh left a comment

rcoh Jul 27, 2021

jdisanti Jul 27, 2021

rcoh Jul 27, 2021

jdisanti Jul 27, 2021

rcoh Jul 27, 2021

rcoh Jul 27, 2021

rcoh Jul 27, 2021

rcoh Jul 27, 2021

rcoh Jul 27, 2021

rcoh Jul 27, 2021

jdisanti Jul 27, 2021 •

edited

Loading

rcoh Jul 27, 2021

rcoh Jul 27, 2021

jdisanti Jul 27, 2021

rcoh left a comment

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

rcoh Jul 28, 2021

rcoh Jul 28, 2021

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

jdisanti Jul 28, 2021

rcoh Jul 28, 2021

		@@ -88,7 +45,6 @@ impl Clone for StrBytes {
		fn clone(&self) -> Self {

Implement Event Stream message frame ser/de in smithy-eventstream #609

Implement Event Stream message frame ser/de in smithy-eventstream #609

Conversation

jdisanti commented Jul 27, 2021

rcoh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jdisanti Jul 27, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rcoh left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jdisanti Jul 27, 2021 •

edited

Loading