
Is there a faster way to read large Vec<u8>? #462

Open
MeguminSama opened this issue Jul 31, 2024 · 5 comments

Comments


MeguminSama commented Jul 31, 2024

At the moment, say I have a struct like this:

#[derive(Debug, DekuRead, DekuWrite)]
#[deku(ctx = "size: usize", ctx_default = "0")]
pub struct Rfc {
	#[deku(bytes_read = "size")]
	pub data: Vec<u8>,
}

The problem is that deku seems to loop through the reader for each u8 in bytes_read, which makes it very slow on large vectors. #[deku(read_all)] and #[deku(count = "size")] are also very slow.

At the moment, we're using our own read function to do something like this:

match reader.read_bytes(size, &mut buf) {
	Ok(ReaderRet::Bytes) => Ok(Rfc { data: buf }),
	_ => {...}
}

Which is significantly faster.

But I was wondering if there was a built-in way to do this with deku, instead of deku looping over each u8?

If this isn't a feature currently, I might consider implementing it if it's something you'd want in deku.

Thanks!

@wcampbell0x2a (Collaborator)

For read_all performance, check out the following MR: #441

Since deku makes small repeated reads, wrapping your reader in a BufReader (https://doc.rust-lang.org/std/io/struct.BufReader.html) should reduce the read overhead.
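
For illustration, a minimal, untested sketch of that suggestion. The Container type and dump.bin file are made up for the example, and it assumes the from_reader interface from recent deku releases that takes a (reader, bit offset) pair; the Rfc struct from this issue takes its size via ctx and would normally sit inside such a parent type.

use std::fs::File;
use std::io::BufReader;

use deku::prelude::*;

// Hypothetical parent container, just for illustration.
#[derive(Debug, DekuRead, DekuWrite)]
struct Container {
    len: u32,
    #[deku(count = "len")]
    data: Vec<u8>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("dump.bin")?;
    // Buffer the file so deku's many small reads hit memory instead of
    // issuing a syscall per read.
    let mut reader = BufReader::new(file);
    let (_amt_read, container) = Container::from_reader((&mut reader, 0))?;
    println!("read {} payload bytes", container.data.len());
    Ok(())
}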

@MeguminSama (Author)

Thanks for getting back so quickly!

At the moment, our reader is already using a BufReader. We tried doing this in an attempt to speed it up, but unfortunately it's still much too slow when reading the vectors compared to our own read function.

Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

@wcampbell0x2a (Collaborator)

Definitely try out the merge request; it's really slow without that change.

> Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

Sure. I don't have the code in front of me, but I think we only store the leftover as a u8, so we would need to store leftovers in a Vec<u8> if needed. I'd want to allocate that only when read_buffer is used, since on embedded platforms you don't want allocations all the time.

@wcampbell0x2a (Collaborator)

I also don't know; it could be an improvement in our impl of Vec. I think it currently reads and evaluates one element at a time.

@MeguminSama (Author)

I will take a look, thanks :)

wcampbell0x2a added a commit that referenced this issue Sep 18, 2024
* Add read_exact, which can only be used for Vec<u8> but allows faster reading
  that doesn't have ctx or limits.

See #462
wcampbell0x2a added a commit that referenced this issue Sep 20, 2024
* For Vec<u8> when using count, specialize into reading the bytes all
  at once

See #462
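
For reference, a minimal sketch of the shape that benefits from that count specialization, modeled on the struct from the original report (untested; it simply swaps bytes_read for count so the specialized Vec<u8> path applies):

use deku::prelude::*;

// Same layout as the Rfc struct from this issue, but using count so the
// Vec<u8> payload can be read in one block instead of byte by byte.
#[derive(Debug, DekuRead, DekuWrite)]
#[deku(ctx = "size: usize", ctx_default = "0")]
pub struct Rfc {
    #[deku(count = "size")]
    pub data: Vec<u8>,
}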