
Unbounded memory allocations #34

Open
Shnatsel opened this issue Sep 13, 2018 · 13 comments

Comments

@Shnatsel
Contributor

When given a crafted input lewton attempts to allocate enormous amounts of memory. This presents several issues:

  1. This has DoS potential: opening several ogg files for decoding at once may exhaust the memory address space and cause a crash
  2. This prevents any binary depending on lewton from being analyzed with LLVM sanitizers which do not support binaries allocating terabytes of memory

Most decoding libraries face this issue at some point. This is usually solved by limiting the amount of allocated memory to some sane default, and letting people override it if they're really dealing with enormous amounts of data. In Rust we can easily allow the API user to override these limits via the builder pattern.
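To make the suggestion concrete, here is a minimal sketch of what overridable limits via the builder pattern could look like. All names (`DecoderOptions`, `max_comment_size`, `max_comment_count`) and default values are illustrative assumptions, not lewton's actual API:

```rust
/// Hypothetical decode limits with sane defaults, overridable via a builder.
#[derive(Debug, Clone)]
struct DecoderOptions {
    max_comment_size: usize,
    max_comment_count: usize,
}

impl Default for DecoderOptions {
    fn default() -> Self {
        DecoderOptions {
            max_comment_size: 1 << 20, // 1 MiB per comment, an illustrative default
            max_comment_count: 1024,
        }
    }
}

impl DecoderOptions {
    /// Override the per-comment size limit.
    fn max_comment_size(mut self, bytes: usize) -> Self {
        self.max_comment_size = bytes;
        self
    }

    /// Override the number of comments accepted.
    fn max_comment_count(mut self, count: usize) -> Self {
        self.max_comment_count = count;
        self
    }
}

fn main() {
    // Users who really deal with enormous metadata simply raise the defaults.
    let opts = DecoderOptions::default().max_comment_size(64 << 20);
    println!("{} {}", opts.max_comment_size, opts.max_comment_count);
}
```

The point of the builder is that the safe defaults cost callers nothing: only the rare user with unusual files ever has to mention the limits at all.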

See https://libpng.sourceforge.io/decompression_bombs.html for more info on how a similar issue was solved in libpng. See also the Limits struct from flif crate.

We briefly discussed this in #32, but I'm now filing it as a separate issue because it has become a blocker for me. I'm seeing intermittent crashes under the fuzzer that I cannot reproduce, and the inability to use sanitizers prevents me from pinpointing them.

@est31
Member

est31 commented Sep 14, 2018

I'm skeptical about this, for two reasons. If you disagree, please feel free to provide some insight into how you think the two issues below should be addressed.

Rust and its "convenient" heap data structures

If you look at C, you are required to manually calculate the size of anything you allocate and then pass it to a malloc call. If you want to limit the total amount allocated, you just replace the malloc call with one that checks whether the allocation would stay within the limit.

Rust offers very convenient wrappers around malloc. It calculates the sizes for you! The proposed way to address this issue is to re-do that calculation myself and only invoke those Rust-provided wrappers if the result is an acceptable amount of data. That's a bad idea IMO, because it means I need to re-do all the work that Rust is supposed to do for me already. I'd be thrown back to the convenience level of C!

I think a good way to fix this issue is the ability to set custom, fallible allocators. Then I could just set any limit I want. I wonder whether there are any crates that support either custom, fallible allocators or memory limits.

Another problem is that allocation is just so easy to invoke... it can be as close as a collect() or an into() call. It would be really great if there were a lint or similar tool that highlights the spots where allocation can happen.

The challenge of "sane default"

limiting the amount of allocated memory to some sane default,

I'm not really sure whether lewton is a good place for choosing such values. Just look at the simplest example I can think of: Vorbis metadata. It is encoded in a format where a 32-bit number n is stored first, and the next n bytes contain the string. What would be a sane maximum length for the string? I could of course say 640k ought to be enough, but maybe someone has put something really, really long into the metadata? I don't know.
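For reference, parsing such a length-prefixed field with a cap might look like the following simplified sketch. This is not lewton's actual parsing code; the function name and the cap value are assumptions for illustration:

```rust
use std::convert::TryInto;

/// Simplified sketch of reading one length-prefixed Vorbis comment string.
/// A 32-bit little-endian length is followed by that many bytes; the parser
/// rejects lengths above a caller-supplied cap instead of trusting the file.
fn read_comment(input: &[u8], cap: usize) -> Result<&[u8], &'static str> {
    if input.len() < 4 {
        return Err("truncated length field");
    }
    let n = u32::from_le_bytes(input[..4].try_into().unwrap()) as usize;
    if n > cap {
        return Err("comment length exceeds limit");
    }
    input.get(4..4 + n).ok_or("truncated comment body")
}

fn main() {
    // A well-formed 5-byte comment parses fine under a 64 KiB cap.
    let mut data = 5u32.to_le_bytes().to_vec();
    data.extend_from_slice(b"hello");
    assert_eq!(read_comment(&data, 1 << 16).unwrap(), b"hello");

    // A crafted length of u32::MAX is rejected before any allocation happens.
    let bomb = u32::MAX.to_le_bytes();
    assert!(read_comment(&bomb, 1 << 16).is_err());
    println!("ok");
}
```

Note that the rejection happens purely on the length field, before any buffer for the comment body is reserved.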

@Shnatsel
Contributor Author

I think a good way to fix this issue is the ability to set custom, fallible allocators. Then I could just set any limit I want. I wonder whether there are any crates that support either custom, fallible allocators or memory limits.

Sadly, memory allocators are global to the compiled binary, so the limit would also be global. Even if lewton is prepared to deal with failing allocations, client code cannot be expected to recover from them, so the binary would crash on a failed allocation anyway. I don't think this is a reasonable solution.

If you look at C, you are required to calculate sizes of anything you allocate manually, and then pass that to a malloc call. If you want to limit the amount of allocated stuff, you just replace the malloc call with some call that checks whether the amount of allocated stuff is enough.

This is not how C libraries actually operate. They do not impose limits on memory specifically because that tends to be very confusing for the users - it gets very hard to answer the question "will this file decode with default limits or not?".

For example, libpng limits the image size in pixels to 1,000,000 by 1,000,000 by default, or roughly 2^40 pixels. This results in most real-world images still decoding fine, but the peak memory allocation is limited to 2^44 bytes per image in the worst case. That's a lot, but not anywhere near enough to exhaust a 64-bit address space. If the API user expects to deal with images that large - which, by the way, would require 16 terabytes of RAM and is therefore completely impractical for years to come - they can override the default limit.

I'm not really sure whether lewton is a good place for choosing such values. Just look at the simplest example I can think of: Vorbis metadata. It is encoded in a format where a 32-bit number n is stored first, and the next n bytes contain the string. What would be a sane maximum length for the string? I could of course say 640k ought to be enough, but maybe someone has put something really, really long into the metadata? I don't know.

I'd actually allow full-length metadata (a 32-bit length means up to 4GiB per chunk) and limit the number of chunks to 16384 or so. This gives a 2^46 maximum allocation size. Or an even lower number of chunks, because recording 16,000 of them in a single file makes no sense in practice, and if you've put that many into one file you're probably trying to DoS something. And if a user has a really weird use case where they need to process such files, they can simply raise the limit. The builder pattern in Rust makes that a breeze.

But if you don't want to limit the amount of data that can be decoded in any way, yet also don't want to allow allocating terabytes of RAM up front, how about preallocating RAM only up to a certain sane default, and allocating more only once that much data actually gets written? This solves the problem of address space exhaustion with crafted files, still allows decoding huge files if that's what you've actually got (well, probably not terabytes), and it's rusty: Rust's vectors provide a very convenient way to do exactly that.
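The suggested approach can be sketched in a few lines. The threshold value and the function name are illustrative assumptions:

```rust
/// Sketch of the capped-preallocation idea: trust the declared length only up
/// to a threshold when reserving capacity. Beyond that, the Vec grows as real
/// data actually arrives (amortized doubling), so a crafted header alone
/// cannot trigger a huge allocation.
const PREALLOC_CAP: usize = 1 << 20; // 1 MiB, an illustrative default

fn bounded_vec(declared_len: usize) -> Vec<u8> {
    Vec::with_capacity(declared_len.min(PREALLOC_CAP))
}

fn main() {
    // A small declared length is preallocated in full, as before.
    assert!(bounded_vec(4096).capacity() >= 4096);

    // A file claiming 100 GiB of data only gets 1 MiB reserved up front;
    // further growth happens only if that many bytes are actually written.
    let v = bounded_vec(100 << 30);
    assert!(v.capacity() >= 1 << 20);
    println!("capacity capped");
}
```

Because Vec's growth is amortized doubling, the attacker has to supply real input bytes roughly proportional to the memory they want the decoder to commit.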

@est31
Member

est31 commented Dec 16, 2018

Recently, @raphlinus has given an interesting talk about a bunch of things, including low-latency audio programming. He mentioned the idea that in order to check that code indeed avoids allocations, one could create a global allocator that panics/aborts if there is an allocation happening. Something like this (using the global allocator) could be used by lewton as well to ensure that there are no/few allocations. There won't be a static guarantee, but there'll be a dynamic one.
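A minimal sketch of that idea on stable Rust follows. Since the allocator itself must not panic (unwinding can allocate), it aborts instead; the guard flag and names are my own, not from the talk:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

/// A global allocator that aborts if an allocation happens while a region
/// that is supposed to be allocation-free is active.
struct GuardAlloc;

static FORBID_ALLOC: AtomicBool = AtomicBool::new(false);

unsafe impl GlobalAlloc for GuardAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if FORBID_ALLOC.load(Ordering::Relaxed) {
            // Panicking inside the allocator could itself allocate; abort instead.
            std::process::abort();
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: GuardAlloc = GuardAlloc;

fn main() {
    let samples: Vec<f32> = vec![0.5; 1024]; // allocations allowed here

    FORBID_ALLOC.store(true, Ordering::SeqCst);
    // The "hot path": any hidden allocation here would abort the process.
    let peak = samples.iter().cloned().fold(0.0f32, f32::max);
    FORBID_ALLOC.store(false, Ordering::SeqCst);

    println!("peak = {}", peak);
}
```

As noted above, this yields a dynamic rather than static guarantee: the check only fires on code paths the tests or fuzzer actually exercise.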

@est31
Member

est31 commented Mar 4, 2019

I wrote a crate and pushed a new branch that uses it to address #43. I think this is the way forward.

@est31
Member

est31 commented Mar 4, 2019

I'm a bit reluctant though as:

  • the change would break the API or alternatively I'd have to copy the public API and add _bounded variants to each function
  • balloc might need more changes/improvements

@Shnatsel
Contributor Author

Shnatsel commented Mar 7, 2019

As long as the balloc API is not re-exported or otherwise exposed, it should be okay to introduce, since an API change is planned anyway for #40.

I understand the limiting is currently only done for comments, so the scope doesn't overlap with #40?

@Shnatsel
Contributor Author

Shnatsel commented Mar 7, 2019

Also, do I understand correctly that the Vorbis comments for which balloc is currently used are not compressed in any way? In that case you only have to limit the amount of memory you're willing to preallocate without actually writing to it, and just let the vectors automatically increase capacity to accommodate the data once real writes start happening. This way an attacker would need at least a 512MB file to make you allocate 1GB of memory (assuming the preallocation threshold is under 1GB). Best of all, this would avoid the API break altogether.

@est31
Member

est31 commented Mar 7, 2019

In order to become allocation-free, an API change would be needed, yes, but my goal with #40 is only to reduce the allocations as much as possible.

I understand the limiting is currently only done for comments, so the scope doesn't overlap with #40?

Yes, the limiting is currently only done for the comments; I'm not done with it yet. There is minor scope overlap, as allocations should either be removed or ported to balloc. The idea of balloc is that only a few allocations happen, so the extra checking has only negligible performance impact.

In that case you only have to limit the amount of memory you're willing to preallocate without actually writing to it, and just let vectors automatically increase capacity to accommodate the data once real writes start happening. This way you'd need at least a 512Mb file to allocate 1Gb of memory (assuming the preallocation threshold is under 1Gb). Best of all, this would avoid the API break altogether.

You mean something like if comment_length > packet.len() { return Err(); }? I've considered doing that. It would be an okay fix for #43, but it's easy to miss a check and hard to make a check that is really watertight. We'd be back to the C experience where you have to "watch out" not to make any mistakes, both while writing the code and while changing it down the line. Balloc is designed so that the checking is enforced by the type system. Also, if you are not interested in checking at all, balloc has a mode that turns it off at compile time, removing any overhead.

As for the API breakage, any function that needs to allocate requires the ability to have limits passed down to it. Basically, this is the entire public API. Either I create a duplicate for each function, or I break the API and just create duplicates for the most important ones. I prefer the latter.
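To illustrate what "checking enforced by the type system" can mean in general (this is a hypothetical sketch, not balloc's actual API): if the only way to obtain a buffer is through a budget value that gets threaded through the parsing functions, a forgotten check becomes a compile error rather than a code-review item.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical budget type: all allocations must go through `try_vec`,
/// so there is no code path that allocates without being checked.
#[derive(Clone)]
struct AllocBudget {
    remaining: Rc<Cell<usize>>,
}

impl AllocBudget {
    fn new(limit: usize) -> Self {
        AllocBudget { remaining: Rc::new(Cell::new(limit)) }
    }

    /// Reserve `n` bytes from the budget, or fail before allocating anything.
    fn try_vec(&self, n: usize) -> Result<Vec<u8>, &'static str> {
        let left = self.remaining.get();
        if n > left {
            return Err("allocation limit exceeded");
        }
        self.remaining.set(left - n);
        Ok(Vec::with_capacity(n))
    }
}

fn main() {
    let budget = AllocBudget::new(1 << 20); // 1 MiB total for this decode
    assert!(budget.try_vec(512 * 1024).is_ok());
    assert!(budget.try_vec(768 * 1024).is_err()); // would exceed the budget
    println!("budget enforced");
}
```

Threading such a value through the call graph is exactly why the public API has to change: every function that allocates needs access to the budget.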

@est31
Member

est31 commented Mar 8, 2019

The current changes on master are all backwards-compatible. I'll make a final release of the 0.9 series and then do the breaking changes for an upcoming 0.10 release that will hopefully appease your fuzzer, @Shnatsel :).

@Shnatsel
Contributor Author

Shnatsel commented Mar 8, 2019

I mean something like this (overflow checks omitted for brevity):

if total_comments_len + comment_len < limit {
    total_comments_len += comment_len;
    Vec::with_capacity(comment_len)
} else {
    Vec::new()
}

For uncompressed content, an attacker would not be able to make the program allocate N bytes without actually supplying at least (N/2)+1 bytes of data. This could still be attacked by streaming a very long file over the network, but there would no longer be any 15kB files allocating hundreds of terabytes of RAM.
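With the omitted overflow checks restored, the snippet might look like the following (function and variable names are illustrative):

```rust
/// The sketch above with overflow checks restored: use checked_add so a
/// crafted pair of huge lengths cannot wrap the accounting counter around.
fn alloc_for_comment(total: &mut usize, comment_len: usize, limit: usize) -> Vec<u8> {
    match total.checked_add(comment_len) {
        Some(new_total) if new_total < limit => {
            *total = new_total;
            Vec::with_capacity(comment_len)
        }
        // On overflow or over-limit, fall back to an empty Vec that grows
        // only as data is actually written.
        _ => Vec::new(),
    }
}

fn main() {
    let mut total = 0usize;

    // A normal-sized comment gets its capacity reserved up front.
    let v = alloc_for_comment(&mut total, 4096, 1 << 20);
    assert!(v.capacity() >= 4096);

    // A bomb length gets no up-front allocation at all.
    let bomb = alloc_for_comment(&mut total, usize::MAX, 1 << 20);
    assert_eq!(bomb.capacity(), 0);

    println!("ok");
}
```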

@est31
Member

est31 commented Mar 10, 2019

@Shnatsel yeah, that's the approach you'd use in a language like C, but as I wrote above, I'd prefer the checks to live one abstraction layer below.

@Shnatsel
Contributor Author

There is now support for custom allocators in Vec on nightly: rust-lang/rust#78461

This should make keeping track of allocations a great deal easier.

@est31
Member

est31 commented Nov 22, 2020

@Shnatsel if you hover over the PR's description, you'll see I was the first to add the 🎉 emoji reaction :). Very glad it's finally happening.
