
feat(scale): implement max length constraint for decoding bytes #2676

Closed
wants to merge 1 commit

Conversation

edwardmack (Member)

Changes

  • Add a maxLength constraint to decodeState: an error is raised if the decoded length is greater than decodeState.maxLength, to avoid attempting to decode very long byte values.
  • Add a constant defaultMaxLength of 16 GB to use as the default maxLength value, so that the decoder will not attempt to decode bytes encoded with a length greater than 16 GB.
  • Update NewDecoder and Unmarshal to accept variadic options, following the pattern explained in https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis
  • Add OptionMaxLength for overriding the default maxLength.
  • Add unit tests covering use of OptionMaxLength with NewDecoder and Unmarshal.
  • Add unit tests for decoding bytes with a length greater than defaultMaxLength.

Tests

go test github.com/ChainSafe/gossamer/pkg/scale/

Issues

Fixes #2629, which was causing out-of-memory errors when decoding very long byte encodings.

Primary Reviewer

@timwu20

@codecov

codecov bot commented Jul 15, 2022

Codecov Report

Merging #2676 (8bc1697) into development (3a471d9) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@               Coverage Diff               @@
##           development    #2676      +/-   ##
===============================================
+ Coverage        62.98%   62.99%   +0.01%     
===============================================
  Files              211      211              
  Lines            27136    27155      +19     
===============================================
+ Hits             17091    17107      +16     
- Misses            8482     8484       +2     
- Partials          1563     1564       +1     
Flag Coverage Δ
unit-tests 62.99% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/scale/decode.go 66.55% <100.00%> (+1.14%) ⬆️
dot/network/inbound.go 92.98% <0.00%> (-7.02%) ⬇️
dot/network/block_announce.go 58.40% <0.00%> (-4.81%) ⬇️
lib/runtime/wasmer/imports.go 55.26% <0.00%> (+0.05%) ⬆️
dot/network/message.go 68.39% <0.00%> (+2.59%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3a471d9...8bc1697.

@@ -56,8 +56,10 @@ func indirect(dstv reflect.Value) (elem reflect.Value) {
return
}

const defaultMaxLength = 16 * 1024 * 1024 * 1024 // 16GB
@qdm12 qdm12 (Contributor) commented Jul 15, 2022

I would suggest we don't limit the decoding by default. I think that's also why the SCALE spec doesn't specify a limit, since 20 GB of SCALE data is still, well, valid.

Also, if the writer is a file, 16 GB+ is fine. The problem is with memory, depending on the environment.

 // Unmarshal takes data and a destination pointer to unmarshal the data to.
-func Unmarshal(data []byte, dst interface{}) (err error) {
+func Unmarshal(data []byte, dst interface{}, options ...func(*Decoder)) (err error) {
Contributor:

  • You should define a type Option func(decoder *Decoder) at global scope.
  • Actually, that option function should return an error, in case of future conflicting options for example (since this is meant to be a stable package Go API). The error should be checked in the for _, option := range options { loop.

}
return
}

// OptionMaxLength sets the maximum length of the data to be decoded
func OptionMaxLength(maxLength int) func(*Decoder) {
Contributor:

Make this argument a uint, to ensure the user cannot pass in negative values.

type decodeState struct {
io.Reader
maxLength int
Contributor:

That's going to be a niche nit...

Getting the size through file.Stat() in Go returns an int64; maybe we should make this maxLength an int64, in case e.g. the writer is a file and the OS is 32-bit (where int = int32).
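A minimal sketch of that nit, assuming a stripped-down decodeState (not the real gossamer struct). os.FileInfo.Size() does return int64 in the standard library, so an int64 field accepts it directly even on 32-bit platforms:

```go
package main

import (
	"fmt"
	"os"
)

// decodeState with maxLength widened to int64, so that a file-size-derived
// limit fits even where int is 32 bits.
type decodeState struct {
	maxLength int64
}

func main() {
	f, err := os.CreateTemp("", "scale-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		panic(err)
	}
	// info.Size() is int64; no conversion (and no 32-bit truncation) needed.
	ds := decodeState{maxLength: info.Size()}
	fmt.Println(ds.maxLength >= 0) // true
}
```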

@@ -540,6 +562,9 @@ func (ds *decodeState) decodeBytes(dstv reflect.Value) (err error) {
if err != nil {
return
}
if length > ds.maxLength {
return fmt.Errorf("could not decode length %v greater than max length %v", length, ds.maxLength)
Contributor:

Make this a sentinel error. I understand this package doesn't use sentinel errors, but that would be one more step toward #2631.

Comment on lines +131 to +133
return func(options *Decoder) {
options.maxLength = maxLength
}
Contributor:

An error would be useful here to prevent the user from setting a limit of 0, for example.

Comment on lines +164 to +166
if !reflect.DeepEqual(dst, tt.in) {
t.Errorf("decodeState.unmarshal() = %#v, want %#v", dst, tt.in)
}
Contributor:

Use assert.Equal.

(Also, the t.Errorf message is wrong here.)

@@ -70,6 +71,103 @@ func Test_decodeState_decodeBytes(t *testing.T) {
}
}

var maxLengthDecodeTests = tests{
Contributor:

Make this a function returning a tests value, so we can be (more) sure there is no mutation shared between the tests using it.
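That suggestion might look like this; the test and tests types and the entries are illustrative stand-ins for the table-test types in the real file:

```go
package main

import "fmt"

// test mirrors the shape of a table-test entry; fields are illustrative.
type test struct {
	name string
	in   []byte
}

type tests []test

// maxLengthDecodeTests returns a fresh slice on each call, so one test
// mutating an entry cannot affect another test that uses the same table.
func maxLengthDecodeTests() tests {
	return tests{
		{name: "under limit", in: []byte{1, 2, 3}},
		{name: "over limit", in: make([]byte, 64)},
	}
}

func main() {
	a := maxLengthDecodeTests()
	a[0].name = "mutated"
	b := maxLengthDecodeTests()
	fmt.Println(b[0].name) // "under limit": a fresh copy, unaffected by the mutation
}
```

With a package-level var, the mutation of a[0] above would leak into every later use of the table.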

}

func Test_decodeState_decodeBytes_maxLength(t *testing.T) {

Contributor:

Suggested change:
+	t.Parallel()

@kishansagathiya kishansagathiya (Contributor) left a comment:

  • Do we have places where Unmarshal with options is getting used?
    I like the idea of the option function. It is helpful when we have a bunch of options.

-func NewDecoder(r io.Reader) (d *Decoder) {
+func NewDecoder(r io.Reader, options ...func(*Decoder)) (d *Decoder) {
 	d = &Decoder{
 		decodeState{r},
Contributor:

What are the options here for, exactly? Are they like helper functions for decoding? In that case, maybe they could be passed only here, saved as a property of the decoder, and used in the Decode function. If that makes sense, the Decode function would not need to be passed options.

Contributor:

That makes sense. Although you would still have to pass options to scale.Unmarshal, which creates its decoder internally, AFAIK.

@edwardmack (Member, Author):

Closed, replaced by PR #2683

@edwardmack edwardmack closed this Jul 20, 2022
@edwardmack edwardmack deleted the ed/scale_memory_bug branch November 11, 2022 16:06
Development

Successfully merging this pull request may close these issues.

Bug: payload makes scale decoder go out of memory
3 participants