fix: don't panic in IPC reader if struct child arrays have different lengths #6417

Merged Sep 20, 2024 (3 commits)

Contributor

@alexwilcoxson-rel alexwilcoxson-rel commented Sep 18, 2024

Which issue does this PR close?

Closes #6416

Rationale for this change

While we don't expect to receive invalid IPC data via Flight, we would like to avoid panics where possible.

What changes are included in this PR?

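Use StructArray::try_new instead of StructArray::from in the IPC reader. The StructArray::from impls use StructArray::new, which calls try_new and unwraps the result, so child arrays with mismatched lengths cause a panic instead of an error.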
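
For readers unfamiliar with the pattern, here is a minimal std-only sketch of the difference between a fallible constructor and one that unwraps. `ToyStruct`, `LengthError`, and `read_struct` are hypothetical names made up for illustration; they are not arrow-rs types, and the real validation in arrow-rs covers more than child lengths.

```rust
use std::fmt;

/// Hypothetical stand-in for an error type; not the real `ArrowError`.
#[derive(Debug, PartialEq)]
struct LengthError {
    field: String,
    expected: usize,
    got: usize,
}

impl fmt::Display for LengthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Incorrect array length for field \"{}\", expected {} got {}",
            self.field, self.expected, self.got
        )
    }
}

/// Toy struct array: named child columns that must all share one length.
#[derive(Debug)]
struct ToyStruct {
    children: Vec<(String, Vec<i32>)>,
}

impl ToyStruct {
    /// Fallible constructor: reports a length mismatch as an `Err`.
    fn try_new(children: Vec<(String, Vec<i32>)>) -> Result<Self, LengthError> {
        // The first child defines the expected length (0 if there are none).
        let expected = children.first().map(|(_, c)| c.len()).unwrap_or(0);
        for (name, child) in &children {
            if child.len() != expected {
                return Err(LengthError {
                    field: name.clone(),
                    expected,
                    got: child.len(),
                });
            }
        }
        Ok(Self { children })
    }

    /// Convenience constructor: same validation, but panics on invalid input.
    fn new(children: Vec<(String, Vec<i32>)>) -> Self {
        Self::try_new(children).unwrap()
    }
}

/// Reader-side helper: `?` turns bad input into an error the caller can
/// handle, rather than a panic.
fn read_struct(children: Vec<(String, Vec<i32>)>) -> Result<ToyStruct, LengthError> {
    let array = ToyStruct::try_new(children)?;
    Ok(array)
}

fn main() {
    // Valid input: both constructors behave the same.
    let ok = ToyStruct::new(vec![("a".to_string(), vec![1, 2])]);
    println!("built {} child column(s)", ok.children.len());

    // Invalid input: child "b" is one element short.
    let bad = vec![
        ("a".to_string(), vec![1, 2, 3, 4]),
        ("b".to_string(), vec![5, 6, 7]),
    ];
    match read_struct(bad) {
        Ok(array) => println!("read {} children", array.children.len()),
        Err(e) => println!("error: {e}"),
    }
}
```

Calling `ToyStruct::new` with the mismatched input would panic inside `unwrap`, which is the behavior a reader of untrusted bytes wants to avoid.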

Are there any user-facing changes?

No

@github-actions github-actions bot added the arrow Changes to the arrow crate label Sep 18, 2024
@alexwilcoxson-rel
Contributor Author

Any advice on how to generate this type of invalid batch with arrow-rs for testing?

@@ -31,7 +31,9 @@ use std::io::{BufReader, Read, Seek, SeekFrom};
use std::sync::Arc;

use arrow_array::*;
use arrow_buffer::{ArrowNativeType, BooleanBuffer, Buffer, MutableBuffer, ScalarBuffer};
use arrow_buffer::{
Contributor

Can we please get a test for this fix so that we don't break it again during some future refactoring?

Contributor Author

Just added. I was able to write the invalid array to an IPC buffer using ArrayDataBuilder::build_unchecked.
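
As a rough illustration of why an unchecked build helps here, the std-only sketch below models a builder with a validating `build` and an `unsafe` `build_unchecked` that skips validation. `ToyData` and `ToyBuilder` are hypothetical stand-ins invented for this sketch, not arrow-rs's `ArrayData` or `ArrayDataBuilder`, and the real types track much more state (buffers, nulls, offsets).

```rust
/// Hypothetical, simplified model of a struct's raw data: a declared length
/// plus child columns. This is not arrow-rs's `ArrayData`.
#[derive(Debug, Clone, PartialEq)]
struct ToyData {
    len: usize,
    children: Vec<Vec<i32>>,
}

impl ToyData {
    /// Checks the invariant that every child has the declared length.
    fn validate(&self) -> Result<(), String> {
        for (i, child) in self.children.iter().enumerate() {
            if child.len() != self.len {
                return Err(format!(
                    "child {i} has length {}, expected {}",
                    child.len(),
                    self.len
                ));
            }
        }
        Ok(())
    }
}

/// Hypothetical builder with a checked and an unchecked finishing step.
#[derive(Default)]
struct ToyBuilder {
    len: usize,
    children: Vec<Vec<i32>>,
}

impl ToyBuilder {
    fn len(mut self, len: usize) -> Self {
        self.len = len;
        self
    }

    fn add_child(mut self, child: Vec<i32>) -> Self {
        self.children.push(child);
        self
    }

    /// Checked build: refuses to produce data that violates the invariant.
    fn build(self) -> Result<ToyData, String> {
        let data = ToyData { len: self.len, children: self.children };
        data.validate()?;
        Ok(data)
    }

    /// Unchecked build: skips validation. Marked `unsafe` because the caller
    /// takes responsibility for the invariant.
    unsafe fn build_unchecked(self) -> ToyData {
        ToyData { len: self.len, children: self.children }
    }
}

fn main() {
    // The checked path rejects mismatched children, so it cannot be used to
    // produce invalid test input.
    let checked = ToyBuilder::default()
        .len(4)
        .add_child(vec![1, 2, 3, 4])
        .add_child(vec![5, 6, 7])
        .build();
    println!("checked build: {checked:?}");

    // The unchecked path lets a test construct deliberately invalid data.
    // SAFETY: this toy type never reads memory based on `len`.
    let invalid = unsafe {
        ToyBuilder::default()
            .len(4)
            .add_child(vec![1, 2, 3, 4])
            .add_child(vec![5, 6, 7])
            .build_unchecked()
    };

    // A reader that validates should report an error for this data.
    println!("reader-side validation: {:?}", invalid.validate());
}
```

The point of the sketch is only the shape of the test: produce invalid data through the unchecked path, then assert that the validating side returns an error rather than panicking.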

Contributor

@alamb alamb left a comment

Thank you for the contribution @alexwilcoxson-rel

Contributor

@alamb alamb left a comment

Thank you @alexwilcoxson-rel -- this looks great to me

@@ -2235,4 +2237,50 @@ mod tests {

assert_eq!(batch, roundtrip_batch);
}

#[test]
fn test_invalid_struct_array_ipc_read_errors() {
Contributor

I verified that this test panics without the code changes in this PR, as expected:

` value: InvalidArgumentError("Incorrect array length for StructArray field \"b\", expected 4 got 3")
stack backtrace:
   0: rust_begin_unwind
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
   1: core::panicking::panic_fmt
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:74:14
   2: core::result::unwrap_failed
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/result.rs:1679:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/result.rs:1102:23
   4: arrow_array::array::struct_array::StructArray::new
             at /Users/andrewlamb/Software/arrow-rs/arrow-array/src/array/struct_array.rs:90:9
   5: <arrow_array::array::struct_array::StructArray as core::convert::From<alloc::vec::Vec<(alloc::sync::Arc<arrow_schema::field::Field>,alloc::sync::Arc<dyn arrow_array::array::Array>)>>>::from
             at /Users/andrewlamb/Software/arrow-rs/arrow-array/src/array/struct_array.rs:401:9
   6: arrow_ipc::reader::create_array
             at ./src/reader.rs:167:17
   7: arrow_ipc::reader::read_record_batch_impl

@alamb alamb merged commit 8ab18fd into apache:master Sep 20, 2024
26 checks passed
@alamb alamb changed the title fix: don't panic in IPC reader if struct child arrays have different … fix: don't panic in IPC reader if struct child arrays have different lengths Sep 20, 2024
Successfully merging this pull request may close these issues.

Invalid struct arrays in IPC data causes panic during read
2 participants