Fix parsing numpy header #4903

Merged on Jun 12, 2023 (2 commits)

stiepan (Member) commented Jun 12, 2023

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

In #4897, I moved the calculation of the header start pointer (header = token_mem.get() + token_len;) further down in the function, but the statement that sets the null-termination character stayed above it. As a result, the \0 is written token_len bytes too early and truncates the header.
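
The effect of this kind of offset error can be modelled outside of DALI. The sketch below is a simplified Python illustration, not the C++ code from the PR: the function name terminate_header, the buggy flag, and the 10-byte token layout (magic string, version, 2-byte length) are assumptions made for the example.

```python
def terminate_header(buf: bytearray, token_len: int, header_len: int, buggy: bool) -> bytes:
    """Write a NUL terminator and return the header as a C-style string.

    Hypothetical model: `buf` holds the token followed by the header,
    plus one spare byte for the terminator.
    """
    header_start = token_len  # the header begins right after the token
    if buggy:
        # Offset computed as if the header started at the beginning of the
        # buffer, so the terminator lands token_len bytes too early.
        nul_pos = header_len
    else:
        nul_pos = header_start + header_len
    buf[nul_pos] = 0
    # Emulate reading a NUL-terminated string starting at the header pointer.
    return bytes(buf[header_start:]).split(b"\x00", 1)[0]


if __name__ == "__main__":
    token = b"\x93NUMPY\x01\x00\x10\x00"   # assumed 10-byte token
    header = b"{'shape': (3,)}\n"          # shortened header for illustration
    for buggy in (False, True):
        buf = bytearray(token + header + b" ")
        print(buggy, terminate_header(buf, len(token), len(header), buggy))
```

With the misplaced terminator, the parser sees only the first header_len - token_len bytes of the header, which is why the problem stays hidden whenever the cut-off part contains nothing the parser needs.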

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@JanuszL JanuszL self-assigned this Jun 12, 2023
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
stiepan (Member, Author) commented Jun 12, 2023

!build

dali-automaton (Collaborator):

CI MESSAGE: [8610899]: BUILD STARTED

Comment on lines +206 to +207
ndims = list(range(33))
with tempfile.TemporaryDirectory(prefix=gds_data_root) as test_data_root:
mzient (Contributor) commented Jun 12, 2023

Do we really need to test all of them? Wouldn't a small subset like 1, 2, 3, 7, 16, 32 (or something more informed) do? Of course, if it doesn't affect the test time in a substantial way, we can keep all of them.

stiepan (Member, Author):

The most interesting case is 21, as it has minimal padding after the header, which makes it sensitive to small offset errors. But the test is very simple and lightweight, so I went for all the cases.
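
To see how the padding after the header depends on the header text, it can be measured directly. The sketch below assumes the NPY version 1.0 layout (6-byte magic string, 2 version bytes, 2-byte little-endian header length, then a header padded with spaces and ended by a newline so that the total is a multiple of 64). Both helpers, build_npy_v1_prefix and header_padding, are hypothetical and simplified; real writers may add extra padding, so the numbers need not match the files generated by the test in this PR.

```python
import struct

MAGIC = b"\x93NUMPY"
ALIGN = 64  # alignment assumed from the NPY format description


def build_npy_v1_prefix(dict_text: str) -> bytes:
    """Hypothetical helper: magic + version 1.0 + length field + padded header."""
    body = dict_text.encode("latin1")
    fixed = len(MAGIC) + 2 + 2  # magic, version bytes, 2-byte length field
    # Pad with spaces so the whole prefix, including the final newline,
    # is a multiple of ALIGN.
    pad = -(fixed + len(body) + 1) % ALIGN
    header = body + b" " * pad + b"\n"
    return MAGIC + b"\x01\x00" + struct.pack("<H", len(header)) + header


def header_padding(prefix: bytes) -> int:
    """Count the padding spaces between the header text and the final newline."""
    if prefix[:6] != MAGIC or prefix[6:8] != b"\x01\x00":
        raise ValueError("not an NPY v1.0 prefix")
    (header_len,) = struct.unpack("<H", prefix[8:10])
    header = prefix[10:10 + header_len]
    if not header.endswith(b"\n"):
        raise ValueError("header is not newline-terminated")
    stripped = header[:-1].rstrip(b" ")
    return len(header) - 1 - len(stripped)


if __name__ == "__main__":
    for ndim in (1, 2, 3):
        shape = (1,) * ndim  # illustrative shapes, not the ones used in the PR
        text = "{'descr': '<f4', 'fortran_order': False, 'shape': %r, }" % (shape,)
        prefix = build_npy_v1_prefix(text)
        print(ndim, len(prefix), header_padding(prefix))
```

The less padding a header has, the fewer spare bytes there are to absorb a misplaced terminator, which is the sensitivity described in the reply above.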

dali-automaton (Collaborator):

CI MESSAGE: [8610899]: BUILD PASSED

@stiepan stiepan merged commit 7532f67 into NVIDIA:main Jun 12, 2023
stiepan added a commit that referenced this pull request Jun 12, 2023
* Fix offset at which a null character is added in the header buffer
* Add a test aimed to catch such bugs in header reading and parsing

---------

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
JanuszL pushed a commit to JanuszL/DALI that referenced this pull request Oct 13, 2023
* Fix offset at which a null character is added in the header buffer
* Add a test aimed to catch such bugs in header reading and parsing

---------

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>