Use the whole frame when writing rows. #17094

Merged
gianm merged 3 commits into apache:master from frame-writer-use-more-memory
Sep 19, 2024

Conversation

gianm
Contributor

@gianm gianm commented Sep 17, 2024

This patch makes the following adjustments to enable writing larger single rows to frames:

  1. RowBasedFrameWriter: Max out allocation size on the final doubling.
    For example, if the final allocation would "naturally" be 1 MiB but
    the max frame size is 900 KiB, use 900 KiB rather than failing the
    1 MiB allocation.

  2. AppendableMemory: In reserveAdditional, release the last block if it
    is empty. This eliminates waste when a frame writer uses a
    successive-doubling approach to find the right allocation size.

  3. ArenaMemoryAllocator: Reclaim memory from the last allocation when
    the last allocation is closed.

Prior to these changes, a single row could be much smaller than the frame size and still fail to be added to the frame.
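
To illustrate points (2) and (3), here is a minimal sketch of an arena that reclaims space when its most recent allocation is closed. It is not the actual ArenaMemoryAllocator or AppendableMemory code: the class `ArenaSketch`, its methods, and the byte sizes are all hypothetical and chosen only for illustration. The loop in `main` mimics a writer that tries successively doubled blocks and releases each empty trial block before reserving a larger one.

```java
import java.util.Optional;

// Hypothetical sketch for illustration only; not Druid's ArenaMemoryAllocator.
final class ArenaSketch
{
  private final long capacity;
  private long position = 0; // offset of the first byte not yet handed out

  ArenaSketch(long capacity)
  {
    this.capacity = capacity;
  }

  long available()
  {
    return capacity - position;
  }

  // Hands out the next `bytes` of the arena, or empty if they do not fit.
  Optional<Allocation> allocate(long bytes)
  {
    if (bytes < 0 || bytes > available()) {
      return Optional.empty();
    }
    Allocation allocation = new Allocation(position, bytes);
    position += bytes;
    return Optional.of(allocation);
  }

  final class Allocation implements AutoCloseable
  {
    final long start;
    final long size;
    private boolean closed = false;

    private Allocation(long start, long size)
    {
      this.start = start;
      this.size = size;
    }

    @Override
    public void close()
    {
      if (closed) {
        return;
      }
      closed = true;
      // Reclaim only when this allocation ends at the arena's current position,
      // i.e. it is the most recent one. Earlier allocations are not reclaimed.
      if (start + size == position) {
        position = start;
      }
    }
  }

  public static void main(String[] args)
  {
    ArenaSketch arena = new ArenaSketch(900);
    // Trial blocks from successive doubling; each is released before the next is reserved.
    for (long size : new long[]{128, 256, 512}) {
      try (Allocation trial = arena.allocate(size).orElseThrow(IllegalStateException::new)) {
        // Pretend the row did not fit in this trial block.
      }
    }
    System.out.println("available after trials: " + arena.available());
  }
}
```

With these made-up numbers, the three trial blocks total 896 bytes. If closing them did not return space to the arena, only 4 of the 900 bytes would remain, and a row needing far less than 900 bytes could still fail to fit.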

@gianm gianm added this to the 31.0.0 milestone Sep 17, 2024
@gianm gianm marked this pull request as ready for review September 17, 2024 20:35
@github-actions github-actions bot added the Area - Batch Ingestion and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Sep 18, 2024
Contributor

@LakshSingla LakshSingla left a comment

I thought I fixed the point mentioned in (1) elsewhere

@gianm
Contributor Author

gianm commented Sep 19, 2024

> I thought I fixed the point mentioned in (1) elsewhere

I think you mean this change in AppendableMemory: https://github.com/apache/druid/pull/15987/files#r1512928070

The issue (1) from my list is a similar thing in RowBasedFrameWriter.
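
For illustration, the capping behavior described in (1) can be sketched roughly as follows. This is not the RowBasedFrameWriter code: `CappedDoublingSketch` and `chooseAllocationSize` are hypothetical names, and the sketch only shows the idea of clamping the final doubling to the maximum frame size instead of failing.

```java
import java.util.OptionalLong;

// Hypothetical sketch for illustration only; not Druid's RowBasedFrameWriter.
final class CappedDoublingSketch
{
  /**
   * Finds an allocation size for a row by successive doubling, starting at
   * initialBytes. When doubling would overshoot maxFrameBytes, the size is
   * capped at maxFrameBytes rather than treated as a failure.
   * Returns empty only if the row is larger than maxFrameBytes.
   */
  static OptionalLong chooseAllocationSize(long initialBytes, long requiredBytes, long maxFrameBytes)
  {
    if (initialBytes <= 0 || requiredBytes < 0 || maxFrameBytes <= 0) {
      throw new IllegalArgumentException("invalid sizes");
    }
    if (requiredBytes > maxFrameBytes) {
      return OptionalLong.empty(); // the row genuinely does not fit in a frame
    }
    long size = Math.min(initialBytes, maxFrameBytes);
    while (size < requiredBytes) {
      // Double, but never past the limit. The comparison against half the
      // limit also avoids overflow from size * 2.
      size = size > maxFrameBytes / 2 ? maxFrameBytes : size * 2;
    }
    return OptionalLong.of(size);
  }

  public static void main(String[] args)
  {
    long kib = 1024;
    // Doubling from 128 KiB goes 256, 512, then would hit 1024 KiB; it is capped at 900 KiB.
    System.out.println(chooseAllocationSize(128 * kib, 800 * kib, 900 * kib));
  }
}
```

With an uncapped doubling, the same 800 KiB row would request 1 MiB and be rejected even though it fits within the 900 KiB limit.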

@gianm gianm merged commit 3d45f98 into apache:master Sep 19, 2024
90 checks passed
@gianm gianm deleted the frame-writer-use-more-memory branch September 19, 2024 07:42