
Remove possibility of access to contiguous TL buffer #3373

Merged
merged 5 commits into NVIDIA:main
Sep 27, 2021

Conversation

klecki
Contributor

@klecki klecki commented Sep 24, 2021

Description

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactoring (Redesign of existing code that doesn't affect functionality)
  • Other (e.g. Documentation, Tests, Configuration)

What happened in this PR

Keep escape-hatch functions for the purpose of Pipeline output.

This is intended as an intermediate step; the main purpose is to avoid reintroducing access to the underlying buffer into the code-base.

Remove those functions from the TL tests.

Additional information

  • Affected modules and functionalities:
    Buffer, Tensor, Tensor List, Pipeline outputs

  • Key points relevant for the review:

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-2255

@@ -86,71 +86,6 @@ class DLL_PUBLIC Buffer {
inline Buffer() = default;
virtual ~Buffer() = default;

/**
Contributor Author

This was just moved from public: to protected:

Contributor

I don't like it. The Buffer class is really quite useless after this change. How about making the inheritance of TensorList from Buffer protected (or even private) instead?

Contributor

useless -> redundant ?

Contributor Author

Switched to private inheritance.

dali/pipeline/data/tensor.h (outdated, resolved)
@JanuszL JanuszL self-assigned this Sep 24, 2021

/**
* @brief Returns a typed pointer to the underlying storage. If the
* buffer has not been allocated because it does not yet have a type,
Contributor

Suggested change
* buffer has not been allocated because it does not yet have a type,
* tensor has not been allocated because it does not yet have a type,

Contributor

As this is the Tensor API I would stick to using tensor in the docs.

Contributor Author

I just copied the old Doxygen verbatim from tensor, I can adjust if you want.

Contributor

I would appreciate that.

Contributor Author

All removed, will resolve stale comments.

*
* If the buffer already has a valid type, and the calling type does
* not match, the type of the buffer is reset and the underlying
* storage is re-allocated if the buffer does not currently own
Contributor

Suggested change
* storage is re-allocated if the buffer does not currently own
* storage is re-allocated if the tensor does not currently own

@dali-automaton
Collaborator

CI MESSAGE: [3049105]: BUILD STARTED


/**
* @brief Return an un-typed pointer to the underlying storage.
* Buffer must be either empty or have a valid type and be contiguous.
Contributor

Suggested change
* Buffer must be either empty or have a valid type and be contiguous.
* The memory must be either empty or have a valid type and be contiguous.

Contributor Author

done


/**
* @brief Return an un-typed const pointer to the underlying storage.
* Buffer must be either empty or have a valid type and be contiguous.
Contributor

Suggested change
* Buffer must be either empty or have a valid type and be contiguous.
* The memory must be either empty or have a valid type and be contiguous.

Contributor Author

done


// Check the internals
ASSERT_NE(tensor_list->template mutable_data<float>(), nullptr);
ASSERT_TRUE(tensor_list->has_data());
Contributor

Shouldn't you specialize has_data to the TensorList now (as it is implemented in the buffer)?

Contributor Author

I may be able to do something here, but I think it needs to wait for the proper changes.


// Check the internals
ASSERT_EQ(tensor_list.ntensor(), shape.size());
for (size_t i = 0; i < tensor_list.ntensor(); ++i) {
// ASSERT_EQ(ptrs[i], tensor_list.raw_tensor(i));
Contributor

?

Contributor Author

done

ASSERT_EQ(tensor_list.tensor_shape(i), shape[i]);
ASSERT_EQ(tensor_list.tensor_offset(i), offsets[i]);
}

// No memory allocation should have occurred
ASSERT_EQ(ptr, tensor_list.raw_data());
Contributor

Maybe we can save ptrs to each tensor and then check if no reallocation has happened.

Contributor Author

I cannot check that no reallocation happened this way, as it would match only the first pointer. I can add the unsafe call.

DeviceGuard d(src.device_id());
const auto &type_info = src.type_info();

// TODO(klecki): Add a proper test for non-contiguous access when we can have non-contiguous
Contributor

Do you think we will have such a case?
I imagine that DALI should still return raw outputs as contiguous tensors so we can wrap them into CuPy/NumPy or DLPack directly.

Contributor Author

I wrote it mostly as a TODO for the next PR. I guess we won't be returning non-contiguous outputs soon, but either way I will add an error here or test this code path and error out somewhere else.

Comment on lines 68 to 81
const auto &src_shape = src.shape();
auto *dst_buf = static_cast<uint8_t *>(dst);
SmallVector<void *, 256> to;
SmallVector<const void *, 256> from;
SmallVector<int64_t, 256> sizes;
int num_samples = src_shape.num_samples();
sizes.reserve(num_samples);
to.reserve(num_samples);
from.reserve(num_samples);
for (int i = 0; i < num_samples; i++) {
sizes.push_back(src_shape.tensor_size(i));
to.push_back(dst_buf);
dst_buf += sizes[i] * type_info.size();
from.push_back(src.raw_tensor(i));
}

type_info.template Copy<DstBackend, SrcBackend>(to.data(), from.data(), sizes.data(),
num_samples, stream, use_copy_kernel);
}
Contributor

@JanuszL JanuszL Sep 24, 2021

Suggested change
const auto &src_shape = src.shape();
auto *dst_buf = static_cast<uint8_t *>(dst);
SmallVector<void *, 256> to;
SmallVector<const void *, 256> from;
SmallVector<int64_t, 256> sizes;
int num_samples = src_shape.num_samples();
sizes.reserve(num_samples);
to.reserve(num_samples);
from.reserve(num_samples);
for (int i = 0; i < num_samples; i++) {
sizes.push_back(src_shape.tensor_size(i));
to.push_back(dst_buf);
dst_buf += sizes[i] * type_info.size();
from.push_back(src.raw_tensor(i));
}
type_info.template Copy<DstBackend, SrcBackend>(to.data(), from.data(), sizes.data(),
num_samples, stream, use_copy_kernel);
}
const auto &src_shape = src.shape();
SmallVector<const void *, 256> from;
SmallVector<int64_t, 256> sizes;
int num_samples = src_shape.num_samples();
sizes.reserve(num_samples);
from.reserve(num_samples);
for (int i = 0; i < num_samples; i++) {
sizes.push_back(src_shape.tensor_size(i));
from.push_back(src.raw_tensor(i));
}
type_info.template Copy<DstBackend, SrcBackend>(dst, from.data(), sizes.data(),
num_samples, stream, use_copy_kernel);
}

As we have:

template <typename DstBackend, typename SrcBackend>
void TypeInfo::Copy(void *dst, const void** srcs, const Index* sizes, int n,
                    cudaStream_t stream, bool use_copy_kernel) const {

Contributor Author

Done

@dali-automaton
Collaborator

CI MESSAGE: [3049105]: BUILD PASSED

@klecki
Contributor Author

klecki commented Sep 27, 2021

!build

@dali-automaton
Collaborator

CI MESSAGE: [3063967]: BUILD STARTED

"Can only wait on user streams");
DeviceGuard g(dev);
CUDA_CALL(cudaStreamSynchronize(streams_[dev]));
DLL_PUBLIC void WaitForDevice(const dali::Tensor<GPUBackend> &t) {
Contributor

@JanuszL JanuszL Sep 27, 2021

Suggested change
DLL_PUBLIC void WaitForDevice(const dali::Tensor<GPUBackend> &t) {
template <template <typename> class Container>
DLL_PUBLIC void WaitForDevice(const Container<GPUBackend> &t) {

?

Contributor Author

That's the other alternative, I just went with two overloads, I guess a coin flip to decide.

Contributor

Up to you

@dali-automaton
Collaborator

CI MESSAGE: [3063967]: BUILD FAILED

Keep escape-hatch functions for the purpose of Pipeline
output.

This is intended as intermediate step, the main
purpose is to not introduce the access to the underlying
buffer again into the code-base.

Get rid of those functions from TL tests.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
The type was previously implicit in the allocation

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Contributor Author

klecki commented Sep 27, 2021

!build

@dali-automaton
Collaborator

CI MESSAGE: [3064661]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [3064661]: BUILD PASSED

Comment on lines +86 to +89
using Buffer<Backend>::SetGrowthFactor;
using Buffer<Backend>::SetShrinkThreshold;
using Buffer<Backend>::GetGrowthFactor;
using Buffer<Backend>::GetShrinkThreshold;
Contributor

Do we need those 4?

@klecki klecki merged commit ddabf7d into NVIDIA:main Sep 27, 2021
cyyever pushed a commit to cyyever/DALI that referenced this pull request Oct 17, 2021
Change TensorList Buffer inheritance to private,
and reexpose the old API. Keep the buffer-access methods
private.

Add escape-hatch functions for the purpose of Pipeline
output.

This is intended as intermediate step, the main
purpose is to not introduce the access to the underlying
buffer again into the code-base.

Get rid of those functions from TL tests.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>