Add deferred deallocation to `cuda_vm_resource`. #3154

mzient · 2021-07-15T16:46:15Z

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Description

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactoring (Redesign of existing code that doesn't affect functionality)
Other (e.g. Documentation, Tests, Configuration)

What happened in this PR

Factor out deferred deallocation from pool resource to a standalone file.
Add deferred deallocation to cuda_vm_resource.
Add more asserts to free_tree.
Improve free_tree API
Flush deallocations in tests, where necessary.

Additional information

Affected modules and functionalities:

Key points relevant for the review:

Checklist

Tests

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: DALI-2190

* Factor out deferred deallocation from pool resource to a standalone file.
* Add deferred deallocation to `cuda_vm_resource`.
* Add more asserts to `free_tree`.
* Improve `free_tree` API.
* Flush deallocations in tests, where necessary.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

dali-automaton · 2021-07-15T16:48:33Z

CI MESSAGE: [2592782]: BUILD STARTED

mzient · 2021-07-15T18:06:02Z

include/dali/core/mm/cuda_vm_resource.h

+    ptr = free_mapped_.get_specific_block(va, size);
+    assert(ptr == va);
+    stat_take_free(size);
+    mark_as_unavailable(ptr, size);


This was the bug. Previously this code was ptr = try_get_mapped(size, alignment). If it somehow got a block that wasn't entirely covered by va, the bookkeeping of what is free and what is not became inconsistent and the integrity of the pool was destroyed. The bug was exposed when flush_deferred() wasn't followed by another attempt to get already-mapped memory. Now it is hidden again (by the second attempt to get memory from free_mapped), but it was dangerous anyway, since the agreement of blocks from free_va and free_mapped is coincidental.
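
The difference between "any block that fits" and "exactly this range" can be illustrated with a toy free list. This is a minimal sketch, not DALI's free_tree: `toy_free_list` and its `std::map` layout are assumptions made for illustration, and only the name `get_specific_block` is taken from the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical, simplified free list: maps block start -> block end.
// It only illustrates the idea of removing one specific range.
struct toy_free_list {
  std::map<char *, char *> blocks;  // disjoint [start, end) ranges

  void put(char *start, std::size_t size) { blocks[start] = start + size; }

  // Removes exactly [start, start + size) if it lies inside one free block.
  // Returns start on success, nullptr otherwise (nothing is modified then).
  char *get_specific_block(char *start, std::size_t size) {
    auto it = blocks.upper_bound(start);  // first block starting after `start`
    if (it == blocks.begin()) return nullptr;
    --it;                                 // candidate block that may contain `start`
    char *blk_start = it->first;
    char *blk_end = it->second;
    if (start + size > blk_end) return nullptr;  // range not fully covered
    blocks.erase(it);
    if (blk_start < start) blocks[blk_start] = start;            // left remainder
    if (start + size < blk_end) blocks[start + size] = blk_end;  // right remainder
    return start;
  }
};
```

Because the caller names the range, a second free list that tracks the same addresses can be updated with the same (start, size) pair, and a mismatch shows up as a nullptr instead of silently diverging bookkeeping.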

dali-automaton · 2021-07-15T18:14:54Z

CI MESSAGE: [2592782]: BUILD PASSED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

jantonguirao · 2021-07-19T08:18:40Z

dali/core/mm/cuda_vm_resource_test.cc

@@ -28,7 +28,7 @@ namespace test {
 class VMResourceTest : public ::testing::Test {
 public:
  void TestAlloc() {
-    cuda_vm_resource res;
+    cuda_vm_resource_base res;


Just a thought: when I read cuda_vm_resource_base it reads to me as if it was some kind of base implementation that is not meant to be instantiated directly. I don't have a much better suggestion for a name, so feel free to ignore.

Well, it's not meant in the sense that it doesn't have a very important feature (deferred deallocation) - but it's nonetheless usable and testable.

jantonguirao · 2021-07-19T08:20:31Z

dali/core/mm/cuda_vm_resource_test.cc

@@ -187,6 +187,7 @@ class VMResourceTest : public ::testing::Test {
    void *p1 = res.allocate(block_size);  // allocate one block
    void *p2 = res.allocate(block_size);  // allocate another block
    res.deallocate(p2, block_size);       // now free the second block
+    res.flush_deferred();


Now looking at this, is res meant to be cuda_vm_resource_base or the one with deferred deallocation (cuda_vm_resource)?

This test uses the one with deferred deallocation - that's why this line is needed - otherwise we might not see the result of the deallocation and the test expectations will fail.
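
Why the explicit flush matters can be shown with a small, synchronous sketch. Everything here is hypothetical (`counting_resource`, `toy_deferred_resource`, the `pending` queue); only the method name `flush_deferred` comes from the diff, and how the real resource schedules its deferred work is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Hypothetical upstream resource that counts live allocations.
struct counting_resource {
  int live = 0;
  void *allocate(std::size_t bytes) { ++live; return std::malloc(bytes); }
  void deallocate(void *p, std::size_t) { --live; std::free(p); }
};

// Hypothetical wrapper: deallocate() only queues the request;
// flush_deferred() hands the queued blocks back to the upstream resource.
template <typename Upstream>
struct toy_deferred_resource {
  Upstream upstream;
  std::vector<std::pair<void *, std::size_t>> pending;

  void *allocate(std::size_t bytes) { return upstream.allocate(bytes); }
  void deallocate(void *p, std::size_t bytes) { pending.emplace_back(p, bytes); }
  void flush_deferred() {
    for (auto &d : pending) upstream.deallocate(d.first, d.second);
    pending.clear();
  }
  ~toy_deferred_resource() { flush_deferred(); }
};
```

In this sketch the upstream counter does not change when `deallocate` returns; it changes only after `flush_deferred()`, which is the situation the test line above guards against.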

jantonguirao · 2021-07-19T08:28:19Z

include/dali/core/mm/cuda_vm_resource.h

+  void synchronize(span<const dealloc_params> params) {
+    assert(device_ordinal_ >= 0 && "synchronize called before the resource initialization");
+    for (auto &p : params) {
+      if (p.sync_device >= 0 && p.sync_device < device_ordinal_)


Suggested change

-      if (p.sync_device >= 0 && p.sync_device < device_ordinal_)
+      if (p.sync_device >= 0 && p.sync_device != device_ordinal_)

Shouldn't it be equal?
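
The two conditions differ only for devices whose ordinal is greater than the resource's own. A minimal sketch that isolates the predicates (the function names are hypothetical; the expressions are copied from the diff and from the suggestion):

```cpp
#include <cassert>

// Condition as written in the diff.
constexpr bool needs_sync_lt(int sync_device, int device_ordinal) {
  return sync_device >= 0 && sync_device < device_ordinal;
}

// Condition from the suggested change.
constexpr bool needs_sync_ne(int sync_device, int device_ordinal) {
  return sync_device >= 0 && sync_device != device_ordinal;
}
```

With `device_ordinal == 1`, both predicates accept `sync_device == 0` and reject `1` and `-1`, but only the suggested one accepts `sync_device == 2`.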

jantonguirao · 2021-07-19T08:35:58Z

include/dali/core/mm/cuda_vm_resource.h

  void *try_get_mapped(size_t size, size_t alignment) {
    char *ptr = static_cast<char*>(free_mapped_.get(size, alignment));
    if (ptr) {
      stat_take_free(size);
-      free_va_.get_specific_block(ptr, ptr + size);
+      auto *va = free_va_.get_specific_block(ptr, size);
+      (void)va;


is this to silence a warning in release build?

Yes - the assertion will compile to nothing.
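
The pattern in isolation, as a minimal sketch with hypothetical names (`find_block`, `take_block`): when `NDEBUG` is defined, `assert` expands to nothing, so a variable that is read only inside the assertion would otherwise trigger an unused-variable warning.

```cpp
#include <cassert>

// Hypothetical stand-in for a lookup whose result is only checked in debug builds.
inline int *find_block(int *p) { return p; }

inline int *take_block(int *p) {
  int *found = find_block(p);
  assert(found == p);  // expands to nothing when NDEBUG is defined
  (void)found;         // marks `found` as used so release builds do not warn
  return p;
}
```

The cast to void has no runtime effect; since C++17, declaring the variable `[[maybe_unused]]` is an alternative way to express the same intent.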

dali-automaton · 2021-07-19T11:58:29Z

CI MESSAGE: [2609268]: BUILD STARTED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

dali-automaton · 2021-07-19T12:51:27Z

CI MESSAGE: [2609654]: BUILD STARTED

dali-automaton · 2021-07-19T14:06:01Z

CI MESSAGE: [2609268]: BUILD FAILED

dali-automaton · 2021-07-19T14:15:34Z

CI MESSAGE: [2609654]: BUILD PASSED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

dali-automaton · 2021-07-19T14:26:04Z

CI MESSAGE: [2610050]: BUILD STARTED

dali-automaton · 2021-07-19T14:56:07Z

CI MESSAGE: [2610158]: BUILD STARTED

dali-automaton · 2021-07-19T17:36:19Z

CI MESSAGE: [2610158]: BUILD PASSED

jantonguirao assigned jantonguirao and unassigned jantonguirao Jul 16, 2021

Add missing trait to some classes.

04befb3

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

jantonguirao assigned szalpal Jul 19, 2021

jantonguirao reviewed Jul 19, 2021

jantonguirao approved these changes Jul 19, 2021

Fix a minor bug.

2bcdb9b

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Flush all deferred deallocations in the tests.

fc52107

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

mzient unassigned szalpal Jul 20, 2021

jantonguirao assigned banasraf Jul 20, 2021

banasraf approved these changes Jul 20, 2021

mzient merged commit 52e36da into NVIDIA:main Jul 20, 2021
