Make conditionals work in debug mode #4738

klecki · 2023-03-23T16:11:16Z

Category: New feature

Description:

Enable using conditionals and Split/Merge in Debug Mode.
Keep batch size tracking when feasible, skip some checks when the tracking is not possible.
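
To illustrate why the batch size changes inside a conditional scope, here is a minimal pure-Python sketch of the split/merge idea. The `split` and `merge` helpers are hypothetical stand-ins written for this note, not DALI's internal API, and plain lists stand in for batches.

```python
# Hypothetical helpers for illustration only; not DALI's internal API.
def split(batch, predicate):
    """Partition samples into (true_branch, false_branch) by a per-sample predicate."""
    if len(batch) != len(predicate):
        raise ValueError("predicate must have one entry per sample")
    true_branch = [s for s, p in zip(batch, predicate) if p]
    false_branch = [s for s, p in zip(batch, predicate) if not p]
    return true_branch, false_branch


def merge(true_branch, false_branch, predicate):
    """Interleave both branches back into the original sample order."""
    n_true = sum(1 for p in predicate if p)
    if len(true_branch) != n_true or len(false_branch) != len(predicate) - n_true:
        raise ValueError("branch sizes do not match the predicate")
    true_iter, false_iter = iter(true_branch), iter(false_branch)
    return [next(true_iter) if p else next(false_iter) for p in predicate]


batch = [10, 20, 30, 40]
predicate = [True, False, True, False]
true_branch, false_branch = split(batch, predicate)
# Each branch now carries a smaller batch (2 samples here instead of 4).
restored = merge([x + 1 for x in true_branch], false_branch, predicate)
```

Inside a branch the expected batch size is the number of samples routed there, which is the quantity a scope-level tracker would need to know; with bare split/merge calls and no enclosing scope, that information may not be available.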
TODO:

Add more tests - possibly convert all conditionals tests into debug mode.

Additional information:

Affected modules and functionalities:

Debug Mode, Conditionals

Key points relevant for the review:

Tests:

Existing tests apply
New tests added
- Python tests

- [ ] GTests
- [ ] Benchmark
- [ ] Other
- [ ] N/A

Checklist

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: N/A

dali/python/nvidia/dali/_conditionals.py

@@ -312,6 +313,21 @@ def track_merge(self, split_predicate):
        self.no_branch()
        self.top().add_produced(split_predicate)

+    def scope_batch_size_tracker(self):


klecki · 2023-03-24T13:38:38Z

dali/python/nvidia/dali/_debug_mode.py

+            assert (batch is None or isinstance(batch, DataNodeDebug),
+                    "Conditionals in debug mode work only with DataNodeDebug")
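
A note on the assertion quoted above: when the condition and the message are wrapped together in one pair of parentheses, Python tests a two-element tuple, which is always truthy, so the check can never fail. The sketch below shows the difference; `DataNodeDebugStub` is a hypothetical stand-in for `DataNodeDebug`, used only to keep the example self-contained.

```python
class DataNodeDebugStub:
    """Hypothetical stand-in for DataNodeDebug, used only in this sketch."""


def tuple_is_truthy(batch):
    # This is the value that `assert (cond, "msg")` tests: a 2-tuple, truthy for any cond.
    return bool((batch is None or isinstance(batch, DataNodeDebugStub), "msg"))


def check_batch(batch):
    # The message follows the comma outside the parentheses, so the condition is tested.
    assert (batch is None or isinstance(batch, DataNodeDebugStub)), \
        "Conditionals in debug mode work only with DataNodeDebug"
    return batch
```

With this form, `check_batch(42)` raises `AssertionError` (when assertions are enabled), while the tuple form accepts any input.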


I don't see namedtuple imported in this line.

dali/python/nvidia/dali/_conditionals.py

dali/python/nvidia/dali/_debug_mode.py

dali/test/python/test_pipeline_debug.py

+from conditionals.test_pipeline_conditionals import (pred_gens, impl_test_against_split_merge,
+                                                     impl_test_dot_gpu,
+                                                     impl_test_arg_inputs_scoped_tracking,
+                                                     impl_test_arg_inputs_scoped_uninitialized,
+                                                     impl_test_generators, impl_test_uninitialized)


klecki · 2023-03-24T19:16:48Z

!build

dali-automaton · 2023-03-24T19:21:12Z

CI MESSAGE: [7704282]: BUILD STARTED

dali/test/python/test_pipeline_debug.py

+    # for base_debug, conditional_debug in [(True, False), (False, True), (True, True)]:
+    #    impl_test_generators(pred, {'debug': base_debug}, {'debug': conditional_debug})


dali-automaton · 2023-03-24T21:23:40Z

CI MESSAGE: [7704282]: BUILD PASSED

dali/pipeline/operator/eager_operator.h

klecki · 2023-03-27T13:35:33Z

!build

dali-automaton · 2023-03-27T13:40:45Z

CI MESSAGE: [7724827]: BUILD STARTED

klecki · 2023-03-27T14:30:57Z

!build

dali-automaton · 2023-03-27T14:35:21Z

CI MESSAGE: [7725236]: BUILD STARTED

dali-automaton · 2023-03-27T17:08:34Z

CI MESSAGE: [7724827]: BUILD FAILED

dali-automaton · 2023-03-27T19:51:20Z

CI MESSAGE: [7725236]: BUILD PASSED

dali/python/nvidia/dali/_debug_mode.py

mzient · 2023-03-28T08:02:05Z

dali/python/nvidia/dali/_debug_mode.py

+            transferred_node = DataNodeDebug(self._data._as_gpu(), self_split.name, "gpu",
+                                             self_split.source)
+            _conditionals.register_data_nodes(transferred_node, [self])


Nitpick: is it really always transferred? If the node is already on GPU, it's a no-op, so perhaps calling it a gpu_node would be more accurate (see the special case below - it seems like it isn't handled here).

I see a typo as well: I should copy the split node. Adjusted, and added a shortcut.
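
A minimal sketch of the kind of shortcut discussed here, assuming a simplified node type: `NodeStub` and its fields are hypothetical and only mimic the shape of a debug-mode data node; the list copy stands in for an actual device transfer.

```python
from dataclasses import dataclass


@dataclass
class NodeStub:
    """Hypothetical stand-in for a debug-mode data node."""
    data: list
    name: str
    device: str

    def gpu(self):
        # Shortcut: a node that already lives on the GPU is returned unchanged.
        if self.device == "gpu":
            return self
        # Otherwise build a new node; the list copy stands in for the device transfer.
        return NodeStub(list(self.data), self.name, "gpu")


cpu_node = NodeStub([1, 2, 3], "sample", "cpu")
gpu_node = cpu_node.gpu()
```

Naming the result `gpu_node` rather than `transferred_node` stays accurate in both cases, since no transfer happens when the input is already on the GPU.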

klecki · 2023-03-29T09:09:52Z

!build

dali-automaton · 2023-03-29T09:18:36Z

CI MESSAGE: [7748825]: BUILD STARTED

dali-automaton · 2023-03-29T13:18:27Z

CI MESSAGE: [7748825]: BUILD PASSED

mzient · 2023-03-30T10:36:44Z

dali/python/nvidia/dali/types.py

+    op = fn.constant(device=device, fdata=fdata, idata=idata, shape=shape, dtype=dtype,
+                     layout=layout, **kwargs)
+    return op


👍
You can take it one step further:

Suggested change

-    op = fn.constant(device=device, fdata=fdata, idata=idata, shape=shape, dtype=dtype,
-                     layout=layout, **kwargs)
-    return op
+    return fn.constant(device=device, fdata=fdata, idata=idata, shape=shape, dtype=dtype,
+                       layout=layout, **kwargs)

mzient · 2023-03-30T10:37:03Z

dali/python/nvidia/dali/types.py

@@ -505,7 +505,7 @@ def _type_from_value_or_list(v):
        if dtype is None:
            dtype = actual_type

-    import nvidia.dali.ops as ops
+    import nvidia.dali.fn as fn


Needs a lot of cleanup out of debug info. Keep things wrapped in DataNodes. Consider the tracking of batch sizes? We can maybe keep track of the batch size in scopes, but with split/merge only it is not possible.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

This is a bit too much workarounds for my taste

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

The test is not feasible with the asserts, as the debug mode generates different naming.
The generator test is mismatching batch sizes for some splits, so there is still something wrong.
Generating nodes outside the pipeline won't work with the current implementation.
TODO: check if there can be a naming collision if we use the same op several times.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

banasraf · 2023-04-03T15:49:12Z

dali/test/python/conditionals/test_pipeline_conditionals.py

    test_data_root = get_dali_extra_path()
    caffe_db_folder = os.path.join(test_data_root, 'db', 'lmdb')
    bs = 10
    kwargs = {"batch_size": bs, "num_threads": 4, "device_id": 0}

-    @experimental.pipeline_def(enable_conditionals=True, **kwargs)
+    @experimental.pipeline_def(enable_conditionals=True, **kwargs, **additional_additional_kwargs)


why additional_additional_kwargs?

Search & replace mistake, fixed.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

klecki · 2023-04-03T15:54:17Z

!build

dali-automaton · 2023-04-03T16:00:30Z

CI MESSAGE: [7802285]: BUILD STARTED

dali-automaton · 2023-04-03T18:01:40Z

CI MESSAGE: [7802285]: BUILD PASSED

mzient · 2023-04-06T12:38:19Z

dali/test/python/conditionals/test_pipeline_conditionals.py

+    assert (_conditionals._data_node_repr(some_nested_op) == _conditionals._data_node_repr(
+        preprocessed))


mzient · 2023-04-06T13:02:17Z

dali/test/python/test_pipeline_debug.py

+        return pred, output
+
+    with assert_raises(
+            ValueError, glob=("Debug mode for conditionals doesn't allow for modification of"


Suggested change

-            ValueError, glob=("Debug mode for conditionals doesn't allow for modification of"
+            ValueError, glob=("Debug mode for conditional execution doesn't allow for modification of"

"conditionals" is our internal jargon.
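
Because the test matches the error text by pattern, the wording in the test and in the raised message has to change together. The sketch below illustrates this with the standard library's `fnmatch`; it assumes, without confirmation from this thread, that the `glob=` argument of `assert_raises` behaves like shell-style matching. The message tail `<...>` is a placeholder, since only the beginning of the message is quoted here.

```python
import fnmatch

OLD_GLOB = "Debug mode for conditionals doesn't allow for modification of*"
NEW_GLOB = "Debug mode for conditional execution doesn't allow for modification of*"

# Placeholder tail: the thread quotes only the beginning of the message.
raised_message = "Debug mode for conditional execution doesn't allow for modification of <...>"


def matches(message, pattern):
    """Case-sensitive shell-style match of an error message against a pattern."""
    return fnmatch.fnmatchcase(message, pattern)


matches_new = matches(raised_message, NEW_GLOB)
matches_old = matches(raised_message, OLD_GLOB)
```

Under that assumption, a test still using the old pattern would stop matching once the message is reworded.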

mzient · 2023-04-06T13:10:53Z

dali/python/nvidia/dali/_debug_mode.py

+
+        if _conditionals.conditionals_enabled():
+            if input_set_len != -1:
+                raise ValueError("Multiple input sets are not supported with conditionals.")


Suggested change

-                raise ValueError("Multiple input sets are not supported with conditionals.")
+                raise ValueError("Multiple input sets are not supported with conditional execution.")

mzient · 2023-04-06T13:11:19Z

dali/python/nvidia/dali/_debug_mode.py

+            # TODO(klecki): Add better handling of constant nodes for conditionals in debug mode.
+            for i, classification in enumerate(self._inputs_classification):
+                if classification.is_batch and not classification.was_data_node:
+                    raise ValueError(f"Debug mode for conditionals doesn't allow for modification"


Suggested change

-                    raise ValueError(f"Debug mode for conditionals doesn't allow for modification"
+                    raise ValueError(f"Debug mode for conditional execution doesn't allow for modification"

mzient · 2023-04-06T13:11:32Z

dali/python/nvidia/dali/_debug_mode.py

+
+            for key, classification in self._kwargs_classification.items():
+                if classification.is_batch and not classification.was_data_node:
+                    raise ValueError(f"Debug mode for conditionals doesn't allow for modification"


Suggested change

-                    raise ValueError(f"Debug mode for conditionals doesn't allow for modification"
+                    raise ValueError(f"Debug mode for conditional execution doesn't allow for modification"

mzient

Please change the messages not to use our jargon - otherwise good.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

klecki · 2023-04-06T14:32:03Z

!build

dali-automaton · 2023-04-06T14:35:31Z

CI MESSAGE: [7838899]: BUILD STARTED

dali-automaton · 2023-04-06T18:44:48Z

CI MESSAGE: [7838899]: BUILD PASSED

github-advanced-security bot found potential problems Mar 23, 2023

dali/python/nvidia/dali/_conditionals.py Fixed

dali/python/nvidia/dali/_debug_mode.py Fixed

jantonguirao assigned mzient and banasraf Mar 24, 2023

github-advanced-security bot found potential problems Mar 24, 2023

dali/test/python/test_pipeline_debug.py Fixed

github-advanced-security bot found potential problems Mar 24, 2023

dali/test/python/test_pipeline_debug.py Fixed

github-advanced-security bot found potential problems Mar 24, 2023

mzient reviewed Mar 27, 2023

dali/pipeline/operator/eager_operator.h

mzient reviewed Mar 28, 2023

dali/python/nvidia/dali/_debug_mode.py Outdated

mzient reviewed Mar 28, 2023

mzient reviewed Mar 30, 2023

klecki added the "conditional execution" label (Questions related to conditional execution and using `if` statements in DALI) Mar 30, 2023

klecki added 4 commits April 3, 2023 17:13

Cleanup & reverts

50e70da

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

More cleanup

3033e8d

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Close to working

fce7db9

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

klecki added 10 commits April 3, 2023 17:15

Generalize some tests, there are still issues with tracing some nodes

fa399c9

This is a bit too much workarounds for my taste

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Repeated op naming test

69968c9

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

some stuff works

df65b54

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

LINT LINT LINT

2fd02fd

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Improve detection

9942cbd

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

It works, the only missing piece is constant promotions

1b7a9a0

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Lint, todos

3c4d2fd

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Linter

e7c3f3b

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Review fixes

004592f

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

klecki force-pushed the conditionals-in-debug-mode branch from 51fa7b3 to 004592f Compare April 3, 2023 15:17

Simplify return

7254238

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

banasraf reviewed Apr 3, 2023

banasraf approved these changes Apr 3, 2023

Fix typo in test

435af6b

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

mzient reviewed Apr 6, 2023

mzient approved these changes Apr 6, 2023

Adjust naming

d3d3eda

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

klecki merged commit 751b75b into NVIDIA:main Apr 7, 2023

JanuszL mentioned this pull request Sep 6, 2023

Roadmap 2023 #4578

Closed
