Add origin stack trace capture for DALI operators #5302

Merged
merged 34 commits into from
Mar 4, 2024


klecki
Contributor

@klecki klecki commented Feb 1, 2024

Category: New feature

Description:

Add capture of origin stack trace for DALI operators.
extract_stack collects all frames from the first frame of the pipeline definition to the frame with the operator invocation (the outermost user API).
In regular mode no further processing is needed.
For code that was transformed by AutoGraph, the contents of the captured frames are remapped back to the user code, and frames originating from the _autograph and _conditionals modules (which contain internal DALI implementation) are filtered out.
Because AutoGraph introduces additional frames (for example, by implementing if statements with additional function calls), we remove repeated occurrences of the same function, keeping only the last one. AutoGraph's entry point to a function call is used to detect such regions.
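The three steps described above (remap, filter, de-duplicate) can be illustrated with a minimal sketch. It is not the code from this PR: the Frame record, the INTERNAL_MARKERS path fragments, and the dictionary-based source map are assumptions made for illustration, and the de-duplication is applied to the whole stack rather than only to regions delimited by AutoGraph's call entry point.

```python
from collections import namedtuple

# Hypothetical frame record used only in this sketch.
Frame = namedtuple("Frame", ["filename", "lineno", "name"])

# Assumption: path fragments that identify internal frames.
INTERNAL_MARKERS = ("/_autograph/", "/_conditionals")


def remap(frames, source_map):
    """Swap generated-code locations for user-code locations where a mapping exists."""
    return [source_map.get((f.filename, f.lineno), f) for f in frames]


def drop_internal(frames):
    """Remove frames whose file path matches one of the internal markers."""
    return [f for f in frames if not any(m in f.filename for m in INTERNAL_MARKERS)]


def keep_last_occurrence(frames):
    """Keep only the last frame of each (file, function) pair, preserving order.

    Simplification: this applies to the whole stack, so it would also
    collapse genuine recursion.
    """
    last_index = {}
    for i, f in enumerate(frames):
        last_index[(f.filename, f.name)] = i
    return [f for i, f in enumerate(frames) if last_index[(f.filename, f.name)] == i]


def clean_stack(frames, source_map):
    """Apply the three steps in order: remap, filter, de-duplicate."""
    return keep_last_occurrence(drop_internal(remap(frames, source_map)))
```

In this sketch the remapping runs first so that the de-duplication compares user-code locations rather than generated ones.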

Stack traces are collected in OperatorInstance. The outermost layer of the API adds an argument denoting the current stack level, so that we can compute how many frames belonging to DALI internals must be skipped when capturing the stack.
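The idea of passing a stack level from the outermost API layer can be sketched as follows. All function names and the internal_depth parameter are hypothetical; only traceback.extract_stack is a real standard-library call, and the actual DALI implementation may count frames differently.

```python
import traceback


def capture_origin_stack(internal_depth):
    """Return the current stack without the innermost library frames.

    internal_depth is the number of library frames between the user code
    and this function (hypothetical parameter for this sketch).
    """
    stack = traceback.extract_stack()
    # extract_stack() lists the oldest frame first; its last entry is the
    # frame of this function, so one extra frame is dropped.
    skip = internal_depth + 1
    return stack[:-skip]


def _operator_instance(depth):
    # Innermost library layer: forwards the depth it received.
    return capture_origin_stack(internal_depth=depth)


def public_operator_api():
    # Outermost API layer: two library frames (this one and
    # _operator_instance) sit between the user code and the capture.
    return _operator_instance(depth=2)


def user_pipeline():
    # Stands in for user code that invokes an operator.
    return public_operator_api()
```

With this arrangement the last frame of the returned list is the user function that invoked the operator, regardless of where the capture itself happens inside the library.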

The collected stack is added to OpSpec as hidden arguments. This keeps serialized pipelines backward-compatible and allows the feature to be enabled or disabled.
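One way to picture storing a stack as plain arguments is to flatten it into parallel lists, one per field. The sketch below uses a dict as a stand-in for OpSpec, and the underscore-prefixed argument names are illustrative assumptions, not names confirmed by this description.

```python
import traceback

# Assumed, illustrative argument names; a dict stands in for OpSpec.
_FILENAME = "_origin_stack_filename"
_LINENO = "_origin_stack_lineno"
_NAME = "_origin_stack_name"
_LINE = "_origin_stack_line"


def stack_to_hidden_args(stack):
    """Flatten a list of traceback.FrameSummary into parallel lists."""
    return {
        _FILENAME: [f.filename for f in stack],
        _LINENO: [f.lineno for f in stack],
        _NAME: [f.name for f in stack],
        _LINE: [f.line or "" for f in stack],
    }


def hidden_args_to_stack(spec_args):
    """Rebuild the frames, or return None when no stack was recorded."""
    if _FILENAME not in spec_args:
        # Feature disabled, or a pipeline serialized without the arguments.
        return None
    return [
        traceback.FrameSummary(filename, lineno, name, lookup_line=False, line=line)
        for filename, lineno, name, line in zip(
            spec_args[_FILENAME], spec_args[_LINENO], spec_args[_NAME], spec_args[_LINE]
        )
    ]
```

Because the arguments are simply absent when the feature is off, a reader of the spec can treat a missing stack as "no origin information" rather than as an error.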

By default, the feature is disabled and available only through a hidden API.

C++ test operators are extended to work as a loadable plugin, based on the dummy operator test.
This makes it possible to implement tests for the new feature without extending regular DALI operators.

A follow-up to this PR will use this information to provide better error messages that point to the origin of the error in the DALI pipeline definition.

Additional information:

Affected modules and functionalities:

New functionality, extensions for Python API and new hidden arguments in all operators.

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

dali/python/nvidia/dali/_autograph/utils/tf_stack.py
raise
# We no longer need CurrentModuleFilter here, as we filter whole autograph
import nvidia.dali._conditionals as dc
import nvidia.dali._autograph as ag

Check notice

Code scanning / CodeQL

Module is imported with 'import' and 'import from' Note

Module 'nvidia.dali._autograph' is imported with both 'import' and 'import from'.
Module '_autograph' is imported with both 'import' and 'import from'.

def pipe():
    if 0:
        x = origin_trace()

Check warning

Code scanning / CodeQL

Unreachable code Warning test

This statement is unreachable.
Some parts are rewritten in DALI in the following commits

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
The solution to stack-trace manipulation in TF was optimized by moving
parts of the processing to C++. As we have limited needs for producing
traces, it is sufficient to replace them with regular Python
constructs. This step re-enables the construction of StackTraceMapper
and Filter in the AG-converted code.

It will enable next steps:
* we will be capturing origin stack trace for every operator definition
* in conditional mode we will remap it back to the user code from the
  AG-produced one.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Contributor Author

klecki commented Feb 28, 2024

!build

@dali-automaton
Collaborator

CI MESSAGE: [13153535]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [13153535]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [13153535]: BUILD PASSED

klecki added a commit to klecki/DALI that referenced this pull request Feb 29, 2024
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
klecki added a commit to klecki/DALI that referenced this pull request Feb 29, 2024
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Contributor Author

klecki commented Mar 4, 2024

!build

@dali-automaton
Collaborator

CI MESSAGE: [13262456]: BUILD STARTED

klecki added a commit to klecki/DALI that referenced this pull request Mar 4, 2024
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [13262456]: BUILD PASSED

@klecki klecki merged commit 83e61e5 into NVIDIA:main Mar 4, 2024
6 of 7 checks passed
@klecki klecki deleted the origin-stack-trace branch March 4, 2024 20:47
This pull request was closed.
6 participants