Fix data race when copying outputs in TF plugin #3547

Merged
klecki merged 1 commit into NVIDIA:main on Dec 2, 2021

Conversation

klecki
Contributor

@klecki klecki commented Nov 30, 2021

Description

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactoring (Redesign of existing code that doesn't affect functionality)
  • Other (e.g. Documentation, Tests, Configuration)

What happened in this PR

The DALI TF plugin copied the outputs on
TF's stream without synchronizing with
the copy.
In some cases the next DALI iteration could
overtake the copy and overwrite the buffers
before they were actually copied.

Add a flag to synchronize with the stream after
the last copy is scheduled, so that we synchronize
only once.
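
The idea can be illustrated with a toy Python model. This is only a sketch, not the plugin's actual C++/CUDA code: `FakeStream`, `copy_outputs`, `sync_after_last` and `run_iteration` are hypothetical names, and a single worker thread stands in for TF's stream (work runs asynchronously, in FIFO order).

```python
from concurrent.futures import ThreadPoolExecutor


class FakeStream:
    """Toy stand-in for a GPU stream: runs enqueued work in FIFO order on one worker."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._last = None

    def enqueue(self, fn, *args):
        # Returns immediately; the work runs later on the worker thread.
        self._last = self._pool.submit(fn, *args)

    def synchronize(self):
        # FIFO ordering: once the last task is done, all earlier ones are done too.
        if self._last is not None:
            self._last.result()

    def close(self):
        self._pool.shutdown(wait=True)


def _copy(src, dst):
    dst[:] = src


def copy_outputs(stream, sources, destinations, sync_after_last=True):
    """Schedule one copy per output; optionally wait once, after the last copy."""
    last = len(sources) - 1
    for i, (src, dst) in enumerate(zip(sources, destinations)):
        stream.enqueue(_copy, src, dst)
        if sync_after_last and i == last:
            stream.synchronize()


def run_iteration():
    stream = FakeStream()
    pipeline_buffers = [[1, 2, 3], [4, 5, 6]]    # buffers owned by the producer
    framework_outputs = [[0, 0, 0], [0, 0, 0]]   # destinations owned by the consumer
    copy_outputs(stream, pipeline_buffers, framework_outputs, sync_after_last=True)
    # The "next iteration" reuses the producer buffers. This is safe only
    # because copy_outputs waited for the scheduled copies to finish.
    for buf in pipeline_buffers:
        buf[:] = [-1] * len(buf)
    stream.close()
    return framework_outputs
```

With `sync_after_last=False`, the overwrite in `run_iteration` could run before the worker performs the copies, which corresponds to the race described above. Waiting only after the last scheduled copy keeps the cost to one synchronization per iteration.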

Additional information

  • Affected modules and functionalities:
    TF plugin

  • Key points relevant for the review:
    Check if sync is enough

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-2483

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Contributor Author

klecki commented Nov 30, 2021

!build

@JanuszL JanuszL self-assigned this Nov 30, 2021
@dali-automaton
Collaborator

CI MESSAGE: [3500935]: BUILD STARTED

@klecki
Contributor Author

klecki commented Nov 30, 2021

!build

@dali-automaton
Collaborator

CI MESSAGE: [3501189]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [3501189]: BUILD FAILED

@klecki
Contributor Author

klecki commented Dec 1, 2021

!build

@dali-automaton
Collaborator

CI MESSAGE: [3506033]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [3506033]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [3506033]: BUILD PASSED

@klecki klecki merged commit ef1aff2 into NVIDIA:main Dec 2, 2021
@klecki klecki added the important-fix Fixes an important issue in the software or development environment. label Dec 2, 2021
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jan 23, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request Feb 21, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request May 13, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jun 7, 2022
Labels
important-fix Fixes an important issue in the software or development environment.
4 participants