Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type hints for ExternalSource, Python Functions, TFRecord Reader, math module #5118

Merged
merged 7 commits into from
Nov 3, 2023

Conversation

klecki
Copy link
Contributor

@klecki klecki commented Oct 24, 2023

Category: New feature

Description:

Add annotations to operators with custom Python wrappers:

  • External source
  • TFRecord reader
  • Python function family
  • nvidia.dali.math module
  • ops.Compose

Add stub file for nvidia.dali.tfrecord module.

Add return type annotations to the Pipeline.run function.
The TensorList types again are not fully visible due to being generated
by backend at runtime, next step should provide a dedicated stub file
or alternative implementation.

The stubs are based on the output of mypy stubgen and the
nvidia.dali.ops._signatures._gen_[fn/ops]_signature.

The stub generation is reworked, first grouping the operators into
4 categories, so the generated stubs contain imports first.
We utilize the fact that the operators with custom Python wrappers
have dedicated implementation modules now, and reexpose them
in the stubs, allowing the type hints to be picked up.

The external source has non-trivial defaults, expressed mostly via None
and cross-dependent on other parameters, so creating a meaningful
annotations there is hard.
There are two overloads provided for the external_source function,
allowing to disambiguate between the single and multiple outputs
(when num_outputs parameter was used, and the return type is
always a tuple). Such distinction can't be easily made in the ops API.

TODO:

  • Numba and Pytorch function are left for a followup as they require
    a lot of code movement to accommodate for the module structure.

Additional information:

Affected modules and functionalities:

Several operators, Pipeline.run, stub generation.

Key points relevant for the review:

If the annotations are ok.

Tests:

  • Existing tests apply
    Whole DALI must work
  • New tests added
    • Python tests
      Trivial mypy test with external source.
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Rework the order of stub generation

Split the ops into categories, first do the reimports of python
wrappers, than do the generated ones.

TODO(klecki): Revert the broken `__module__`? I need the access to the
source of the implementation

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

WIP generation

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Write the stub files by hand

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Compose - we don;t have anything to constraint as Operetor parent class

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Add type hints for TFRecord related types

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>

Add test with external source

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki klecki marked this pull request as ready for review October 26, 2023 14:21
@stiepan stiepan self-assigned this Oct 26, 2023
@klecki
Copy link
Contributor Author

klecki commented Oct 26, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10453001]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10453001]: BUILD PASSED

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Copy link
Contributor Author

klecki commented Nov 3, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10577941]: BUILD STARTED

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Copy link
Contributor Author

klecki commented Nov 3, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10578554]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10577941]: BUILD PASSED

@@ -83,8 +82,8 @@ class DLTensorPythonFunction:
/,
*,
function: Optional[Callable[..., Union[Any, Sequence[Any]]]] = None,
batch_processing: Optional[bool] = False,
num_outputs: Optional[int] = 1,
batch_processing: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be True too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably it should

@@ -100,14 +99,13 @@ class DLTensorPythonFunction:
/,
*input: DataNode,
function: Optional[Callable[..., Union[Any, Sequence[Any]]]] = None,
batch_processing: Optional[bool] = False,
num_outputs: Optional[int] = 1,
batch_processing: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be True too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed those arguments, as I don't think they will work correctly.
We need to consider revisiting the custom implementations for ops, as they do not follow the "pass any argument anywhere" flow.

Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
@klecki
Copy link
Contributor Author

klecki commented Nov 3, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10579585]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [10579585]: BUILD PASSED

@klecki klecki merged commit 1327f5b into NVIDIA:main Nov 3, 2023
5 checks passed
@klecki klecki added the signatures Python API signatures and type hints label Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
signatures Python API signatures and type hints
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants