Add Flax training example #4978

awolant · 2023-08-04T09:42:21Z

Category:

New feature

Description:

New tutorial showing how to train a neural network implemented in Flax with DALI. It builds on the JAX training tutorial - significant parts are the same.

Additional information:

Affected modules and functionalities:

JAX documentation.

Key points relevant for the review:

Is it understandable? Spelling, grammar?

Tests:

Checklist

Documentation

DALI team only

Requirements

Implements new requirements
Affects existing requirements
N/A

REQ IDs: N/A

JIRA TASK: DALI-3564

Signed-off-by: Albert Wolant <awolant@nvidia.com>

review-notebook-app · 2023-08-04T09:42:25Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Signed-off-by: Albert Wolant <awolant@nvidia.com>

awolant · 2023-08-04T09:50:57Z

!build

dali-automaton · 2023-08-04T09:55:15Z

CI MESSAGE: [9239937]: BUILD STARTED

dali-automaton · 2023-08-04T22:20:28Z

CI MESSAGE: [9239937]: BUILD FAILED

banasraf · 2023-08-07T07:46:42Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+    "\n",
+    "This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n",
+    "\n",
+    "DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimention to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX you can skip this part and move to the training section of this notebook.\n",


Suggested change

"DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimention to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX you can skip this part and move to the training section of this notebook.\n",

"DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimension to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX, you can skip this part and move to the training section of this notebook.\n",
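For context on the trailing dimension mentioned in this cell: Flax convolutions treat the last axis as the channel axis, so a batch of grayscale images shaped (batch, height, width) needs one extra axis of size 1. A minimal sketch, using NumPy as a stand-in for `jax.numpy` (which provides the same `expand_dims` call). The batch size and the 28x28 image size are assumptions for illustration, not values taken from the notebook:

```python
import numpy as np

# Hypothetical batch of 8 grayscale images, 28x28 pixels each (assumed sizes).
images = np.zeros((8, 28, 28), dtype=np.float32)

# Append a trailing channel axis: (batch, height, width) -> (batch, height, width, 1).
images_nhwc = np.expand_dims(images, axis=-1)

print(images_nhwc.shape)
```

In the notebook itself this reshaping presumably happens inside the DALI pipeline rather than in user code; the sketch only shows the resulting layout.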

banasraf · 2023-08-07T07:48:11Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+    "**Here is a quick explnation of how these parameters work:**\n",
+    "\n",
+    " - `output_map`: iterators return a dictionary with outputs of the pipeline as its values. Keys in this dictionary are defined by `output_map`. For example, `labels` output returned from the DALI pipeline defined above will be accessible as `iterator_output['labels']`,\n",
+    " - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n",


Suggested change

" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n",

" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator, we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n",

banasraf · 2023-08-07T07:48:40Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+    "\n",
+    "Now we need to setup model and training utilities. The goal of this notebook is not to explain Flax concepts. We want to show how to train models implemented in Flax with DALI as a data loading and preprocessing library. We used standard Flax tools do define simple neural network. We have functions to create an instance of this network, run one training step on it and calculate accuracy of the model at the end of each epoch.\n",
+    "\n",
+    "If you want to learn more about Flax and get better understanding of the code below look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)."


Suggested change

"If you want to learn more about Flax and get better understanding of the code below look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)."

"If you want to learn more about Flax and get better understanding of the code below, look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)."
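The cell quoted above mentions calculating the accuracy of the model at the end of each epoch. One way such a helper could look, sketched with NumPy and hypothetical names (`predict`, `batches`, and the `"images"` key are assumptions); the notebook's actual implementation may differ:

```python
import numpy as np


def epoch_accuracy(predict, batches):
    """Return the fraction of correctly classified samples over all batches.

    `predict` maps a batch of images to per-class scores. `batches` yields
    dictionaries with "images" and "labels" keys, the way a dictionary-returning
    iterator would. Both are hypothetical placeholders.
    """
    correct = 0
    total = 0
    for batch in batches:
        scores = predict(batch["images"])
        predicted = np.argmax(scores, axis=-1)
        labels = np.asarray(batch["labels"]).reshape(-1)
        correct += int(np.sum(predicted == labels))
        total += labels.shape[0]
    return correct / total if total else 0.0


# Toy check: a "model" whose score for class k is simply the k-th input feature.
batches = [
    {"images": np.array([[0.9, 0.1], [0.2, 0.8]]), "labels": np.array([0, 1])},
    {"images": np.array([[0.3, 0.7], [0.6, 0.4]]), "labels": np.array([0, 0])},
]
accuracy = epoch_accuracy(lambda x: x, batches)
print(accuracy)
```

Counting correct samples across batches, rather than averaging per-batch accuracies, keeps the result right even when the last batch is smaller than the others.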

banasraf · 2023-08-07T07:50:17Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "At this point everything is ready to run the training."


Suggested change

"At this point everything is ready to run the training."

"At this point, everything is ready to run the training."

banasraf · 2023-08-07T07:51:54Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "With the setup above DALI iterators are ready for the training. \n",


Suggested change

"With the setup above DALI iterators are ready for the training. \n",

"With the setup above, DALI iterators are ready for the training. \n",

banasraf · 2023-08-07T07:52:27Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+   "source": [
+    "# Training neural network with DALI and Flax\n",
+    "\n",
+    "This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n",


Suggested change

"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n",

"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax, look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n",

banasraf · 2023-08-07T07:53:18Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+    "\n",
+    " - `output_map`: iterators return a dictionary with outputs of the pipeline as its values. Keys in this dictionary are defined by `output_map`. For example, `labels` output returned from the DALI pipeline defined above will be accessible as `iterator_output['labels']`,\n",
+    " - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n",
+    "  - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True` will automatically reset the state of the iterator and prepare it to start the next epoch."


Suggested change

" - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True` will automatically reset the state of the iterator and prepare it to start the next epoch."

" - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True`, it will automatically reset the state of the iterator and prepare it to start the next epoch."
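The behaviour of the three parameters described in this cell can be illustrated with a toy iterator. This is a pure-Python sketch of the semantics only, not the DALI iterator: the class `ToyEpochIterator`, its `epoch_size` argument (standing in for the size DALI queries from the named reader), and the sample data are all hypothetical.

```python
import itertools


class ToyEpochIterator:
    """Hypothetical stand-in that mimics the semantics described above."""

    def __init__(self, next_outputs, output_map, epoch_size, auto_reset=False):
        self._next_outputs = next_outputs  # callable returning a tuple of outputs, forever
        self._output_map = output_map      # names used as keys of the returned dictionary
        self._epoch_size = epoch_size      # stand-in for the size reported by the reader
        self._auto_reset = auto_reset
        self._served = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._served >= self._epoch_size:
            if self._auto_reset:
                self._served = 0  # get ready for the next epoch automatically
            raise StopIteration
        self._served += 1
        return dict(zip(self._output_map, self._next_outputs()))

    def reset(self):
        self._served = 0


dataset = [("img0", 0), ("img1", 1), ("img2", 2)]
infinite = itertools.cycle(dataset)  # the "pipeline" wraps around the dataset forever

iterator = ToyEpochIterator(
    lambda: next(infinite),
    output_map=["images", "labels"],
    epoch_size=len(dataset),
    auto_reset=True,
)

epoch_labels = []
for epoch in range(2):
    # Each output is a dictionary keyed by the names given in output_map.
    epoch_labels.append([output["labels"] for output in iterator])

print(epoch_labels)
```

Without `auto_reset=True`, the second loop in this sketch would yield nothing until `reset()` is called explicitly.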

banasraf · 2023-08-07T07:54:07Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "With utilities defined above we can create an instance of the model we want to train."


Suggested change

"With utilities defined above we can create an instance of the model we want to train."

"With utilities defined above, we can create an instance of the model we want to train."

Signed-off-by: Albert Wolant <awolant@nvidia.com>

awolant · 2023-08-07T08:23:42Z

Some of the comments were also applicable to the basic JAX tutorial since it shares some content with this one. I applied them there as well.

dali-automaton · 2023-08-07T08:35:23Z

CI MESSAGE: [9239937]: BUILD PASSED

szalpal · 2023-08-07T10:32:55Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

@@ -0,0 +1,368 @@
+{


When I clicked the "training example with pure JAX" link, it redirected me to the raw output of the Jupyter notebook. Is this intentional? Or maybe the reviewing tool fails here?

Reply via ReviewNB

Yes, the review tool can't handle these links. I checked them in the docs build.

szalpal · 2023-08-07T10:32:55Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

@@ -0,0 +1,368 @@
+{


In this section, the "Getting started" link redirected me to the raw Jupyter notebook and the "pipeline documentation" link to the raw rst file. Could you double-check these?

Reply via ReviewNB

Yes, the review tool can't handle these links. I checked them in the docs build.

szalpal · 2023-08-07T10:32:55Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

@@ -0,0 +1,368 @@
+{


Maybe you could also explain above why seed=0 is set?

Reply via ReviewNB

This is described in the tutorials that are linked just above, so I am going to leave it as is to keep the notebook as lean as possible and focused only on JAX+DALI. Hope that's ok?

szalpal · 2023-08-07T10:32:55Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

@@ -0,0 +1,368 @@
+{


Suggestion:
"Next step is to instantiate pipelines and build them." -> "Next step is to instantiate DALI pipelines and build them."

Reply via ReviewNB

szalpal · 2023-08-07T10:32:55Z

docs/examples/frameworks/jax/flax-basic_example.ipynb

@@ -0,0 +1,368 @@
+{


Suggestions:
"For DALI pipeline to work with JAX it needs to be wrapped with appropriate DALI iterator." -> "For DALI pipeline to work with JAX, the former needs to be wrapped with an appropriate DALI iterator."

"In addition to the pipeline we can pass the" -> "In addition to the DALI pipeline object we can pass the"

"Here is a quick explnation of how these parameters work:" - maybe it would be good to put a link to the full documentation somewhere here?

Reply via ReviewNB

szalpal

Please double-check the links, because they did not work correctly in ReviewNB (but it might be a problem with the review tool).

Signed-off-by: Albert Wolant <awolant@nvidia.com>

awolant · 2023-08-07T13:45:21Z

!build

dali-automaton · 2023-08-07T13:50:14Z

CI MESSAGE: [9267807]: BUILD STARTED

dali-automaton · 2023-08-07T20:44:26Z

CI MESSAGE: [9267807]: BUILD FAILED

dali-automaton · 2023-08-08T09:21:21Z

CI MESSAGE: [9267807]: BUILD PASSED

New tutorial showing how to train a neural network implemented in Flax with DALI. It builds on the JAX training tutorial - significant parts are the same.

Signed-off-by: Albert Wolant <awolant@nvidia.com>

awolant added 5 commits August 2, 2023 11:56

Add basic jax.Sharding support

ec6b826

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Add docs

0b94804

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Refactoring

db5128f

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Merge remote-tracking branch 'nvidia/main' into add_flax_example

6b669fc

Add Flax training example

a9e2390

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Remove leftover

64f1df3

Signed-off-by: Albert Wolant <awolant@nvidia.com>

jantonguirao assigned banasraf and szalpal Aug 4, 2023

banasraf approved these changes Aug 7, 2023

Fix review comments

9620b26

Signed-off-by: Albert Wolant <awolant@nvidia.com>

szalpal reviewed Aug 7, 2023

szalpal approved these changes Aug 7, 2023

awolant added 3 commits August 7, 2023 15:34

Fix review comments

8937c10

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Fix review comments

db4ef6f

Signed-off-by: Albert Wolant <awolant@nvidia.com>

Fix review comments

1aecabf

Signed-off-by: Albert Wolant <awolant@nvidia.com>

awolant merged commit 7539566 into NVIDIA:main Aug 8, 2023
5 checks passed
