-
Notifications
You must be signed in to change notification settings - Fork 615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Flax training example #4978
Conversation
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build |
CI MESSAGE: [9239937]: BUILD STARTED |
CI MESSAGE: [9239937]: BUILD FAILED |
"\n", | ||
"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n", | ||
"\n", | ||
"DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimention to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX you can skip this part and move to the training section of this notebook.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimention to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX you can skip this part and move to the training section of this notebook.\n", | |
"DALI setup is very similar to the [training example with pure JAX](jax-basic_example.ipynb). The only difference is the addition of a trailing dimension to the returned image to make it compatible with Flax convolutions. If you are familiar with how to use DALI with JAX, you can skip this part and move to the training section of this notebook.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"**Here is a quick explnation of how these parameters work:**\n", | ||
"\n", | ||
" - `output_map`: iterators return a dictionary with outputs of the pipeline as its values. Keys in this dictionary are defined by `output_map`. For example, `labels` output returned from the DALI pipeline defined above will be accessible as `iterator_output['labels']`,\n", | ||
" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n", | |
" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator, we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"\n", | ||
"Now we need to setup model and training utilities. The goal of this notebook is not to explain Flax concepts. We want to show how to train models implemented in Flax with DALI as a data loading and preprocessing library. We used standard Flax tools do define simple neural network. We have functions to create an instance of this network, run one training step on it and calculate accuracy of the model at the end of each epoch.\n", | ||
"\n", | ||
"If you want to learn more about Flax and get better understanding of the code below look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"If you want to learn more about Flax and get better understanding of the code below look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)." | |
"If you want to learn more about Flax and get better understanding of the code below, look into [Flax Documentation](https://flax.readthedocs.io/en/latest/)." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"At this point everything is ready to run the training." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"At this point everything is ready to run the training." | |
"At this point, everything is ready to run the training." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"With the setup above DALI iterators are ready for the training. \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"With the setup above DALI iterators are ready for the training. \n", | |
"With the setup above, DALI iterators are ready for the training. \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"source": [ | ||
"# Training neural network with DALI and Flax\n", | ||
"\n", | ||
"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n", | |
"This simple example shows how to train a neural network implemented in Flax with DALI pipelines. If you want to learn more about training neural networks with Flax, look into [Flax Getting Started](https://flax.readthedocs.io/en/latest/getting_started.html) example.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"\n", | ||
" - `output_map`: iterators return a dictionary with outputs of the pipeline as its values. Keys in this dictionary are defined by `output_map`. For example, `labels` output returned from the DALI pipeline defined above will be accessible as `iterator_output['labels']`,\n", | ||
" - `reader_name`: setting this parameter introduces the notion of an epoch to our iterator. DALI pipeline itself is infinite, it will return the data indefinately, wrapping around the dataset. DALI readers (such as `fn.readers.caffe2` used in this example) have access to the information about the size of the dataset. If we want to pass this information to the iterator we need to point to the operator that should be queried for the dataset size. We do it by naming the operator (note `name=\"mnist_caffe2_reader\"`) and passing the same name as the value for `reader_name` argument,\n", | ||
" - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True` will automatically reset the state of the iterator and prepare it to start the next epoch." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
" - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True` will automatically reset the state of the iterator and prepare it to start the next epoch." | |
" - `auto_reset`: this argument controls the behaviour of the iterator after the end of an epoch. If set to `True`, it will automatically reset the state of the iterator and prepare it to start the next epoch." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"With utilities defined above we can create an instance of the model we want to train." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"With utilities defined above we can create an instance of the model we want to train." | |
"With utilities defined above, we can create an instance of the model we want to train." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Some of the comments were also applicable to the basic JAX tutorial since it shares some content with this one. I applied them there as well. |
CI MESSAGE: [9239937]: BUILD PASSED |
@@ -0,0 +1,368 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I clicked the "training example with pure JAX" link, it redirected me to raw output of the jupyter notebook. Is this intentional? Or maybe the reviewing tool fails here?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the review tool can't handle these links. I checked them in the docs build.
@@ -0,0 +1,368 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this section "Getting started" link redirected me to the raw jupyter notebook and "pipeline documentation" to the raw rst file. Could you double-check these?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the review tool can't handle these links. I checked them in the docs build.
@@ -0,0 +1,368 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is described in the tutorials that are linked just above so I am going to leave it to keep the notebook as lean as possible and only related to JAX+DALI. Hope that's ok?
@@ -0,0 +1,368 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion:
"Next step is to instantiate pipelines and build them." -> "Next step is to instantiate DALI pipelines and build them."
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -0,0 +1,368 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestions:
"For DALI pipeline to work with JAX it needs to be wrapped with appropriate DALI iterator." -> "For DALI pipeline to work with JAX, the former needs to be wrapped with an appropriate DALI iterator."
"In addition to the pipeline we can pass the" -> "In addition to the DALI pipeline object we can pass the"
"Here is a quick explnation of how these parameters work:" - maybe it would be good to put a link to the full documentation somewhere here?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please double-check the links, cause they did not work correctly in the ReviewNB (but it might be a problem of the review tool)
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
Signed-off-by: Albert Wolant <awolant@nvidia.com>
!build |
CI MESSAGE: [9267807]: BUILD STARTED |
CI MESSAGE: [9267807]: BUILD FAILED |
CI MESSAGE: [9267807]: BUILD PASSED |
New tutorial showing how to train neural network implemented in Flax with DALI. It builds on the JAX training tutorial - significant parts are the same. Signed-off-by: Albert Wolant <awolant@nvidia.com>
Category:
New feature
Description:
New tutorial showing how to train neural network implemented in Flax with DALI. It builds on the JAX training tutorial - significant parts are the same.
Additional information:
Affected modules and functionalities:
JAX documentation.
Key points relevant for the review:
Is it understandable? Spelling, grammar?
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: DALI-3564