Adjust test scripts and section header for webadataset notebook #3162
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
!build
CI MESSAGE: [2610399]: BUILD STARTED
CI MESSAGE: [2610804]: BUILD STARTED
CI MESSAGE: [2610399]: BUILD PASSED
"metadata": {},
"source": [
"## Introduction\n",
"### Data Representation\n",
"Web Dataset is a dataset representation that heavily optimizes networked accessed storage performance. At its simplest, it stores the whole dataset in one tarball file, where individual samples are kept under the files with the same names but different extensions. This approach improves drive access caching on the RAM, since the data is represented sequentially.\n",
"Web Dataset is a dataset representation that heavily optimizes networked accessed storage performance. At its simplest, it stores the whole dataset in one tarball file, where individual samples are kept under the files with the same names but different extensions. This approach improves drive access caching on the RAM, since the data is represented sequentially."
Not being very well versed in WebDataset, I find this a bit unclear. Does it mean this?
"Web Dataset is a dataset representation that heavily optimizes networked accessed storage performance. At its simplest, it stores the whole dataset in one tarball file, where individual samples are kept under the files with the same names but different extensions. This approach improves drive access caching on the RAM, since the data is represented sequentially."
"Web Dataset is a dataset representation that heavily optimizes networked accessed storage performance. At its simplest, it stores the whole dataset in one tarball file, where each sample is represented by one or more entries with the same name but different extensions. This approach improves drive access caching in RAM, since the data is represented sequentially."
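For readers following this thread, the layout under discussion can be illustrated with a minimal sketch that uses only Python's standard `tarfile` module, not the WebDataset library or the DALI reader. The helper names `build_tar` and `group_samples`, the sample keys, and the `jpg`/`cls` extensions are hypothetical, and the sketch assumes the extension is everything after the first dot in the entry name.

```python
import io
import tarfile


def build_tar(samples):
    """Write each sample as consecutive tar entries sharing one base name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, components in samples:
            for ext, payload in components.items():
                # Entry name is "<sample key>.<extension>" (illustrative convention).
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def group_samples(tar_bytes):
    """Read the archive sequentially and group entries by base name."""
    grouped = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar:
            # Assumption: the extension is everything after the first dot.
            key, _, ext = member.name.partition(".")
            grouped.setdefault(key, {})[ext] = tar.extractfile(member).read()
    return grouped


# Two hypothetical samples, each with an image payload and a class label.
samples = [
    ("000", {"jpg": b"image-bytes-0", "cls": b"0"}),
    ("001", {"jpg": b"image-bytes-1", "cls": b"1"}),
]
archive = build_tar(samples)
print(group_samples(archive))
```

Because the entries of one sample sit next to each other in the archive, a single sequential pass is enough to reassemble every sample, which is the access pattern the notebook text refers to.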
CI MESSAGE: [2610804]: BUILD PASSED
Description
What happened in this PR
Fix CI scripts.
Move the ### Sharding section header in the notebook to a separate cell, as it didn't render correctly in the docs.
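As an illustration only, the kind of change described here can be sketched on the notebook's JSON structure, where a markdown cell stores its text as a list of lines under "source". The function name `split_cell_at_header` and the example cell contents are hypothetical; the PR does not say whether the notebook was edited by hand or with a script.

```python
def split_cell_at_header(cell, header):
    """Split a markdown cell so that `header` begins a new cell of its own."""
    source = cell["source"]
    # Locate the header line, ignoring its trailing newline.
    index = next(i for i, line in enumerate(source) if line.rstrip("\n") == header)
    head, tail = list(source[:index]), list(source[index:])
    if not head:
        # The header already starts the cell, so there is nothing to split.
        return [cell]
    # By notebook convention the last line of a cell has no trailing newline.
    head[-1] = head[-1].rstrip("\n")
    first = {**cell, "source": head}
    second = {"cell_type": "markdown", "metadata": {}, "source": tail}
    return [first, second]


# Hypothetical cell in which the header trails the paragraph text.
cell = {
    "cell_type": "markdown",
    "metadata": {},
    "source": ["## Introduction\n", "Some paragraph text.\n", "### Sharding"],
}
for part in split_cell_at_header(cell, "### Sharding"):
    print(part["source"])
```

Dropping the trailing newline from the last line of the first cell matches the change visible in the diff hunk above, where the paragraph line loses its "\n".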
Additional information
CI scripts
Docs
Dependencies in CI scripts?
Checklist
Tests
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A