Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a native dataloader to RN50 PyTorch example #4807

Merged
merged 5 commits into from
Apr 27, 2023
Merged

Conversation

JanuszL
Copy link
Contributor

@JanuszL JanuszL commented Apr 25, 2023

Category:

Other (e.g. Documentation, Tests, Configuration)

Description:

Additional information:

Affected modules and functionalities:

  • DALI RN50 PyTorch example
  • qa/TL1_jupyter_plugins/test_pytorch.sh

Key points relevant for the review:

  • NA

Tests:

  • Existing tests apply
    • TL3_RN50_convergence
    • TL1_jupyter_plugins
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@JanuszL
Copy link
Contributor Author

JanuszL commented Apr 25, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [8078512]: BUILD STARTED

@stiepan stiepan self-assigned this Apr 25, 2023
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [8078512]: BUILD PASSED

@JanuszL JanuszL changed the title Add a native dataloader to RN50 PyOtrch exmple Add a native dataloader to RN50 PyTorch example Apr 25, 2023
- extends DALI RN50 PyOtrch example based on
  https://github.com/NVIDIA/apex/tree/master/examples/imagenet
  by adding a native dataloader and easy ability to switch
  between DALI on and DALI off

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
def __iter__(self):
return self

def __next__(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this on top of the original APEX example to make it more aligned with DALI iterator.

@stiepan
Copy link
Member

stiepan commented Apr 26, 2023

Btw. Do we run the baseline anyhow in CI?

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Apr 26, 2023

Btw. Do we run the baseline anyhow in CI?

Added a basic run (to check if the script works).

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL
Copy link
Contributor Author

JanuszL commented Apr 26, 2023

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [8089640]: BUILD STARTED

Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few small nitpicks, I usually don't do that, but there are formatting differences between neighbouring lines of code.

Comment on lines 291 to 296
pipe.build()
train_loader = DALIClassificationIterator(pipe, reader_name="Reader",
last_batch_policy=LastBatchPolicy.PARTIAL,
auto_reset =True)

pipe = create_dali_pipeline(batch_size=args.batch_size,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, my eyes hurt due to overwriting of the pipeline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 331 to 340
train_loader = torch.utils.data.DataLoader(
train_dataset, batch_size=args.batch_size, shuffle=(train_sampler is None),
num_workers=args.workers, pin_memory=True, sampler=train_sampler, collate_fn=collate_fn)

val_loader = torch.utils.data.DataLoader(
val_dataset,
batch_size=args.batch_size, shuffle=False,
num_workers=args.workers, pin_memory=True,
sampler=val_sampler,
collate_fn=collate_fn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you format it in the same way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 312 to 321
train_dataset = datasets.ImageFolder(
traindir,
transforms.Compose([
transforms.RandomResizedCrop(crop_size),
transforms.RandomHorizontalFlip(),
]))
val_dataset = datasets.ImageFolder(valdir, transforms.Compose([
transforms.Resize(val_size),
transforms.CenterCrop(crop_size),
]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you format it in the same way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 88 to 89
parser.add_argument('--disable_dali', default=False, action='store_true',
help='If use DALI at all')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parser.add_argument('--disable_dali', default=False, action='store_true',
help='If use DALI at all')
parser.add_argument('--disable_dali', default=False, action='store_true',
help='Disable DALI data loader and use native PyTorch one instead.')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [8089640]: BUILD FAILED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [8089640]: BUILD PASSED

@JanuszL JanuszL merged commit 7b79cdb into NVIDIA:main Apr 27, 2023
@JanuszL JanuszL deleted the ref_rn50 branch April 27, 2023 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants