
Add more AutoAugment policies #4753

Merged
merged 10 commits into from
Apr 6, 2023

Conversation

stiepan
Member

@stiepan stiepan commented Mar 29, 2023

Category:

New feature (non-breaking change which adds functionality)

Description:

This PR adds three more policies to the auto_augment module, in addition to the existing "v0" ImageNet policy: reduced ImageNet, reduced CIFAR-10, and SVHN. A new function, simply called auto_augment, is added to the auto_augment module as a convenience wrapper for applying AutoAugment with one of the predefined policies.

The predefined policies as introduced in the AutoAugment paper are a bit over-specified: 1. they specify meaningless magnitude bins for augmentations that do not accept magnitudes, 2. they specify some augmentations to be run with probability 0, 3. they specify some augmentations to be run with a magnitude for which the operation is in fact an identity. This PR adds warnings and adjusts the policy definitions to address points 1 and 2.
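The cleanup of points 1 and 2 can be illustrated with a small, self-contained sketch. The encoding below is hypothetical (it is not DALI's actual Policy class): a sub-policy is a list of (op_name, probability, magnitude_bin) triples, and the `NO_MAGNITUDE_OPS` set is illustrative.

```python
# Hypothetical encoding of an AA sub-policy as (op_name, probability, magnitude_bin)
# triples; names and structure are illustrative, not DALI's actual Policy class.
NO_MAGNITUDE_OPS = {"AutoContrast", "Equalize", "Invert"}

def clean_sub_policy(sub_policy):
    cleaned = []
    for name, prob, magnitude_bin in sub_policy:
        if prob == 0:
            # point 2: an operation applied with probability 0 never runs
            continue
        if name in NO_MAGNITUDE_OPS:
            # point 1: the magnitude bin is meaningless for magnitude-free ops
            magnitude_bin = None
        cleaned.append((name, prob, magnitude_bin))
    return cleaned

raw = [("Invert", 0.0, 9), ("Equalize", 0.6, 3), ("Rotate", 0.8, 5)]
cleaned = clean_sub_policy(raw)
```

Point 3 (identity magnitudes) is left in place by the PR, which only warns about it.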

Now all of the AA, TA, and RA modules use both the translation_x and translation_y augmentations, so I removed the _get_translation_y helper from auto_aug and moved _get_translations from RA to a common utility shared across the three modules.

Additionally, the PR fills some gaps in the documentation (e.g. max_translate_abs/rel) and adds docs to the utilities.
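In spirit, the convenience wrapper described above dispatches a policy name to one of the predefined policies. The sketch below is a hypothetical pure-Python illustration of that dispatch; the real DALI auto_augment function's signature, policy names, and return value differ.

```python
# Hypothetical dispatch sketch; not the actual DALI auto_augment implementation.
PREDEFINED_POLICIES = {
    "image_net": "image_net_policy_v0",
    "reduced_image_net": "reduced_image_net_policy",
    "reduced_cifar10": "reduced_cifar10_policy",
    "svhn": "svhn_policy",
}

def auto_augment(data, policy_name="image_net"):
    if policy_name not in PREDEFINED_POLICIES:
        raise ValueError(f"Unknown policy {policy_name!r}; "
                         f"expected one of {sorted(PREDEFINED_POLICIES)}")
    # A real implementation would apply the selected policy to the batch;
    # here we just report which predefined policy was chosen.
    return (PREDEFINED_POLICIES[policy_name], data)

selected, _ = auto_augment("image", "svhn")
```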

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3299

@stiepan stiepan added the automatic augmentations Automatic augmentations (AutoAugment, RandAugment, TrivialAugment and more) support in DALI. label Mar 29, 2023
Comment on lines 94 to 143
@params(*tuple(itertools.product((True, False), (0, 1), ('height', 'width', 'both'))))
def test_translation(use_shape, offset_fraction, extent):
    # make sure the translation helper processes the args properly
    # note, it only uses translate_y (as it is in imagenet policy)
    shape = [300, 400]
    fill_value = 217
    params = {}
    if use_shape:
        param = offset_fraction
        param_name = "max_translate_rel"
    else:
        param_name = "max_translate_abs"
        if extent == 'both':
            param = shape[0] * offset_fraction
        elif extent == 'height':
            param = [shape[0] * offset_fraction, 0]
        elif extent == 'width':
            param = [0, shape[1] * offset_fraction]
        else:
            assert False, f"Unrecognized extent={extent}"
    params[param_name] = param
    translate_y = auto_augment._get_translate_y(use_shape=use_shape, **params)
    policy = Policy(f"Policy_{use_shape}_{offset_fraction}", num_magnitude_bins=21,
                    sub_policies=[[(translate_y, 1, 20)]])

    @experimental.pipeline_def(enable_conditionals=True, batch_size=3, num_threads=4, device_id=0,
                               seed=43)
    def pipeline():
        encoded_image, _ = fn.readers.file(name="Reader", file_root=images_dir)
        image = fn.decoders.image(encoded_image, device="mixed")
        image = fn.resize(image, size=shape)
        if use_shape:
            return auto_augment.apply_auto_augment(policy, image, fill_value=fill_value,
                                                   shape=shape)
        else:
            return auto_augment.apply_auto_augment(policy, image, fill_value=fill_value)

    p = pipeline()
    p.build()
    output, = p.run()
    output = [np.array(sample) for sample in output.as_cpu()]
    for i, sample in enumerate(output):
        sample = np.array(sample)
        if offset_fraction == 1 and extent != "width":
            assert np.all(sample == fill_value), f"sample_idx: {i}"
        else:
            background_count = np.sum(sample == fill_value)
            assert background_count / sample.size < 0.1, \
                f"sample_idx: {i}, {background_count / sample.size}"


Member Author

It's now common utility for AA/TA/RA modules. Tested in test_augmentations.
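The test's fill-value assertions rely on a simple geometric fact: translating an image vertically by its full height leaves nothing but the fill value, while a small translation leaves the background fraction low. A plain NumPy sketch of that effect (a simplified stand-in, not DALI's warp-based translation):

```python
import numpy as np

def translate_y(img, offset, fill_value):
    # Shift rows down by `offset`, filling the vacated rows with fill_value
    # (simplified stand-in for DALI's warp-based translate_y augmentation).
    out = np.full_like(img, fill_value)
    height = img.shape[0]
    if offset < height:
        out[offset:] = img[:height - offset]
    return out

img = np.arange(12, dtype=np.uint8).reshape(3, 4)
full = translate_y(img, 3, 217)     # offset == height: only fill_value remains
partial = translate_y(img, 1, 217)  # small offset: mostly original pixels
```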

Comment on lines 69 to 122
@params(*tuple(itertools.product((True, False), (0, 1), ('height', 'width', 'both'))))
def test_translation(use_shape, offset_fraction, extent):
    # make sure the translation helper processes the args properly
    shape = [300, 400]
    fill_value = 105
    params = {}
    if use_shape:
        param = offset_fraction
        param_name = "max_translate_rel"
    else:
        param_name = "max_translate_abs"
        assert extent in ('height', 'width', 'both'), f"{extent}"
        if extent == 'both':
            param = [shape[0] * offset_fraction, shape[1] * offset_fraction]
        elif extent == 'height':
            param = [shape[0] * offset_fraction, 0]
        elif extent == 'width':
            param = [0, shape[1] * offset_fraction]
    params[param_name] = param
    translate_x, translate_y = rand_augment._get_translations(use_shape=use_shape, **params)
    if extent == 'both':
        augments = [translate_x, translate_y]
    elif extent == 'height':
        augments = [translate_y]
    elif extent == 'width':
        augments = [translate_x]

    @experimental.pipeline_def(enable_conditionals=True, batch_size=3, num_threads=4, device_id=0,
                               seed=43)
    def pipeline():
        encoded_image, _ = fn.readers.file(name="Reader", file_root=images_dir)
        image = fn.decoders.image(encoded_image, device="mixed")
        image = fn.resize(image, size=shape)
        if use_shape:
            return rand_augment.apply_rand_augment(augments, image, n=1, m=30,
                                                   fill_value=fill_value, shape=shape)
        else:
            return rand_augment.apply_rand_augment(augments, image, n=1, m=30,
                                                   fill_value=fill_value)

    p = pipeline()
    p.build()
    output, = p.run()
    output = [np.array(sample) for sample in output.as_cpu()]
    for i, sample in enumerate(output):
        sample = np.array(sample)
        if offset_fraction == 1:
            assert np.all(sample == fill_value), f"sample_idx: {i}"
        else:
            background_count = np.sum(sample == fill_value)
            assert background_count / sample.size < 0.1, \
                f"sample_idx: {i}, {background_count / sample.size}"


Member Author

It's now common utility for AA/TA/RA modules. Tested in test_augmentations.

Comment on lines 66 to 112
@params(*tuple(itertools.product((True, False), (0, 1), ('x', 'y'))))
def test_translation(use_shape, offset_fraction, extent):
    # make sure the translation helper processes the args properly
    fill_value = 0
    params = {}
    if use_shape:
        param = offset_fraction
        param_name = "max_translate_rel"
    else:
        param = 1000 * offset_fraction
        param_name = "max_translate_abs"
    params[param_name] = param
    translation_x, translation_y = trivial_augment._get_translations(use_shape=use_shape, **params)
    augment = [translation_x] if extent == 'x' else [translation_y]

    @experimental.pipeline_def(enable_conditionals=True, batch_size=9, num_threads=4, device_id=0,
                               seed=43)
    def pipeline():
        encoded_image, _ = fn.readers.file(name="Reader", file_root=images_dir)
        image = fn.decoders.image(encoded_image, device="mixed")
        if use_shape:
            shape = fn.peek_image_shape(encoded_image)
            return trivial_augment.apply_trivial_augment(augment, image, num_magnitude_bins=3,
                                                         fill_value=fill_value, shape=shape)
        else:
            return trivial_augment.apply_trivial_augment(augment, image, num_magnitude_bins=3,
                                                         fill_value=fill_value)

    p = pipeline()
    p.build()
    output, = p.run()
    output = [np.array(sample) for sample in output.as_cpu()]
    if offset_fraction == 1:
        # magnitudes are random here, but some should randomly be maximal
        all_black = 0
        for i, sample in enumerate(output):
            sample = np.array(sample)
            all_black += np.all(sample == fill_value)
        assert all_black
    else:
        for i, sample in enumerate(output):
            sample = np.array(sample)
            background_count = np.sum(sample == fill_value)
            assert background_count / sample.size < 0.1, \
                f"sample_idx: {i}, {background_count / sample.size}"


Member Author

It's now common utility for AA/TA/RA modules. Tested in test_augmentations.

Comment on lines -242 to -257
def _get_translations(use_shape: bool = False, max_translate_abs: Optional[int] = None,
                      max_translate_rel: Optional[float] = None) -> List[_Augmentation]:
    max_translate_height, max_translate_width = _parse_validate_offset(
        use_shape, max_translate_abs=max_translate_abs, max_translate_rel=max_translate_rel,
        default_translate_abs=100, default_translate_rel=100 / 224)
    if use_shape:
        return [
            a.translate_x.augmentation((0, max_translate_width), True),
            a.translate_y.augmentation((0, max_translate_height), True),
        ]
    else:
        return [
            a.translate_x_no_shape.augmentation((0, max_translate_width), True),
            a.translate_y_no_shape.augmentation((0, max_translate_height), True),
        ]
Member Author

@stiepan stiepan Mar 30, 2023

It's now common utility for AA/TA/RA modules. Defined in core._utils.
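For readers following the refactor, a rough pure-Python sketch of what such a shared offset-parsing helper might do: normalize a scalar or (height, width) pair into per-axis limits and reject mixing absolute and relative arguments. The name, defaults, and error messages below are illustrative assumptions, not the exact code in core._utils.

```python
# Illustrative re-implementation; the real helper lives in
# nvidia.dali.auto_aug.core._utils and may differ in details.
def parse_validate_offset(use_shape, max_translate_abs=None, max_translate_rel=None,
                          default_translate_abs=100, default_translate_rel=100 / 224):
    if use_shape:
        if max_translate_abs is not None:
            raise ValueError("max_translate_abs cannot be used with use_shape=True")
        value = default_translate_rel if max_translate_rel is None else max_translate_rel
    else:
        if max_translate_rel is not None:
            raise ValueError("max_translate_rel cannot be used with use_shape=False")
        value = default_translate_abs if max_translate_abs is None else max_translate_abs
    if isinstance(value, (list, tuple)):
        # a pair limits height and width separately
        height, width = value
        return height, width
    # a scalar applies the same limit to both axes
    return value, value
```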

Comment on lines -179 to -195


def _get_translations(use_shape: bool = False, max_translate_abs: Optional[int] = None,
                      max_translate_rel: Optional[float] = None) -> List[_Augmentation]:
    max_translate_height, max_translate_width = _parse_validate_offset(
        use_shape, max_translate_abs=max_translate_abs, max_translate_rel=max_translate_rel,
        default_translate_abs=32, default_translate_rel=1.)
    if use_shape:
        return [
            a.translate_x.augmentation((0, max_translate_width), True),
            a.translate_y.augmentation((0, max_translate_height), True),
        ]
    else:
        return [
            a.translate_x_no_shape.augmentation((0, max_translate_width), True),
            a.translate_y_no_shape.augmentation((0, max_translate_height), True),
        ]
Member Author

It's now common utility for AA/TA/RA modules. Defined in core._utils.

@stiepan
Member Author

stiepan commented Mar 30, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7763263]: BUILD STARTED

    param_shape = shape
    param_name = "max_translate_abs"
    if extent == 'both':
        param = [param_shape[0] * offset_fraction, param_shape[1] * offset_fraction]
Contributor

nitpick:

Suggested change
param = [param_shape[0] * offset_fraction, param_shape[1] * offset_fraction]
param = offset_fraction * param_shape[:2]

or

Suggested change
param = [param_shape[0] * offset_fraction, param_shape[1] * offset_fraction]
param = offset_fraction * param_shape

Member Author

I am not sure that helps: on lists, * performs repetition (concatenation), not elementwise broadcasting, so I would either get [] or the same list back.
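The distinction the author points out can be seen in plain Python: multiplying a list by an integer repeats the list, multiplying it by a float raises, and only NumPy arrays broadcast the multiplication elementwise.

```python
import numpy as np

shape = [300, 400]
# list * int repeats (concatenates) the list; it does not scale the elements
repeated = shape * 2          # [300, 400, 300, 400]
emptied = shape * 0           # []
# list * float is not defined at all
try:
    shape * 0.5
    float_ok = True
except TypeError:
    float_ok = False
# NumPy arrays, by contrast, broadcast the multiplication elementwise
scaled = (np.array(shape) * 0.5).tolist()  # [150.0, 200.0]
```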

@dali-automaton
Collaborator

CI MESSAGE: [7763263]: BUILD PASSED

@stiepan
Member Author

stiepan commented Apr 3, 2023

Rebased onto #4747

@stiepan
Member Author

stiepan commented Apr 3, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7802387]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [7802387]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [7802387]: BUILD PASSED

@stiepan stiepan mentioned this pull request Apr 4, 2023
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
…always skipped augmentations

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan
Member Author

stiepan commented Apr 4, 2023

Rebased onto #4751

@stiepan
Member Author

stiepan commented Apr 4, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7815611]: BUILD STARTED

-    augmentations. If tuple is specified, the first component limits height, the second the
-    width.
+    augmentations. If a tuple is specified, the first component limits height, the second the
+    width. Defaults to 250.
     max_translate_rel: float or (float, float), optional
Contributor

The tuple form is not reflected in the type annotation, but I am not sure such annotations wouldn't be overkill. The docs rendering of such long annotations is a bit problematic.

@dali-automaton
Collaborator

CI MESSAGE: [7815611]: BUILD FAILED

@stiepan
Member Author

stiepan commented Apr 6, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [7836392]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [7836392]: BUILD PASSED

@stiepan stiepan merged commit ed89a2a into NVIDIA:main Apr 6, 2023
@JanuszL JanuszL mentioned this pull request Sep 6, 2023