[Docs] Revise docs (change PackGenInputs and GenDataSample to mmediting ones) #1382

Merged 2 commits on Nov 2, 2022.

**docs/en/advanced_guides/3_transforms.md** (5 additions & 12 deletions)

````diff
@@ -45,10 +45,10 @@ The input and output types of transformations are both dict.
 dict_keys(['pair_path', 'pair', 'pair_ori_shape', 'img_mask', 'img_photo', 'img_mask_path', 'img_photo_path', 'img_mask_ori_shape', 'img_photo_ori_shape'])
 ```
 
-Generally, the last step of the transforms pipeline must be `PackGenInputs`.
-`PackGenInputs` will pack the processed data into a dict containing two fields: `inputs` and `data_samples`.
+Generally, the last step of the transforms pipeline must be `PackEditInputs`.
+`PackEditInputs` will pack the processed data into a dict containing two fields: `inputs` and `data_samples`.
 `inputs` is the variable you want to use as the model's input, which can be a `torch.Tensor`, a dict of `torch.Tensor`, or any type you want.
-`data_samples` is a list of `GenDataSample`. Each `GenDataSample` contains the ground truth and necessary information for the corresponding input.
+`data_samples` is a list of `EditDataSample`. Each `EditDataSample` contains the ground truth and necessary information for the corresponding input.
 
 ### An example of BasicVSR
````
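
To make the packed structure concrete, here is a minimal, hedged sketch of running the packing step by hand. It assumes the MMEditing 1.x transform API (`mmedit.datasets.transforms.PackEditInputs`); the image shape and file path are placeholders:

```python
# A hedged sketch, not the exact MMEditing behavior in every version:
# pack a loaded image and inspect the two resulting fields.
import numpy as np
from mmedit.datasets.transforms import PackEditInputs

results = {
    'img': np.random.rand(256, 256, 3).astype(np.float32),  # HWC image
    'img_path': './data/demo/00000.png',                    # placeholder path
    'ori_shape': (256, 256),
}
packed = PackEditInputs()(results)
print(sorted(packed.keys()))  # expected: ['data_samples', 'inputs']
```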

````diff
@@ -121,15 +121,8 @@ pipeline = [
         keys=[f'img_{domain_a}', f'img_{domain_b}'],
         direction='horizontal'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}', 'pair'],
-        meta_keys=[
-            'pair_path', 'sample_idx', 'pair_ori_shape',
-            f'img_{domain_a}_path', f'img_{domain_b}_path',
-            f'img_{domain_a}_ori_shape', f'img_{domain_b}_ori_shape', 'flip',
-            'flip_direction'
-        ])
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}', 'pair'])
 ]
 ```
 
 ## Supported transforms in MMEditing
````

**docs/en/user_guides/1_config.md** (2 additions & 2 deletions)

````diff
@@ -331,11 +331,11 @@ data_root = './data/ffhq/'  # Root path of data
 train_pipeline = [  # Training data processing pipeline
     dict(type='LoadImageFromFile', key='img'),  # First pipeline to load images from file path
     dict(type='Flip', keys=['img'], direction='horizontal'),  # Augmentation pipeline that flips the images
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])  # The last pipeline that formats the annotation data (if any) and decides which keys should be packed into data_samples
+    dict(type='PackEditInputs', keys=['img'])  # The last pipeline that formats the annotation data (if any) and decides which keys should be packed into data_samples
 ]
 val_pipeline = [
     dict(type='LoadImageFromFile', key='img'),  # First pipeline to load images from file path
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])  # The last pipeline that formats the annotation data (if any) and decides which keys should be packed into data_samples
+    dict(type='PackEditInputs', keys=['img'])  # The last pipeline that formats the annotation data (if any) and decides which keys should be packed into data_samples
 ]
 train_dataloader = dict(  # The config of the train dataloader
     batch_size=4,  # Batch size of a single GPU
````
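
As a hedged sketch of how such a pipeline config is consumed, the dataset can be built through the MMEditing registry; the dataset type and data root below are illustrative assumptions (borrowed from the unconditional-GAN docs further down), not part of this config:

```python
# Hedged sketch: build a dataset behind a pipeline config via the registry.
# `UnconditionalImageDataset` and data_root are illustrative choices.
from mmedit.registry import DATASETS

train_pipeline = [
    dict(type='LoadImageFromFile', key='img'),
    dict(type='Flip', keys=['img'], direction='horizontal'),
    dict(type='PackEditInputs', keys=['img'])
]
dataset = DATASETS.build(
    dict(
        type='UnconditionalImageDataset',
        data_root='./data/ffhq/',  # placeholder root, as in the config
        pipeline=train_pipeline))
item = dataset[0]  # runs the pipeline; yields 'inputs' and 'data_samples'
```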

**docs/en/user_guides/4_train_test.md** (1 addition & 1 deletion)

````diff
@@ -196,7 +196,7 @@ val_dataloader = dict(
         pipeline=[
             dict(type='LoadImageFromFile', key='img'),
             dict(type='Resize', scale=(64, 64)),
-            dict(type='PackGenInputs', meta_keys=[])
+            dict(type='PackEditInputs')
         ]),
     sampler=dict(type='DefaultSampler', shuffle=False),
     persistent_workers=True)
````

**tools/dataset_converters/image_translation/README.md** (5 additions & 7 deletions)

````diff
@@ -67,7 +67,7 @@ test_dataloader = dict(
 ```
 
 Here, we adopt `LoadPairedImageFromFile`, which loads a paired image as the common loader does and crops
-it into two images with the same shape in different domains. As shown in the example, `pipeline` provides an important data pipeline to process images, including loading from the file system, resizing, cropping, flipping, transferring to `torch.Tensor` and packing to `GenDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
+it into two images with the same shape in different domains. As shown in the example, `pipeline` provides an important data pipeline to process images, including loading from the file system, resizing, cropping, flipping, transferring to `torch.Tensor` and packing to `EditDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
 
 For translation models trained on unpaired data, like CycleGAN, `UnpairedImageDataset` is provided. Here is an example config for the horse2zebra dataset:
````

````diff
@@ -99,17 +99,15 @@ train_pipeline = [
     dict(type='Flip', keys=[f'img_{domain_a}'], direction='horizontal'),
     dict(type='Flip', keys=[f'img_{domain_b}'], direction='horizontal'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}'],
-        meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}'])
 ]
 test_pipeline = [
     dict(type='LoadImageFromFile', io_backend='disk', key='img', flag='color'),
     dict(type='Resize', scale=(256, 256), interpolation='bicubic'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}'],
-        meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}'])
 ]
 data_root = './data/horse2zebra/'
 # `batch_size` and `data_root` need to be set.
````
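
The crop described above amounts to splitting one side-by-side image into two same-shape halves. A minimal sketch of the idea, in plain NumPy rather than the MMEditing implementation, with illustrative shapes:

```python
# Hedged sketch: split a side-by-side paired image into two domain images
# of identical shape. Shapes are chosen purely for illustration.
import numpy as np

pair = np.zeros((256, 512, 3), dtype=np.float32)  # one H x 2W paired image
w = pair.shape[1] // 2
img_a, img_b = pair[:, :w], pair[:, w:]           # one half per domain
assert img_a.shape == img_b.shape == (256, 256, 3)
```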

**tools/dataset_converters/unconditional_gans/README.md** (3 additions & 3 deletions)

````diff
@@ -8,7 +8,7 @@ dataset_type = 'UnconditionalImageDataset'
 train_pipeline = [
     dict(type='LoadImageFromFile', key='img'),
     dict(type='Flip', keys=['img'], direction='horizontal'),
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])
+    dict(type='PackEditInputs', keys=['img'], meta_keys=['img_path'])
 ]
 
 # `batch_size` and `data_root` need to be set.
````

````diff
@@ -23,7 +23,7 @@ train_dataloader = dict(
         pipeline=train_pipeline))
 ```
 
-Here, we adopt `InfinitySampler` to avoid frequent dataloader reloading, which accelerates the training procedure. As shown in the example, `pipeline` provides an important data pipeline to process images, including loading from the file system, resizing, cropping, transferring to `torch.Tensor` and packing to `GenDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
+Here, we adopt `InfinitySampler` to avoid frequent dataloader reloading, which accelerates the training procedure. As shown in the example, `pipeline` provides an important data pipeline to process images, including loading from the file system, resizing, cropping, transferring to `torch.Tensor` and packing to `EditDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
 
 For unconditional GANs with dynamic architectures, like PGGAN and StyleGANv1, `GrowScaleImgDataset` is recommended for training. Since such dynamic architectures need real images at different scales, directly adopting `UnconditionalImageDataset` would bring heavy I/O costs from loading multiple high-resolution images. Here is an example we use for training PGGAN on the CelebA-HQ dataset:
````

````diff
@@ -33,7 +33,7 @@ dataset_type = 'GrowScaleImgDataset'
 pipeline = [
     dict(type='LoadImageFromFile', key='img'),
     dict(type='Flip', keys=['img'], direction='horizontal'),
-    dict(type='PackGenInputs')
+    dict(type='PackEditInputs')
 ]
 
 # `samples_per_gpu` and `imgs_root` need to be set.
````
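
The `InfinitySampler` mentioned above avoids frequent dataloader reloading by never exhausting the index stream, so workers are not re-spawned at epoch boundaries. A conceptual, hedged sketch of the idea in plain Python, not the MMEditing implementation:

```python
# Hedged sketch of the idea behind an infinite sampler: yield indices
# endlessly, reshuffling on each pass, so the dataloader never exhausts.
import itertools
import random

def infinite_indices(dataset_len: int, shuffle: bool = True):
    while True:
        order = list(range(dataset_len))
        if shuffle:
            random.shuffle(order)
        yield from order

# Take a few indices from an endless stream over a 5-item dataset.
print(list(itertools.islice(infinite_indices(5), 8)))
```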