diff --git a/docs/en/advanced_guides/3_transforms.md b/docs/en/advanced_guides/3_transforms.md
index 4b1fcf9617..565a3499ab 100644
--- a/docs/en/advanced_guides/3_transforms.md
+++ b/docs/en/advanced_guides/3_transforms.md
@@ -45,10 +45,10 @@ The input and output types of transformations are both dict.
 dict_keys(['pair_path', 'pair', 'pair_ori_shape', 'img_mask', 'img_photo', 'img_mask_path', 'img_photo_path', 'img_mask_ori_shape', 'img_photo_ori_shape'])
 ```
 
-Generally, the last step of the transforms pipeline must be `PackGenInputs`.
-`PackGenInputs` will pack the processed data into a dict containing two fields: `inputs` and `data_samples`.
-`inputs` is the variable you want to use as the model's input, which can be the type of `torch.Tensor`, dict of `torch.Tensor`, or any type you want.
-`data_samples` is a list of `GenDataSample`. Each `GenDataSample` contains groundtruth and necessary information for corresponding input.
+Generally, the last step of the transforms pipeline must be `PackEditInputs`.
+`PackEditInputs` packs the processed data into a dict containing two fields: `inputs` and `data_samples`.
+`inputs` is the variable you want to use as the model's input, which can be a `torch.Tensor`, a dict of `torch.Tensor`, or any other type you need.
+`data_samples` is a list of `EditDataSample`. Each `EditDataSample` contains the ground truth and the necessary information for the corresponding input.
 
 ### An example of BasicVSR
 
@@ -121,15 +121,9 @@ pipeline = [
         keys=[f'img_{domain_a}', f'img_{domain_b}'],
         direction='horizontal'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}', 'pair'],
-        meta_keys=[
-            'pair_path', 'sample_idx', 'pair_ori_shape',
-            f'img_{domain_a}_path', f'img_{domain_b}_path',
-            f'img_{domain_a}_ori_shape', f'img_{domain_b}_ori_shape', 'flip',
-            'flip_direction'
-        ])
-]
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}', 'pair'])
+]
 ```
 
 ## Supported transforms in MMEditing
diff --git a/docs/en/user_guides/1_config.md b/docs/en/user_guides/1_config.md
index b7ffbbae1a..9bf3cbbdcd 100644
--- a/docs/en/user_guides/1_config.md
+++ b/docs/en/user_guides/1_config.md
@@ -331,11 +331,11 @@ data_root = './data/ffhq/'  # Root path of data
 train_pipeline = [  # Training data process pipeline
     dict(type='LoadImageFromFile', key='img'),  # First pipeline to load images from file path
-    dict(type='Flip', keys=['img'], direction='horizontal'),  # Argumentation pipeline that flip the images
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])  # The last pipeline that formats the annotation data (if have) and decides which keys in the data should be packed into data_samples
+    dict(type='Flip', keys=['img'], direction='horizontal'),  # Augmentation pipeline that flips the images
+    dict(type='PackEditInputs', keys=['img'])  # The last pipeline, which formats the annotation data (if any) and decides which keys should be packed into data_samples
 ]
 val_pipeline = [
     dict(type='LoadImageFromFile', key='img'),  # First pipeline to load images from file path
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])  # The last pipeline that formats the annotation data (if have) and decides which keys in the data should be packed into data_samples
+    dict(type='PackEditInputs', keys=['img'])  # The last pipeline, which formats the annotation data (if any) and decides which keys should be packed into data_samples
 ]
 train_dataloader = dict(  # The config of train dataloader
     batch_size=4,  # Batch size of a single GPU
diff --git a/docs/en/user_guides/4_train_test.md b/docs/en/user_guides/4_train_test.md
index c48b6e94ba..d6595111ff 100644
--- a/docs/en/user_guides/4_train_test.md
+++ b/docs/en/user_guides/4_train_test.md
@@ -196,7 +196,7 @@ val_dataloader = dict(
         pipeline=[
             dict(type='LoadImageFromFile', key='img'),
             dict(type='Resize', scale=(64, 64)),
-            dict(type='PackGenInputs', meta_keys=[])
+            dict(type='PackEditInputs')
         ]),
     sampler=dict(type='DefaultSampler', shuffle=False),
     persistent_workers=True)
diff --git a/tools/dataset_converters/image_translation/README.md b/tools/dataset_converters/image_translation/README.md
index 85cec47c16..51e5312cdd 100644
--- a/tools/dataset_converters/image_translation/README.md
+++ b/tools/dataset_converters/image_translation/README.md
@@ -67,7 +67,7 @@ test_dataloader = dict(
 ```
 
-Here, we adopt `LoadPairedImageFromFile` to load a paired image as the common loader does and crops
-it into two images with the same shape in different domains. As shown in the example, `pipeline` provides important data pipeline to process images, including loading from file system, resizing, cropping, flipping, transferring to `torch.Tensor` and packing to `GenDataSample`. All of supported data pipelines can be found in `mmedit/datasets/transforms`.
+Here, we adopt `LoadPairedImageFromFile`, which loads a paired image as a common loader does and crops
+it into two images of the same shape in different domains. As shown in the example, `pipeline` defines the data pipeline that processes images, including loading from the file system, resizing, cropping, flipping, converting to `torch.Tensor` and packing into `EditDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
 
-For unpaired-data trained translation model like CycleGAN , `UnpairedImageDataset` is designed to train such translation models. Here is an example config for horse2zebra dataset:
+For translation models trained on unpaired data, such as CycleGAN, `UnpairedImageDataset` is provided. Here is an example config for the horse2zebra dataset:
 
@@ -99,17 +99,15 @@ train_pipeline = [
     dict(type='Flip', keys=[f'img_{domain_a}'], direction='horizontal'),
     dict(type='Flip', keys=[f'img_{domain_b}'], direction='horizontal'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}'],
-        meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}'])
 ]
 test_pipeline = [
     dict(type='LoadImageFromFile', io_backend='disk', key='img', flag='color'),
     dict(type='Resize', scale=(256, 256), interpolation='bicubic'),
     dict(
-        type='PackGenInputs',
-        keys=[f'img_{domain_a}', f'img_{domain_b}'],
-        meta_keys=[f'img_{domain_a}_path', f'img_{domain_b}_path'])
+        type='PackEditInputs',
+        keys=[f'img_{domain_a}', f'img_{domain_b}'])
 ]
 data_root = './data/horse2zebra/'
 # `batch_size` and `data_root` need to be set.
diff --git a/tools/dataset_converters/unconditional_gans/README.md b/tools/dataset_converters/unconditional_gans/README.md
index 70d5524f4f..a15b6204de 100644
--- a/tools/dataset_converters/unconditional_gans/README.md
+++ b/tools/dataset_converters/unconditional_gans/README.md
@@ -8,7 +8,7 @@ dataset_type = 'UnconditionalImageDataset'
 train_pipeline = [
     dict(type='LoadImageFromFile', key='img'),
     dict(type='Flip', keys=['img'], direction='horizontal'),
-    dict(type='PackGenInputs', keys=['img'], meta_keys=['img_path'])
+    dict(type='PackEditInputs', keys=['img'])
 ]
 
 # `batch_size` and `data_root` need to be set.
@@ -23,7 +23,7 @@ train_dataloader = dict(
         pipeline=train_pipeline))
 ```
 
-Here, we adopt `InfinitySampler` to avoid frequent dataloader reloading, which will accelerate the training procedure. As shown in the example, `pipeline` provides important data pipeline to process images, including loading from file system, resizing, cropping, transferring to `torch.Tensor` and packing to `GenDataSample`. All of supported data pipelines can be found in `mmedit/datasets/transforms`.
+Here, we adopt `InfinitySampler` to avoid frequent dataloader reloading, which accelerates the training procedure. As shown in the example, `pipeline` defines the data pipeline that processes images, including loading from the file system, resizing, cropping, converting to `torch.Tensor` and packing into `EditDataSample`. All supported data pipelines can be found in `mmedit/datasets/transforms`.
 
-For unconditional GANs with dynamic architectures like PGGAN and StyleGANv1, `GrowScaleImgDataset` is recommended to use for training. Since such dynamic architectures need real images in different scales, directly adopting `UnconditionalImageDataset` will bring heavy I/O cost for loading multiple high-resolution images. Here is an example we use for training PGGAN in CelebA-HQ dataset:
+For unconditional GANs with dynamic architectures, such as PGGAN and StyleGANv1, `GrowScaleImgDataset` is recommended for training. Since such dynamic architectures need real images at different scales, directly adopting `UnconditionalImageDataset` would incur a heavy I/O cost from loading multiple high-resolution images. Here is an example we use for training PGGAN on the CelebA-HQ dataset:
 
@@ -33,7 +33,7 @@ dataset_type = 'GrowScaleImgDataset'
 pipeline = [
     dict(type='LoadImageFromFile', key='img'),
     dict(type='Flip', keys=['img'], direction='horizontal'),
-    dict(type='PackGenInputs')
+    dict(type='PackEditInputs')
 ]
 
 # `samples_per_gpu` and `imgs_root` need to be set.
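As a quick sanity check on the rename, the sketch below runs the packing transform by hand. It is a minimal sketch, assuming MMEditing 1.x with the renamed classes available: the hand-built `results` dict stands in for the output of earlier transforms such as `LoadImageFromFile`, and `demo.png` is a hypothetical path.

```python
# Minimal sketch, assuming MMEditing 1.x: run PackEditInputs by hand and
# inspect the two fields described in the transforms guide above.
import numpy as np

from mmedit.datasets.transforms import PackEditInputs

# Toy stand-in for the output of earlier transforms such as
# `LoadImageFromFile`; 'demo.png' is a hypothetical path.
results = dict(
    img=np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8),
    img_path='demo.png')

packed = PackEditInputs(keys=['img'])(results)

# Per the edited docs, `inputs` holds what is fed to the model, while
# `data_samples` holds an `EditDataSample` carrying ground truth and meta
# information such as `img_path`, which is why the explicit `meta_keys`
# lists can be dropped from the configs.
print(packed.keys())
print(packed['data_samples'])
```

If this prints the `inputs` and `data_samples` fields that the transforms guide describes, the pipelines above can safely omit their `meta_keys` arguments.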