Blank renders #21

Closed
kristijanbartol opened this issue Jun 14, 2022 · 14 comments
@kristijanbartol
Owner

Introduction

Oh well, here we have it: the version issues... I already ran into an issue when I first installed HierProb3D and solved it by selecting a very narrow combination of PyTorch3D/PyTorch/CUDA versions. In particular, the solution was to use PyTorch3D==0.3.0.

Problem description

The problem is hidden in the training procedure, at the point where the images are rendered. The result is a black image if you use PyTorch3D>0.3.0. This triggers an error on line 304 of utils/image_utils.py:

bbox_corners[i, :2], _ = torch.min(body_pixels, dim=0)  # Top left

where the bbox is not extracted because the image is blank.
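
A guard before the reduction would at least turn this crash into an explicit "blank render" signal. Below is a minimal sketch, not the code from utils/image_utils.py: `bbox_from_mask` is a hypothetical helper, and it assumes the silhouette is a 2D tensor whose non-zero entries mark body pixels. It does not fix the blank render itself, it only makes the failure easier to spot.

```python
import torch


def bbox_from_mask(mask: torch.Tensor):
    """Return (top_left, bottom_right) as (row, col) tensors, or None if the mask is blank.

    Hypothetical helper for illustration; `mask` is assumed to be a 2D tensor
    whose non-zero entries mark body pixels.
    """
    # (N, 2) tensor of (row, col) indices of all non-zero pixels.
    body_pixels = torch.nonzero(mask != 0, as_tuple=False)
    if body_pixels.shape[0] == 0:
        # torch.min/torch.max over an empty dim 0 would raise, so bail out early.
        return None
    top_left, _ = torch.min(body_pixels, dim=0)
    bottom_right, _ = torch.max(body_pixels, dim=0)
    return top_left, bottom_right
```

A caller could log or skip the sample when `None` comes back instead of letting `torch.min` raise.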

Previous solution

Something that certainly works is the original combination of versions (the original Dockerfile), but then Julien can't use his GPU because of a PyTorch/GPU incompatibility.

New solution 1

We have to find a new combination of versions that is supported by Julien's GPU and also uses an old enough PyTorch3D (<0.5.0).
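
To compare candidate combinations, it helps to record exactly which versions are installed in each container. A small sketch using only the standard library; `installed_versions` is a made-up name, and the package names in the default tuple are assumptions about what the environment installs:

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchvision", "pytorch3d")):
    """Map each package name to its installed version string, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Inside a Python session with PyTorch installed, `torch.version.cuda` additionally reports the CUDA version that PyTorch build was compiled against.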

New solution 2

We have to modify the code so that it can produce the renders with a newer PyTorch3D.

@jufi2112 could you take this issue in the next two weeks, as I'm afraid it will take some time to figure out (I already struggled quite a bit originally)?

@kristijanbartol kristijanbartol added the bug Something isn't working label Jun 14, 2022
@kristijanbartol kristijanbartol changed the title from "Invalid PyTorch3D version" to "Blank renders" Jun 14, 2022
@jufi2112
Collaborator

will do

@jufi2112 jufi2112 self-assigned this Jun 14, 2022
@jufi2112
Collaborator

jufi2112 commented Jun 15, 2022

Btw @kristijanbartol is this related to the absence of predicted meshes during inference using HierProb3D?

@jufi2112
Collaborator

jufi2112 commented Jun 15, 2022

Traceback:

root@8033fb12ca5a:/garmentor# python run_train.py -E experiments/exp_001_julien

Device: cuda:0

Saving model checkpoints to: experiments/exp_001_julien/saved_models
Saving logs to: experiments/exp_001_julien/log.pkl
Saving config to: experiments/exp_001_julien/pose_shape_cfg.yaml

WARNING: experiments/exp_001_julien already exists - may be overwriting previous experiments!

DATA:
  BBOX_SCALE_FACTOR: 1.2
  BBOX_THRESHOLD: 0.95
  EDGE_GAUSSIAN_SIZE: 5
  EDGE_GAUSSIAN_STD: 1.0
  EDGE_NMS: True
  EDGE_THRESHOLD: 0.0
  HEATMAP_GAUSSIAN_STD: 4.0
  PROXY_REP_SIZE: 256
LOSS:
  NUM_SAMPLES: 8
  SAMPLE_ON_CPU: True
  STAGE1:
    J2D_LOSS_ON: means
    MF_OVERREG: 1.005
    REDUCTION: mean
    WEIGHTS:
      GLOB_ROTMATS: 5000.0
      JOINTS2D: 5000.0
      JOINTS3D: 0.0
      POSE: 80.0
      SHAPE: 50.0
      VERTS3D: 0.0
  STAGE2:
    J2D_LOSS_ON: means+samples
    MF_OVERREG: 1.005
    REDUCTION: mean
    WEIGHTS:
      GLOB_ROTMATS: 5000.0
      JOINTS2D: 30000.0
      JOINTS3D: 5000.0
      POSE: 10.0
      SHAPE: 80.0
      VERTS3D: 5000.0
  STAGE_CHANGE_EPOCH: 66
MODEL:
  DELTA_I: True
  DELTA_I_WEIGHT: 1.0
  EMBED_DIM: 256
  NUM_IN_CHANNELS: 18
  NUM_RESNET_LAYERS: 18
  NUM_SMPL_BETAS: 10
TRAIN:
  BATCH_SIZE: 4
  EPOCHS_PER_SAVE: 5
  LR: 0.0001
  NUM_EPOCHS: 300
  NUM_WORKERS: 2
  PIN_MEMORY: True
  SYNTH_DATA:
    AUGMENT:
      BBOX:
        DELTA_CENTRE_RANGE: [-5, 5]
        DELTA_SCALE_RANGE: [-0.3, 0.2]
      CAM:
        DELTA_Z_RANGE: [-0.5, 0.5]
        XY_STD: 0.05
      PROXY_REP:
        DELTA_J2D_DEV_RANGE: [-6, 6]
        EXTREME_CROP_PROB: 0.1
        JOINTS_SWAP_PROB: 0.1
        JOINTS_TO_SWAP: [[5, 6], [11, 12]]
        OCCLUDE_BOTTOM_PROB: 0.02
        OCCLUDE_BOX_DIM: 48
        OCCLUDE_BOX_PROB: 0.1
        OCCLUDE_TOP_PROB: 0.005
        OCCLUDE_VERTICAL_PROB: 0.05
        REMOVE_APPENDAGE_JOINTS_PROB: 0.5
        REMOVE_JOINTS_INDICES: [7, 8, 9, 10, 13, 14, 15, 16]
        REMOVE_JOINTS_PROB: 0.1
        REMOVE_PARTS_CLASSES: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
        REMOVE_PARTS_PROBS: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05]
      RGB:
        LIGHT_AMBIENT_RANGE: [0.4, 0.8]
        LIGHT_DIFFUSE_RANGE: [0.4, 0.8]
        LIGHT_LOC_RANGE: [0.05, 3.0]
        LIGHT_SPECULAR_RANGE: [0.0, 0.5]
        OCCLUDE_BOTTOM_PROB: 0.02
        OCCLUDE_TOP_PROB: 0.005
        OCCLUDE_VERTICAL_PROB: 0.05
        PIXEL_CHANNEL_NOISE: 0.2
      SMPL:
        SHAPE_STD: 1.25
    CROP_INPUT: True
    FOCAL_LENGTH: 300.0
    MEAN_CAM_T: [0.0, -0.2, 2.5]

Training poses found: 91106
Training textures found (grey, nongrey): 125 792
Training backgrounds found: 99414
Validation poses found: 33347
Validation textures found (grey, nongrey): 32 76
Validation backgrounds found: 3000 


Renderer projection type: perspective

Epoch 0/299
----------
Training.
  0%|                                                                                             | 0/22776 [00:00<?, ?it/s]
bbox_determiner: tensor([[[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]],

        [[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]],

        [[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]],

        [[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]], device='cuda:0')
body_pixels: tensor([], device='cuda:0', size=(0, 2), dtype=torch.int64)
  0%|                                                                                             | 0/22776 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "run_train.py", line 156, in <module>
    run_train(device=device,
  File "run_train.py", line 113, in run_train
    train_poseMF_shapeGaussian_net(pose_shape_model=pose_shape_model,
  File "/garmentor/train/train_poseMF_shapeGaussian_net.py", line 204, in train_poseMF_shapeGaussian_net
    crop_outputs = batch_crop_pytorch_affine(input_wh=(pose_shape_cfg.DATA.PROXY_REP_SIZE, pose_shape_cfg.DATA.PROXY_REP_SIZE),
  File "/garmentor/utils/image_utils.py", line 306, in batch_crop_pytorch_affine
    bbox_corners[i, :2], _ = torch.min(body_pixels, dim=0)  # Top left
IndexError: min(): Expected reduction dim 0 to have non-zero size.

jufi2112 added a commit that referenced this issue Jun 20, 2022
jufi2112 added a commit that referenced this issue Jun 21, 2022
jufi2112 added a commit that referenced this issue Jun 23, 2022
@jufi2112
Collaborator

The problem is somewhere in the rasterizer (it returns, e.g., a z-buffer containing only -1 values).

@kristijanbartol
Owner Author

We should at least know the exact line and whether it's solvable by changing the Python code...

@jufi2112
Collaborator

jufi2112 commented Jun 26, 2022

I also created a notebook in order to debug the pipeline: https://github.com/kristijanbartol/garmentor/blob/01693bfc5e031294591fe524ed686b3a6fb8a9bb/notebooks/PyTorch3D_Rendering.ipynb
EDIT: It seems I haven't pushed my latest code yet, which shows that the rasterizer returns -1 values everywhere. I'll do that tomorrow.

@kristijanbartol
Owner Author

Very nice notebook, but please push the latest code. I'm gonna take a look at it also...

@jufi2112
Collaborator

Forgot about it, sorry. Will do today in the morning

@jufi2112
Collaborator

@kristijanbartol done
In the last cells, you can see that the fragments returned by the rasterizer contain only -1 values in all of their fields.
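
For a quick numeric check of this outside the notebook, something like the sketch below could be used. It relies on PyTorch3D's documented convention that `pix_to_face` has shape (N, H, W, K) and is padded with -1 wherever no face covers a pixel; `covered_fraction` is a hypothetical helper name, not part of PyTorch3D or this repo.

```python
import torch


def covered_fraction(pix_to_face: torch.Tensor) -> torch.Tensor:
    """Per-image fraction of pixels that are covered by at least one face.

    pix_to_face: integer tensor of shape (N, H, W, K), where -1 means "no face".
    """
    # A pixel counts as covered if any of its K slots holds a valid face index.
    hit = (pix_to_face >= 0).any(dim=-1)  # (N, H, W)
    return hit.float().mean(dim=(1, 2))  # (N,)
```

With a rasterizer output `fragments`, `covered_fraction(fragments.pix_to_face)` being all zeros would mean that no face was rasterized in any image of the batch.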

@kristijanbartol
Owner Author

Awesome, thank you! Will take a look at it today. Do you have ANY idea why this might happen (I guess that you might have a bit more experience with computer graphics and rasterization)?

@jufi2112
Collaborator

I'm not sure myself, computer graphics isn't really my main field :D
My very naive idea was that maybe the camera is not looking at the mesh, so there is nothing to display. I talked with a colleague about it, but from his answer I gathered that the values should then be 1, not -1.
But I can ask Stefan about it when I see him, he should know
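
The camera hypothesis can be sanity-checked on paper with the values from the config above (FOCAL_LENGTH 300.0, MEAN_CAM_T [0.0, -0.2, 2.5]). The sketch below is a plain pinhole projection, not PyTorch3D's camera model: the axis conventions, the principal point at the image centre, and the 256-pixel image size (borrowed from PROXY_REP_SIZE) are all assumptions, and the function names are made up for illustration.

```python
def project_pinhole(point_xyz, cam_t, focal_length, image_size):
    """Project a 3D point to pixel coordinates with a plain pinhole model.

    Assumes the camera looks along +z and the principal point is the image centre.
    Returns None if the translated point lies behind (or on) the camera plane.
    """
    x, y, z = (p + t for p, t in zip(point_xyz, cam_t))
    if z <= 0:
        return None
    u = focal_length * x / z + image_size / 2
    v = focal_length * y / z + image_size / 2
    return u, v


def in_frame(uv, image_size):
    """True if the projected point exists and falls inside the square image."""
    return uv is not None and all(0 <= c < image_size for c in uv)


# Mesh origin under the mean camera translation from the config.
uv = project_pinhole((0.0, 0.0, 0.0), (0.0, -0.2, 2.5), 300.0, 256)
print(uv, in_frame(uv, 256))
```

Under these assumptions the mesh origin lands at roughly (128, 104), i.e. inside the frame. That only covers the nominal geometry; it says nothing about how a given PyTorch3D version interprets the same parameters.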

@kristijanbartol
Owner Author

Note that with PyTorch3D==0.3.0 this doesn't happen, using the exact same code...

@jufi2112
Collaborator

jufi2112 commented Jun 29, 2022

maybe related to this issue? facebookresearch/pytorch3d#561
I'm currently in a meeting and will not have time until maybe tomorrow to check whether this works

Edit: Nvm, the code already sets it to False by default
