
Memory access issues during optimization with wrap_ad and a single element batch #1007

struminsky opened this issue Dec 11, 2023 · 0 comments

Summary

When rendering with a batch sensor that contains a single element, I encounter an illegal memory access on the second backward pass. The issue occurs when I use wrap_ad to integrate the renderer into a PyTorch-based optimization loop.

System configuration

System information:

OS: Ubuntu 22.04.3 LTS
CPU: 12th Gen Intel(R) Core(TM) i7-12700
GPU: NVIDIA GeForce RTX 4080
Python: 3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]
NVidia driver: 535.129.03
CUDA: 12.3.103
LLVM: -1.-1.-1

Dr.Jit: 0.4.3
Mitsuba: 3.4.0
Is custom build? False
Compiled with: GNU 10.2.1
Variants:
scalar_rgb
scalar_spectral
cuda_ad_rgb
llvm_ad_rgb

Description

I am trying to render a scene from multiple views and use the outputs to compute a loss in PyTorch (kudos for the interoperability and the great documentation). My code works fine with $n > 1$ sensors, but with a single sensor I encounter an illegal memory access during the second backward pass. Analogous code with a Dr.Jit-based optimization loop also works fine.

The error message I get is

Critical Dr.Jit compiler failure: cuda_check(): API error 0700 (CUDA_ERROR_ILLEGAL_ADDRESS): "an illegal memory access was encountered" in /project/ext/drjit-core/src/malloc.cpp:237.
Aborted (core dumped)

The code below should reproduce the bug.

import mitsuba as mi
import torch
import drjit as dr

mi.set_variant('cuda_ad_rgb')

def get_sensor_batch(batch_size):
    sensor = {
        'type': 'batch',
        'film': {
            'type': 'hdrfilm',
            'width': 256 * batch_size, 'height': 256,
            'sample_border': True
        }
    }
    for it in range(batch_size):
        sensor[f'sensor_{it}'] = {
            'type': 'perspective',
            'to_world': mi.ScalarTransform4f.look_at(
                target=[0.0, 0.0, 0.0],
                origin=[1.0, 0.0, 0.0],
                up=[0.0, 0.0, 1.0]
            ),
            'fov': 45,
        }
    return mi.load_dict(sensor)

def main(batch_size):
    scene = mi.load_file('./dragon/scene.xml')  # a scene from the gallery
    params = mi.traverse(scene)
    key = 'DirectionalEmitter.irradiance.value'

    @dr.wrap_ad(source='torch', target='drjit')
    def my_render(param, batch_size=4):
        sensor_batch = get_sensor_batch(batch_size)
        params[key] = dr.mean(param)  # I did not find a better way to convert the tensor into a float
        params.update()
        return mi.render(scene, params, sensor=sensor_batch, spp=4)

    device = torch.device('cuda:0')
    target = torch.eye(3, device=device)[0]
    param = torch.zeros([1], device=device, requires_grad=True)
    optimizer = torch.optim.Adam([param], lr=1e-2)

    for it in range(32):
        image = my_render(param, batch_size)
        loss = (image - target).abs().mean()
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

if __name__ == '__main__':
    batch_size = 1
    main(batch_size)

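For comparison, the Dr.Jit-based loop mentioned above (which does not crash for me) looks roughly like the sketch below. Treat it as a sketch rather than a verbatim copy: the exact `mi.ad.Adam` usage is reproduced from memory, `get_sensor_batch` is the same helper as in the script above, and the reference image `target` is left unspecified.

```python
# Sketch of the analogous Dr.Jit-native optimization loop (no wrap_ad, no
# PyTorch). Assumes the same scene, parameter key, and get_sensor_batch
# helper as the reproduction script above.
import mitsuba as mi
import drjit as dr

mi.set_variant('cuda_ad_rgb')

scene = mi.load_file('./dragon/scene.xml')
params = mi.traverse(scene)
key = 'DirectionalEmitter.irradiance.value'

# Register the differentiated parameter with Mitsuba's Adam optimizer.
opt = mi.ad.Adam(lr=1e-2)
opt[key] = params[key]
params.update(opt)

target = ...  # reference image as an mi.TensorXf (left unspecified here)

for it in range(32):
    sensor_batch = get_sensor_batch(1)  # single-element batch, as above
    image = mi.render(scene, params, sensor=sensor_batch, spp=4)
    loss = dr.mean(dr.abs(image - target))
    dr.backward(loss)   # backpropagate through the rendering
    opt.step()          # gradient step on the registered parameter
    params.update(opt)  # push the updated value back into the scene
```

This variant runs through all 32 iterations without the illegal memory access, which is why the crash appears specific to the wrap_ad / PyTorch path with batch_size = 1.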
Steps to reproduce

  1. Download the dragon scene from the gallery
  2. Run the above script