Bug in Adam for non-real parameters #1051

Open
dvicini opened this issue Jan 25, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@dvicini
Member

dvicini commented Jan 25, 2024

Hi,

There appears to be an edge case in the current Adam implementation. If the optimizer contains non-real parameters, e.g. quaternions or complex numbers, the second-moment (variance) estimate is not computed as one would expect.

The Adam implementation computes

v_t = self.beta_2 * v_tp + (1 - self.beta_2) * dr.sqr(g_p)

where g_p is the parameter gradient. If a parameter is a quaternion or complex number, its gradient is of the same type, so dr.sqr computes a quaternion or complex product rather than the element-wise square that the Adam update expects.
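For illustration, the following minimal snippet (assuming a Dr.Jit version that still provides dr.sqr and the scalar Complex2f/Array2f types) shows the discrepancy:

import drjit as dr
from drjit.scalar import Complex2f, Array2f

z = Complex2f(1.0, 2.0)

# Complex product: (1 + 2i)^2 = -3 + 4i
print(dr.sqr(z))

# Element-wise square of the two components, which is what Adam's
# second-moment update actually needs: [1, 4]
print(dr.sqr(Array2f(dr.real(z), dr.imag(z))))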

While there is some literature on this subject (a PyTorch GitHub issue, the PyTorch docs, https://arxiv.org/pdf/0906.4835.pdf), the easiest practical solution is to simply optimize a Vector2f or Vector4f instead.
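As a rough sketch of that workaround (the parameter name and surrounding setup are illustrative, assuming an AD-enabled variant), a quaternion could be exposed to the optimizer through its raw components:

import mitsuba as mi
mi.set_variant('llvm_ad_rgb')

opt = mi.ad.Adam(lr=0.05)

# Store the rotation as a plain 4D vector instead of a quaternion
q = mi.Quaternion4f(0, 0, 0, 1)  # identity rotation
opt['rotation'] = mi.Vector4f(q[0], q[1], q[2], q[3])

# Later, inside the optimization loop, rebuild the quaternion from the
# optimized components whenever it is needed (re-normalizing if the
# application requires a unit quaternion)
r = opt['rotation']
q = mi.Quaternion4f(r[0], r[1], r[2], r[3])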

Maybe Mitsuba should raise an error if a quaternion or complex value is assigned to the optimizer?

dvicini added the bug label Jan 25, 2024
@njroussel
Member

Hi @dvicini

I agree with you. My suggestion would be to check whether the type is special (dr.is_special_v()) and, if so, raise an error as you suggested.
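A minimal sketch of such a check (subclassing here purely for illustration; the error message and placement are not meant as the final implementation):

import drjit as dr
import mitsuba as mi
mi.set_variant('llvm_ad_rgb')

class Adam(mi.ad.Adam):
    def __setitem__(self, key, value):
        # Reject quaternion, complex, and matrix-valued parameters up front
        if dr.is_special_v(type(value)):
            raise TypeError(
                f'Optimizer parameter "{key}": special types (quaternion, '
                'complex, matrix) are not supported -- optimize a plain '
                'array such as Vector4f instead.')
        super().__setitem__(key, value)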

I'll leave this open for a few more days before pushing a fix, in case anyone else has some ideas to contribute :)

@wjakob
Member

wjakob commented Jan 29, 2024

In the next version, the same issue will also appear with camera matrices and, e.g., complex IOR values.

The optimizer will need an extra check like

tp = type(g_p)
if dr.is_special_v(tp):
    # Reinterpret the gradient as a plain array so that dr.sqr and the
    # arithmetic below act element-wise
    tp_a = dr.array_t(tp)
    g_p = tp_a(g_p)
    ...
v_t = self.beta_2 * v_tp + (1 - self.beta_2) * dr.sqr(g_p)

@wjakob
Member

wjakob commented Jan 29, 2024

We could either wait until the next version or backport the dr.array_t type trait.
