Missing Support for BatchNorm and AdaptiveAvgPool in HBP methods (KFAC, KFRA, KFLR) and GGNMP #322
Comments
Hi, thanks for bringing up this documentation inconsistency. You are right that the docs claim too general support, although it is true that most of the second-order extensions support the mentioned layers. If you would like to see support for a specific layer and quantity that is currently missing, please feel free to specify.
Hi, thanks for getting back quickly. I am looking to experiment with KFLR and a Conjugate Gradient optimiser (using GGNMP) on a resnet18 model. I am fine running in eval mode, but both of these extensions do not have definitions for BatchNorm and AdaptiveAvgPool right now, which throws an error. Support for these would be greatly appreciated.
I tried manually making some changes, but I see that the issue is more than just missing links to the module extensions...
Thanks for boiling things down. I think what you are requesting requires a lot of new additions to BackPACK. They are not impossible to realize, but you would be mostly on your own to realize them.
I've made the following changes to get GGNMP working with the SumModule, but it seems my gradients are vanishing:
Accumulate for GGNMP:
Sum Module for GGNMP:
SumModule._jac_mat_prod:
Are you able to help?
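One thing that may be worth checking with skip connections, offered as a hedged sketch rather than a diagnosis: if the output curvature is propagated backward as a product of the form J^T H J, then for a residual block out = f(x) + x the Jacobian with respect to x is J_f + I. Summing the products of the two branches separately gives J_f^T H J_f + H, which drops the cross-terms J_f^T H + H J_f. The scalar example below only illustrates that arithmetic. The names `a`, `h`, `ggn_branch_sum`, and `ggn_full_jacobian` are hypothetical and are not part of BackPACK, and whether this applies depends on how the accumulation is actually implemented.

```python
# Scalar stand-ins (all hypothetical): one residual block out = f(x) + x
# with f(x) = a * x, and curvature h of the loss w.r.t. the block output.
a = 2.0  # Jacobian of the residual branch f
h = 3.0  # GGN w.r.t. the block output


def ggn_branch_sum(v):
    """Sum the per-branch products: (J_f^T h J_f) v + (I^T h I) v."""
    return a * h * a * v + 1.0 * h * 1.0 * v


def ggn_full_jacobian(v):
    """Use the block Jacobian J = J_f + I: (J^T h J) v."""
    jac = a + 1.0
    return jac * h * jac * v


v = 1.0
print(ggn_branch_sum(v))     # 15.0
print(ggn_full_jacobian(v))  # 27.0
```

The two results differ by the cross-terms (here 2 * a * h = 12), so a purely additive accumulation would not reproduce the GGN with respect to the block input in this setting.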
Hi, thanks for the update. Best,
I am using GGNMP alongside the implementation of a conjugate gradient optimiser provided as an example here. Just printing gradients (p.grad) at each optimisation step, I see that they become all zeros after a number of steps. It is possible that this is due to the modified resnet I am testing (disabled average pooling and BatchNorm for the time being as I was just testing to see if the summodule implementation was correct). Thanks,
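To separate solver behaviour from the curvature products, it can help to run conjugate gradient in isolation, with the matrix-vector product replaced by a small explicit matrix. The following is a minimal plain-Python sketch under that assumption. `cg_solve` and `matvec` are hypothetical names; this is neither the linked example optimiser nor BackPACK's API, and the 2x2 matrix only stands in for a GGN-vector product.

```python
def cg_solve(matvec, b, max_iter=50, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the starting point x = 0
    p = list(r)  # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old < tol:  # squared residual norm is small enough
            break
        Ap = matvec(p)
        pAp = sum(pi * api for pi, api in zip(p, Ap))
        if pAp <= 0.0:  # no positive curvature along p, stop
            break
        alpha = rs_old / pAp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Stand-in for a curvature-vector product: an explicit SPD matrix.
A = [[4.0, 1.0], [1.0, 3.0]]


def matvec(v):
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


x = cg_solve(matvec, [1.0, 2.0])     # exact solution is [1/11, 7/11]
x_zero = cg_solve(matvec, [0.0, 0.0])
```

In this sketch a zero right-hand side returns a zero step immediately, so once p.grad is all zeros the solver has nothing to work with; the zeros then have to originate upstream of the solver.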
I'm not sure if debugging the correctness of
Unsure if this is similar to a previous issue where there was simply a missing link, or whether there is a more fundamental reason why these, and other modules which the documentation claims are supported, are not actually supported.