Extending BCEWithLogitsLoss to non-binary labels #281

Open
f-dangel opened this issue Nov 4, 2022 · 0 comments
Labels
enhancement New feature or request

Comments

@f-dangel
Owner

f-dangel commented Nov 4, 2022

BackPACK's extensions that rely on the probabilistic interpretation of a loss function as a negative log likelihood (quantities based on the Fisher, i.e. BatchDiagGGNMC, DiagGGNMC, SqrtGGNMC, KFAC) are limited to binary labels for BCEWithLogitsLoss.

This issue serves as documentation for the required steps and problems to support continuous-valued labels.

Description: Currently, we assume binary labels $y_n \in \{0; 1\}$. In this case, BCEWithLogitsLoss corresponds to the negative log likelihood of a Bernoulli distribution $p(y \mid f_n)$, where $f_{n} \in (0; 1)$ is the sigmoid probability.

But BCEWithLogitsLoss also supports continuous labels $y_n \in [0; 1]$. In this case, BCEWithLogitsLoss corresponds to the negative log likelihood of a continuous Bernoulli distribution $p(y \mid f_{n}) \propto f_{n}^{y} (1 - f_n)^{1 - y}$, such that $- \log p(y=y_{n} \mid f_{n}) = -y_{n} \log(f_n) - (1 - y_n) \log(1 - f_n) + \mathrm{const}$, where the additive term comes from the normalizer of the distribution and does not depend on $y_n$.
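To make the relation between the loss and the continuous Bernoulli concrete, here is a minimal sketch in plain Python (no PyTorch, scalar inputs only). The function names are hypothetical and not part of BackPACK. It assumes the standard closed form of the continuous Bernoulli normalizer, written in terms of the logit $z$ with $f = \sigma(z)$, and checks numerically that the resulting density integrates to one over $[0; 1]$.

```python
import math


def bce_with_logits(z, y):
    """Stable evaluation of -y*log(f) - (1-y)*log(1-f) with f = sigmoid(z)."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


def log_normalizer(z):
    """Log of the continuous Bernoulli normalizer C = z / tanh(z / 2)."""
    if abs(z) < 1e-6:
        # Limit z -> 0 (f = 0.5): the density is uniform-like and C = 2.
        return math.log(2.0)
    return math.log(z / math.tanh(z / 2.0))


def continuous_bernoulli_nll(z, y):
    """Negative log density: the BCE value shifted by a y-independent term."""
    return bce_with_logits(z, y) - log_normalizer(z)


def integrate_density(z, num_steps=10_000):
    """Midpoint-rule integral of exp(-NLL) over y in [0, 1]."""
    h = 1.0 / num_steps
    return sum(
        math.exp(-continuous_bernoulli_nll(z, (i + 0.5) * h)) * h
        for i in range(num_steps)
    )


if __name__ == "__main__":
    for z in (-2.0, 0.0, 1.5):
        print(z, integrate_density(z))
```

The shift between the two quantities depends on the logit but not on the label, which is why it drops out when comparing losses for different labels at a fixed prediction.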

Implementation: Depending on the nature of the labels (binary or continuous), a different distribution (Bernoulli or continuous Bernoulli) must be used to compute sampled gradients. However, at the moment the _make_distribution function does not take the labels into account; it only receives the subsampled inputs. Hence, its interface must be adapted to support continuous labels in BCEWithLogitsLoss.
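A rough sketch of what a label-aware interface could look like, written in plain Python on scalar logits rather than against BackPACK's actual code. The `binary_labels` flag and all function names are assumptions for illustration. The continuous Bernoulli sampler uses inverse-CDF sampling derived from the density above; in PyTorch, the corresponding classes would be `torch.distributions.Bernoulli` and `torch.distributions.ContinuousBernoulli`.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_label(z, binary_labels, rng):
    """Draw one label from the model's distribution for logit z.

    binary_labels is a hypothetical flag that selects the distribution:
    Bernoulli if True, continuous Bernoulli otherwise.
    """
    u = rng.random()
    if binary_labels:
        # Bernoulli: label 1 with probability sigmoid(z), else 0.
        return 1.0 if u < sigmoid(z) else 0.0
    if abs(z) < 1e-8:
        # At z = 0 the continuous Bernoulli is uniform on [0, 1].
        return u
    # Inverse CDF of the density proportional to exp(z * y) on [0, 1].
    # (Sketch only: math.expm1 overflows for very large z.)
    return math.log1p(u * math.expm1(z)) / z


def sampled_gradient(z, binary_labels, rng):
    """Gradient of the BCE-with-logits loss w.r.t. z at a sampled label."""
    return sigmoid(z) - sample_label(z, binary_labels, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    print([sampled_gradient(2.0, True, rng) for _ in range(3)])
    print([sampled_gradient(2.0, False, rng) for _ in range(3)])
```

Passing the flag explicitly keeps the choice of distribution independent of the values that happen to be in the current batch.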

Problems:

  • Inferring the label type from the data means deciding at run time which properties the labels satisfy. If the data set has non-binary labels, but a batch (or a single sample) coincidentally contains only binary labels, this approach will use the wrong distribution. Not sure how to fix this, other than asking the user for the nature of their data.
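The pitfall described above can be reproduced with a small sketch in plain Python. The labels are made up for illustration, and `looks_binary` and `choose_distribution` are hypothetical helpers, not BackPACK functions. A per-batch check misclassifies a batch that happens to contain only 0s and 1s, while an explicit user-provided flag does not.

```python
def looks_binary(labels):
    """Run-time check: True if every label in the batch is exactly 0 or 1."""
    return all(y in (0.0, 1.0) for y in labels)


def choose_distribution(labels, binary_labels=None):
    """Pick a distribution name for a batch.

    If binary_labels is None, fall back to inspecting the batch (fragile).
    Otherwise trust the user's statement about the data set.
    """
    if binary_labels is None:
        binary_labels = looks_binary(labels)
    return "bernoulli" if binary_labels else "continuous_bernoulli"


# Made-up labels of a data set with continuous labels in [0, 1].
dataset_labels = [0.0, 0.3, 1.0, 0.75, 1.0, 0.0]

batch_a = dataset_labels[0:3]                      # [0.0, 0.3, 1.0]
batch_b = [dataset_labels[i] for i in (0, 2, 4)]   # [0.0, 1.0, 1.0]

if __name__ == "__main__":
    print(choose_distribution(batch_a))  # continuous_bernoulli
    print(choose_distribution(batch_b))  # bernoulli (wrong for this data set)
    print(choose_distribution(batch_b, binary_labels=False))  # continuous_bernoulli
```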
@f-dangel f-dangel added the enhancement New feature or request label Nov 4, 2022