Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poisson Regression Same Results in R as in Python? #376

Open
gaussianZeroOne opened this issue Mar 13, 2020 · 6 comments
Open

Poisson Regression Same Results in R as in Python? #376

gaussianZeroOne opened this issue Mar 13, 2020 · 6 comments

Comments

@gaussianZeroOne
Copy link

gaussianZeroOne commented Mar 13, 2020

I am trying to implement the same code from R using glmnet and in Python. The issue is that I am using Poisson regression.

In R, I want to run the following regression. A vector y on 3 variables. It is a simple example but I want to keep it simple to easily compare results in Python:

library(glmnet)
y= c(2.631007e-13, 2.953114e-09, 1.537151e-03)

covar = diag(3)

mod <- glmnet(x=covar, y=y, family="poisson")

The above produces the following coefficients: [1] 0.000000 0.000000 7.557667. Only the 3rd variable has any weight from the penalization.

In Python, using this package:

from pyglmnet import GLM

import numpy as np

glm = GLM(alpha=1, distr='poisson', tol=1e-07)

print(glm.fit(np.eye(3),np.array([2.631007e-13, 2.953114e-09, 1.537151e-03])).beta_)

This gives me [0,0,0] for the 3 coefficients.

I assume I may need to use K-Fold validation, and also search across various lambda (penalization) parameters. However, I am unsure how exactly to do this and if this is actually a feature of pyglmnet?

Has anyone been able to figure out how to run glmnet with poisson in Python to produce exactly the same results?

@jasmainak
Copy link
Member

yes it is! You should use GLMCV to search across various lambda. Can you make sure that these are matched between R and Python?

@gaussianZeroOne
Copy link
Author

gaussianZeroOne commented Mar 16, 2020

Thank you @jasmainak.

I've tried the following in pyglmnet:

from pyglmnet import GLMCV

glm = GLMCV(alpha=1, distr='poisson', tol=1e-07, cv=2)

print(glm.fit(np.eye(3),np.array([2.631007e-13, 2.953114e-09, 1.537151e-03])).beta_)

However it produces [0, 0, 0]. I am unable to tweak the pyglmnet to produce the exact same code as in R's glmnet.

Could it be that the lambda sequences are different? I noticed in R's glmneet it produces the sequence of coefficients through each iteration. Is there a way to do this as well in pyglmnet to see if the differences are due to different starting points?

@jasmainak
Copy link
Member

jasmainak commented Mar 17, 2020

@gaussianZeroOne could you also provide the R code that you have tried? I am not that familiar with R but I think @titipata might be able to help.

@titipata
Copy link
Collaborator

@jasmainak, I edited the code above which should have the R code to reproduce.

@titipata
Copy link
Collaborator

And yes, it seems like there is something going on with the optimizer. I tried the same example and couldn't get the same result as in R. Maybe @pavanramkumar knows more in detail why it does not give the same result in this case.

@jasmainak
Copy link
Member

cool thanks, I'll take a look tomorrow!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants