Implementing the Adamax optimizer operator #4538

Merged: 14 commits, Oct 9, 2017

Contributor

@abhinavarora abhinavarora commented Sep 30, 2017

Fixes #4515

float beta_1 = ctx.Attr<float>("beta_1");
float beta_2 = ctx.Attr<float>("beta_2");
float epsilon = ctx.Attr<float>("epsilon");
int t = ctx.Attr<int>("time_step");
Collaborator

The timestep should not be an attribute. It should be an input of Adamax. That input type could be int.
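
For illustration, here is a minimal sketch of an Adamax update in which the time step arrives as an input alongside the other state, rather than as a fixed attribute. It is plain Python, not the operator's C++ kernel. The function name `adamax_step`, the argument names, the default values, and the placement of `epsilon` inside the infinity-norm update are assumptions for this sketch, not the code in this PR.

```python
def adamax_step(param, grad, moment, inf_norm, time_step,
                learning_rate=0.002, beta_1=0.9, beta_2=0.999, epsilon=1e-8):
    """One Adamax update on flat lists of floats (hypothetical helper).

    time_step is passed in by the caller on every call (starting at 1),
    so it can change between runs, unlike a fixed attribute.
    Returns the new (param, moment, inf_norm).
    """
    # Exponential moving average of the gradient.
    moment_out = [beta_1 * m + (1 - beta_1) * g for m, g in zip(moment, grad)]
    # Exponentially weighted infinity norm; epsilon guards against division by zero.
    inf_norm_out = [max(beta_2 * u + epsilon, abs(g)) for u, g in zip(inf_norm, grad)]
    # Bias-corrected learning rate depends on the current time step.
    lr_t = learning_rate / (1 - beta_1 ** time_step)
    param_out = [p - lr_t * m / u
                 for p, m, u in zip(param, moment_out, inf_norm_out)]
    return param_out, moment_out, inf_norm_out


# Usage: the caller owns the step counter and feeds it in each time.
state = ([1.0, -2.0], [0.0, 0.0], [0.0, 0.0])  # param, moment, inf_norm
for step in (1, 2):
    state = adamax_step(state[0], [0.5, -0.25], state[1], state[2], time_step=step)
```

Because the bias correction `1 - beta_1 ** time_step` changes on every update, a value baked into the operator's attributes would have to be rewritten before each run, which is presumably why an input is the better fit.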

Contributor Author

Thank you for the feedback. I will change this.

Contributor Author

@reyoung Fixed in abd6181



class TestAdamaxOp1(OpTest):
    def setUp(self):
Contributor

You provided two TestAdamaxOp classes and noted that the second one tests the default attributes. It would help to add a comment to TestAdamaxOp1 explaining its purpose as well. Also, I couldn't find any difference between the two tests. If the first one is meant to test explicitly set attributes, its attribute values should differ from the defaults.
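
As a rough sketch of the distinction being asked for, the two cases could compute their expected values like this. The helper `adamax_reference`, the input values, and the default attribute values shown are assumptions made for illustration. The real tests in this PR build on `OpTest`, which is not reproduced here.

```python
def adamax_reference(param, grad, moment, inf_norm, time_step, lr,
                     beta_1=0.9, beta_2=0.999, epsilon=1e-8):
    """Scalar Adamax step used to compute expected outputs (hypothetical helper)."""
    moment_out = beta_1 * moment + (1 - beta_1) * grad
    inf_norm_out = max(beta_2 * inf_norm + epsilon, abs(grad))
    lr_t = lr / (1 - beta_1 ** time_step)
    return param - lr_t * moment_out / inf_norm_out, moment_out, inf_norm_out


# Shared inputs for both cases (arbitrary illustrative values).
inputs = dict(param=1.0, grad=0.5, moment=0.2, inf_norm=0.4, time_step=8, lr=0.002)

# Case 1: explicit attributes, deliberately different from the defaults.
explicit_attrs = dict(beta_1=0.78, beta_2=0.899, epsilon=1e-5)
expected_explicit = adamax_reference(**inputs, **explicit_attrs)

# Case 2: no attributes passed, so the defaults apply.
expected_default = adamax_reference(**inputs)

# If both cases produced the same expectation, one of them would be redundant.
cases_differ = expected_explicit != expected_default
```

With attribute values that differ from the defaults, the first case can only pass if the operator actually reads its attributes, while the second can only pass if the defaults are wired up correctly.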

Contributor Author

Thank you. I forgot to remove the attributes from the second one. I will change this.

kexinzhao previously approved these changes Oct 6, 2017
}

    def test_check_output(self):
        self.check_output()
Member

@jacquesqiao jacquesqiao Oct 7, 2017

Tests for this kind of operator (an optimizer with state) should be more thorough, because the state accumulates across updates. The state changes on every run, so the test should run the operator several times and check that the state is correct after each step.
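
A minimal sketch of such a multi-step check follows, in plain Python with a hypothetical scalar `adamax_step` helper (the names, default values, and `epsilon` placement are assumptions, not this PR's test code). It relies on a property that is easy to verify by hand: with a constant gradient and zero initial state, the moment after step `t` equals `(1 - beta_1**t) * grad`, and the infinity norm stays at `abs(grad)`.

```python
def adamax_step(param, grad, moment, inf_norm, time_step, lr=0.002,
                beta_1=0.9, beta_2=0.999, epsilon=1e-8):
    """Scalar Adamax step (hypothetical helper); returns new param and state."""
    moment_out = beta_1 * moment + (1 - beta_1) * grad
    inf_norm_out = max(beta_2 * inf_norm + epsilon, abs(grad))
    lr_t = lr / (1 - beta_1 ** time_step)
    return param - lr_t * moment_out / inf_norm_out, moment_out, inf_norm_out


param, moment, inf_norm = 1.0, 0.0, 0.0
grad = 0.5
history = []

# Run several updates, feeding each step's output state into the next step.
for t in range(1, 6):
    param, moment, inf_norm = adamax_step(param, grad, moment, inf_norm, t)
    history.append((param, moment, inf_norm))
    # Check the accumulated state after every step, not just the last one.
    assert abs(moment - (1 - 0.9 ** t) * grad) < 1e-12
    assert inf_norm == abs(grad)
```

In this constant-gradient setting each update moves the parameter by exactly the learning rate, so the final parameter gives one more independent check on the whole sequence.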

Contributor Author

Fixed in af36e75

Member

@jacquesqiao jacquesqiao left a comment

Great job! LGTM~

@abhinavarora abhinavarora merged commit 4cb5bd9 into PaddlePaddle:develop Oct 9, 2017
@abhinavarora abhinavarora deleted the adamax branch October 9, 2017 17:57

4 participants