
Add momentum operator #4571

Merged (3 commits) on Oct 18, 2017
Conversation

@sidgoyal78 (Contributor) commented Oct 3, 2017

This PR adds an implementation of the momentum operator.

In summary, we want to perform the parameter update with a new velocity vector:

 velocity = mu * velocity + grad
 param = param - learning_rate * velocity

where mu is the momentum coefficient.
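For clarity, the two update equations can be sketched in plain C++ on flat arrays (this is only an illustrative elementwise version, not the actual Eigen-based Paddle kernel):

```cpp
#include <cstddef>
#include <vector>

// Elementwise momentum update, as described above:
//   velocity = mu * velocity + grad
//   param    = param - learning_rate * velocity
// `param` and `velocity` are updated in place.
void MomentumUpdate(std::vector<float>& param,
                    std::vector<float>& velocity,
                    const std::vector<float>& grad,
                    float mu, float learning_rate) {
  for (std::size_t i = 0; i < param.size(); ++i) {
    velocity[i] = mu * velocity[i] + grad[i];
    param[i] -= learning_rate * velocity[i];
  }
}
```

For example, with mu = 0.9, learning_rate = 0.1, param = {1, 2}, velocity = {0, 0}, and grad = {0.5, 0.5}, one step gives velocity = {0.5, 0.5} and param = {0.95, 1.95}.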

auto p = EigenVector<T>::Flatten(*ctx.Input<Tensor>("Param"));
auto g = EigenVector<T>::Flatten(*ctx.Input<Tensor>("Grad"));
auto v = EigenVector<T>::Flatten(*ctx.Input<Tensor>("Velocity"));
float lr = ctx.Input<Tensor>("LearningRate")->data<float>()[0];
Collaborator commented:
That might not be good for GPU. If the LearningRate is in GPU memory, we cannot read the float directly on the host.

Contributor Author replied:
Okay, thanks.

Contributor Author replied:

Fixed as per #4598

@sidgoyal78 requested review from dzhwinter and removed the request for abhinavarora on October 5, 2017.
@dzhwinter (Contributor) left a comment:

LGTM.

class MomentumOpKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const override {
    auto param_out = ctx.Output<framework::Tensor>("ParamOut");
Contributor commented:

These variables would read better if declared with auto * to make the underlying pointer type explicit.
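The style point can be shown with a minimal stand-alone example. The `Tensor` struct and `GetOutput` function below are hypothetical stand-ins for the framework types in the snippet above, not Paddle API:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a framework tensor type.
struct Tensor { std::string name; };

// Hypothetical factory that returns a pointer, like ctx.Output<...>().
Tensor* GetOutput(Tensor& storage) { return &storage; }

// Plain `auto` deduces Tensor* here too, but writing `auto *` makes it
// obvious at the declaration site that param_out is a pointer.
inline std::string OutputName(Tensor& t) {
  auto* param_out = GetOutput(t);
  return param_out->name;
}
```

Both spellings compile to the same type; `auto *` simply documents the pointer-ness for the reader and fails to compile if the initializer is ever changed to a non-pointer.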

@dzhwinter (Contributor) commented:
Thanks for this PR, @sidgoyal78! Since our book chapters depend heavily on these optimizer operators, let's merge this PR ASAP. We can leave the name-style unification work for the future.

@dzhwinter dzhwinter merged commit fd96914 into PaddlePaddle:develop Oct 18, 2017