add plot cost in v2 api #1712

Merged (4 commits) on Mar 29, 2017


jacquesqiao (Member) commented Mar 28, 2017

See the resulting plot: image

demo code

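import paddle.v2 as paddle
import paddle.v2.plot.plot_curve as plot_curve

# trainer, feeding and uci_housing are defined in the surrounding training script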

plot_cost = plot_curve.PlotCost()

step = 0

def event_handler(event):
    global step
    if isinstance(event, paddle.event.EndIteration):
        if step % 10 == 0:  # every 10 batches, record a train cost
            plot_cost.append_train_cost(step, event.cost)

        if step % 10 == 0:  # every 10 batches, record a test cost
            result = trainer.test(
                reader=paddle.batch(
                    uci_housing.test(), batch_size=2),
                feeding=feeding)
            plot_cost.append_test_cost(step, result.cost)

        if step % 100 == 0: # every 100 batches, update cost plot
            plot_cost.plot()

        step += 1
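The recording cadence in the handler above can be checked without Paddle or matplotlib. The following is a minimal sketch, not the PR's implementation: `CostRecorder` and `simulate` are hypothetical names, the cost values are made up, and only the method names `append_train_cost` and `append_test_cost` are taken from the demo.

```python
class CostRecorder(object):
    """Hypothetical stand-in for PlotCost that only stores the points."""

    def __init__(self):
        self.train = []  # list of (step, cost) pairs
        self.test = []   # list of (step, cost) pairs

    def append_train_cost(self, step, cost):
        self.train.append((step, cost))

    def append_test_cost(self, step, cost):
        self.test.append((step, cost))


def simulate(num_batches, train_every=10, test_every=10):
    """Replay the handler's modulo logic over num_batches fake batches."""
    recorder = CostRecorder()
    for step in range(num_batches):
        cost = 1.0 / (step + 1)  # made-up, steadily decreasing cost
        if step % train_every == 0:
            recorder.append_train_cost(step, cost)
        if step % test_every == 0:
            # made-up test cost, slightly above the train cost
            recorder.append_test_cost(step, cost * 1.1)
    return recorder


recorder = simulate(35)
print([step for step, _ in recorder.train])  # [0, 10, 20, 30]
```

Because the counter starts at 0, the first batch is always recorded, whatever interval is chosen.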

from IPython import display


class PlotCost(object):
Contributor:

What gets plotted is not necessarily cost; it could also be AUC, classification error, and so on, and this plotting approach can be shared among them. So the name could be changed, for example to MetricPlotting; remember that we do have the concept of a metric. The cost-related names below could all be changed as well.

Member, Author:

I'm wondering whether AUC and the like should be handled by a dedicated class. For example, cost and AUC have different scales and value ranges; if they are plotted together, some curves may not be visible.
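The scale concern can be illustrated without any plotting library. The sketch below groups recorded points by metric and computes a separate y-range for each, which a plotting front end could map onto one subplot per metric. `MetricSeries` and `panels` are hypothetical names, not part of this PR, and the values are made up.

```python
from collections import OrderedDict


class MetricSeries(object):
    """Hypothetical container: metric name -> curve label -> [(step, value)]."""

    def __init__(self):
        self._series = OrderedDict()

    def append(self, metric, label, step, value):
        curves = self._series.setdefault(metric, OrderedDict())
        curves.setdefault(label, []).append((step, value))

    def panels(self):
        """Return one (metric, curves, (ymin, ymax)) tuple per metric."""
        result = []
        for metric, curves in self._series.items():
            values = [v for points in curves.values() for _, v in points]
            result.append((metric, curves, (min(values), max(values))))
        return result


series = MetricSeries()
series.append("cost", "train", 0, 35.2)   # made-up values
series.append("cost", "train", 10, 12.8)
series.append("auc", "test", 0, 0.61)
series.append("auc", "test", 10, 0.74)

for metric, _, y_range in series.panels():
    print("%s %s" % (metric, y_range))
# cost (12.8, 35.2)
# auc (0.61, 0.74)
```

On a shared axis spanning 0 to 35, the AUC curve would be squeezed into a sliver near zero; keeping a range per metric avoids that.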

Collaborator:

import paddle.v2.plot.cost
import paddle.v2.plot.auc
import paddle.v2.plot.classification_error

typhoonzero (Contributor) commented:

Just recording this issue:

After this PR is merged, the matplotlib dependency needs to be installed in the production Docker image.

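jacquesqiao (Member, Author) commented Mar 28, 2017

Quoting: "the matplotlib dependency needs to be installed in the production Docker image"

I have thought about this too. matplotlib is 80 MB+, and this plot feature will mostly be used in notebooks, so do we need to add such a large dependency for it? I'd like to hear everyone's opinions.
@typhoonzero @qingqing01 @wangkuiyi @reyoung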

@@ -0,0 +1,37 @@
import matplotlib.pyplot as plt
from IPython import display

Collaborator:

If this directory (package) were called plot and this file (module) were called cost.py, could we add more plotting features in the future, for example:

import paddle.plot.cost
import paddle.plot.auc
...

Member, Author:

Great idea!

@@ -0,0 +1,37 @@
import matplotlib.pyplot as plt
Collaborator:

This module makes the production image depend on matplotlib, which is very large. Should this module be moved from PaddlePaddle/Paddle to PaddlePaddle/book?

Member, Author:

1. With the current approach, no error is raised as long as the user does not explicitly import paddle.v2.plot_curve; anyone who does import it can be assumed to be writing code in a notebook.
2. The book repo has no place for a Python library. If the code went into book, it could not be shared, so it should live somewhere that gets packaged into paddle-xxx.whl and installed.
3. In fact, some code in the current Python lib already imports matplotlib, and it likewise causes no problem as long as it is not called.
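The points above rely on the plotting module simply never being imported when matplotlib is absent. A related pattern, shown here only as a sketch and not what this PR does (the PR imports matplotlib at the top of the module), defers the import to the moment plotting is requested and reports clearly which package is missing. `require` and `plot_costs` are hypothetical names.

```python
import importlib


def require(module_name, feature):
    """Import module_name on demand; explain which feature needs it if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise ImportError(
            "%s requires the optional package '%s'; install it to use this feature"
            % (feature, module_name))


def plot_costs(steps, costs):
    # matplotlib is loaded only when a plot is actually requested, so the
    # enclosing module stays importable in an image without matplotlib.
    plt = require("matplotlib.pyplot", "plot_costs")
    return plt.plot(steps, costs)
```

With this shape, an image without matplotlib can import the module freely, and only a call to `plot_costs` fails, with a message naming the missing package.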

Collaborator:

Got it. So this code can live in PaddlePaddle/Paddle, while matplotlib does not have to be installed in the production image and only needs to be installed in the book image. Right?

Member, Author:

Right. If a user really wants to use this feature in the production image, they need to install matplotlib manually.

wangkuiyi (Collaborator) left a comment:

In the next round of PRs we can rename things:

plot_curve.py => cost.py
class PlotCost => class Cost

@jacquesqiao jacquesqiao merged commit 4c6dee9 into PaddlePaddle:develop Mar 29, 2017
4 participants