-
Notifications
You must be signed in to change notification settings - Fork 25
-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Misaligned summary intervals cause different trajectory plots for identical models #104
Comments
I'm a little confused - what's the problem? Are you referring to the
summary interval? The checkpoints refer to saving the model state.
…On Sat, Jan 11, 2020, 10:24 PM Lisa ***@***.***> wrote:
With the new default random seed parameter we should get identical models
when running songbird multinomial, however as @gwarmstrong
<https://github.com/gwarmstrong> pointed out to me because the 'check
points' from tensorflow are not identical across runs the resulting plots
may not align nicely (see crappy model fit below) even though the end
points are identical.
It might be worthwhile to specify checkpoints to avoid confusion?
[image: image]
<https://user-images.githubusercontent.com/20728562/72213638-0c5fa900-34a7-11ea-9841-72129ec03f0a.png>
Code for reference:
qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t1.qza
--o-regression-stats q2-regression-stats-t1.qza
--o-regression-biplot q2-regression-biplot-t1.qza
qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t2.qza
--o-regression-stats q2-regression-stats-t2.qza
--o-regression-biplot q2-regression-biplot-t2.qza
qiime songbird summarize-paired
--i-regression-stats q2-regression-stats-t1.qza
--i-baseline-stats q2-regression-stats-t2.qza
--o-visualization q2-regression-stats-t12.qzv
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#104?email_source=notifications&email_token=AA75VXKGQYTX4LZJSGOBJG3Q5KEP3A5CNFSM4KFVOCK2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4IFRV2TA>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AA75VXIPB4ZE7B5AQWODBSTQ5KEP3ANCNFSM4KFVOCKQ>
.
|
Yes! I mean the summary interval.... my bad. |
Hi @lisa55asil , here is the same issue that @fedarko raised. I'm going to close this, since it is a little redundant. |
@mortonjt I think this issue is slightly different than #75 and should be considered being re-opened. The issue in #75 is that it is hard to compare models with different summary intervals. However, @lisa55asil used the same summary interval (default) in her commands above. The reason for the differences in her plots is due to the time-based summary-interval in I think its important to reiterate, as in #101 , that QIIME users should be able to expect, or at least be provided a way, to receive the same results if running the same command. Having time-based updates when running with tensorboard is nice because you can potentially get updated on more intermediate results if you want, which may be helpful for a quicker or more fine-grained trouble-shooting of your model. But when running with Q2 we don't really know what we got until the end anyways, so I see no benefit to time-based updates vs. iteration based updates. I think a somewhat graceful potential solution to this issue might look like something along the following lines:
This solution should allow for the behavior to incorporated without breaking any existing scripts. I'm happy to write-up a PR that does this. |
Thanks for clarifying George!
If this was just a visualization issue I would say its fine to just add a
comment to the README but you also get a non-zero Q^2 value. I guess that's
because the final values that are used for the calculation may be slightly
different as described above?
I think George's proposed solution would be worth the effort.
…On Sun, Jan 12, 2020 at 2:58 PM George Armstrong ***@***.***> wrote:
@mortonjt <https://github.com/mortonjt> I think this issue is slightly
different than #75 <#75> and
should be considered being re-opened.
The issue in #75 <#75> is that
it is hard to compare models with different summary intervals. However,
@lisa55asil <https://github.com/lisa55asil> used the same summary
interval (default) in her commands above. The reason for the differences in
her plots is due to the time-based summary-interval in MultRegression.fit.
So in the plots above, for example if the summary interval is 1s, the
differences are due to the fact that when the clock has elapsed 1s, model 1
has gotten to iteration 1000, but model 2 has gotten to iteration 1050, due
to something like different hardware or OS concerns. Even though Lisa
didn't change anything about her command.
I think its important to reiterate, as in #101
<#101> , that QIIME users should
be able to expect, or at least be provided a way, to receive the same
results if running the same command. Having time-based updates when running
with tensorboard is nice because you can potentially get updated on more
intermediate results if you want, which may be helpful for a quicker or
more fine-grained trouble-shooting of your model. But when running with Q2
we don't really know what we got until the end anyways, so I see no benefit
to time-based updates vs. iteration based updates.
I think a somewhat graceful potential solution to this issue might look
like something along the following lines:
1. Adding argument to MultRegression.fit(...,
summary_iteration_interval=None) where summary_iteration_interval is
an optional Int
2. If summary_iteration_interval is None, summarize the model in the
current way
3. Otherwise, record summaries when i % summary_iteration_interval ==0,
and summary_interval is ignored
4. Expose the new argument to CLI and QIIME2
5. Adding an FAQ in the readme about this new argument
This solution should allow for the behavior to incorporated without
breaking any existing scripts.
I'm happy to write-up a PR that does this.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#104?email_source=notifications&email_token=AE6EV4XLGIPY2Z4OYKQLUA3Q5OOAVA5CNFSM4KFVOCK2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIXGTDA#issuecomment-573467020>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE6EV4RYLOXSHJB3U2VRHLDQ5OOAVANCNFSM4KFVOCKQ>
.
|
With the new default random seed parameter we should get identical models when running songbird multinomial, however as @gwarmstrong pointed out to me because the 'check points' from tensorflow are not identical across runs the resulting plots may not align nicely (see crappy model fit below) even though the end points are identical.
It might be worthwhile to specify checkpoints to avoid confusion?
Code for reference:
qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t1.qza
--o-regression-stats q2-regression-stats-t1.qza
--o-regression-biplot q2-regression-biplot-t1.qza
qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t2.qza
--o-regression-stats q2-regression-stats-t2.qza
--o-regression-biplot q2-regression-biplot-t2.qza
qiime songbird summarize-paired
--i-regression-stats q2-regression-stats-t1.qza
--i-baseline-stats q2-regression-stats-t2.qza
--o-visualization q2-regression-stats-t12.qzv
The text was updated successfully, but these errors were encountered: