Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misaligned summary intervals cause different trajectory plots for identical models #104

Closed
lisa55asil opened this issue Jan 12, 2020 · 5 comments

Comments

@lisa55asil
Copy link
Contributor

With the new default random seed parameter we should get identical models when running songbird multinomial, however as @gwarmstrong pointed out to me because the 'check points' from tensorflow are not identical across runs the resulting plots may not align nicely (see crappy model fit below) even though the end points are identical.

It might be worthwhile to specify checkpoints to avoid confusion?

image

Code for reference:

qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t1.qza
--o-regression-stats q2-regression-stats-t1.qza
--o-regression-biplot q2-regression-biplot-t1.qza

qiime songbird multinomial
--i-table ../../table-deblur250-min1k-noMitoChlor.qza
--m-metadata-file ../../../../metadata-final-05282019.tsv
--p-formula "perbop"
--verbose
--o-differentials q2-differentials-t2.qza
--o-regression-stats q2-regression-stats-t2.qza
--o-regression-biplot q2-regression-biplot-t2.qza

qiime songbird summarize-paired
--i-regression-stats q2-regression-stats-t1.qza
--i-baseline-stats q2-regression-stats-t2.qza
--o-visualization q2-regression-stats-t12.qzv

@mortonjt
Copy link
Collaborator

mortonjt commented Jan 12, 2020 via email

@lisa55asil
Copy link
Contributor Author

Yes! I mean the summary interval.... my bad.
Just trying to say it would be ideal if the plots aligned perfectly when running the same model twice to avoid confusion :)

@lisa55asil lisa55asil changed the title Misaligned check points causes different trajectory plots for identical models Misaligned summary intervals cause different trajectory plots for identical models Jan 12, 2020
@mortonjt
Copy link
Collaborator

Hi @lisa55asil , here is the same issue that @fedarko raised.

#75

I'm going to close this, since it is a little redundant.

@gwarmstrong
Copy link
Member

@mortonjt I think this issue is slightly different than #75 and should be considered being re-opened.

The issue in #75 is that it is hard to compare models with different summary intervals. However, @lisa55asil used the same summary interval (default) in her commands above. The reason for the differences in her plots is due to the time-based summary-interval in MultRegression.fit.
So in the plots above, for example if the summary interval is 1s, the differences are due to the fact that when the clock has elapsed 1s, model 1 has gotten to iteration 1000, but model 2 has gotten to iteration 1050, due to something like different hardware or OS concerns. Even though Lisa didn't change anything about her command.

I think its important to reiterate, as in #101 , that QIIME users should be able to expect, or at least be provided a way, to receive the same results if running the same command. Having time-based updates when running with tensorboard is nice because you can potentially get updated on more intermediate results if you want, which may be helpful for a quicker or more fine-grained trouble-shooting of your model. But when running with Q2 we don't really know what we got until the end anyways, so I see no benefit to time-based updates vs. iteration based updates.

I think a somewhat graceful potential solution to this issue might look like something along the following lines:

  1. Adding argument to MultRegression.fit(..., summary_iteration_interval=None) where summary_iteration_interval is an optional Int
  2. If summary_iteration_interval is None, summarize the model in the current way
  3. Otherwise, record summaries when i % summary_iteration_interval ==0, and summary_interval is ignored
  4. Expose the new argument to CLI and QIIME2
  5. Adding an FAQ in the readme about this new argument

This solution should allow for the behavior to incorporated without breaking any existing scripts.

I'm happy to write-up a PR that does this.

@lisa55asil
Copy link
Contributor Author

lisa55asil commented Jan 13, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants