Vignette on sampling parameter values #1456

Open
RyanGutenkunst opened this issue Mar 14, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@RyanGutenkunst

We could use a vignette on sampling parameter values for a given model. For example, how does one easily simulate with various values for a divergence time in a given model?

Several users have requested this functionality, because it can be core for training machine learning models. It will also be useful for generating test data for Ryan's proposed competition.

It's been said that we can do this with the current Python API. A simple example sampling over a single parameter probably suffices to illustrate the principle. Depending on how ugly it is, maybe we'll want to develop a cleaner API for modifying parameters programmatically.

RyanGutenkunst added the enhancement label Mar 14, 2023
@grahamgower

grahamgower commented Mar 15, 2023

There is some earlier discussion in #299.

As far as I can tell, the simplest way to do this currently is to copy/paste/modify a stdpopsim demographic model function to accept and propagate parameters. A single model instance is constructed by calling the model function with desired parameter values and then simulated directly using msprime (or perhaps with the stdpopsim Engine.simulate() wrapper). Obviously, this is unsatisfying.

With regard to making an API to do this with stdpopsim directly, here are some random thoughts:

  • The current demographic models are all static (except for the "generic" models, e.g. PiecewiseConstantSize). These would all need to be changed to be parameterised, and the functions would need to be exported in the API (rather than just having the static "registered" models available via the API).
  • The choice of how to parameterise a model can be important. E.g., using relative event times rather than absolute times.
  • The choice of which model elements are parameterised is somewhat arbitrary. Is the number of population size changes a free parameter? What about the number of migration pulses? What about the number of populations?
  • What about parameters that aren't strictly part of the demographic model? E.g. mutation rates, recombination rates, selection coefficients, time of selection onset?
  • If the catalog is converted to Demes YAML files (Convert demographic models to Demes YAML files #1233), the implementation details would be different. Here's a simple example of how I've been parameterising Demes models (for training machine learning models), and here's a more complex example. Note that we explicitly rejected the idea of including parameterisation within the Demes spec itself (to avoid an explosion of complexity in parser implementations).
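
To illustrate the point about relative versus absolute event times, here is a small sketch using only the Python standard library. The function name `sample_event_times` and the uniform(0.1, 0.9) prior on each fraction are assumptions made for illustration, not part of stdpopsim or Demes. Each event time is drawn as a fraction of the next-older event time, so every draw respects the ordering of events; independently sampled absolute times would need rejection or sorting to guarantee that.

```python
import random


def sample_event_times(rng, n_events, max_time):
    """Sample strictly ordered absolute event times from relative fractions.

    Each event time is a random fraction of the next-older event's time,
    so the returned list is strictly decreasing (oldest event first).
    """
    times = []
    older = max_time
    for _ in range(n_events):
        frac = rng.uniform(0.1, 0.9)  # assumed prior on the relative time
        older = frac * older
        times.append(older)
    return times


times = sample_event_times(random.Random(0), n_events=3, max_time=10_000)
```

The resulting absolute times can then be passed to whatever function builds the model, for example as split or size-change times.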

@petrelharp

petrelharp commented Mar 15, 2023

Hm - what do you mean that they are static, @grahamgower? Here is a simple example:

```python
import stdpopsim

species = stdpopsim.get_species("HomSap")
model = species.get_demographic_model("OutOfAfrica_3G09")
model.model.populations[0].growth_rate
# 0.0
model.model.populations[0].growth_rate = 1.0
model.model.populations[0].growth_rate
# 1.0
model.generation_time
# 25
model.generation_time = 30
model.generation_time
# 30
```

... so, I think this should work? It is not ideal as an API, since these assignments mutate the object held by stdpopsim itself: after the above, a fresh lookup returns the modified value:

```python
model = species.get_demographic_model("OutOfAfrica_3G09")
model.generation_time
# 30
```

(Perhaps get_demographic_model() should return a copy?)
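
A sketch of what returning a copy would change, using stand-in objects only. `ToyModel`, `_CATALOG`, `get_model`, and `get_model_copy` are hypothetical names and not stdpopsim API; the point is just the difference between handing out the shared object and handing out a deep copy.

```python
import copy


class ToyModel:
    """Stand-in for a catalog model object; not a stdpopsim class."""

    def __init__(self, generation_time):
        self.generation_time = generation_time


# Stand-in for a catalog that stores one object per model id.
_CATALOG = {"toy": ToyModel(generation_time=25)}


def get_model(model_id):
    """Return the shared catalog object (the behaviour described above)."""
    return _CATALOG[model_id]


def get_model_copy(model_id):
    """Return an independent deep copy of the catalog object."""
    return copy.deepcopy(_CATALOG[model_id])


private = get_model_copy("toy")
private.generation_time = 30
assert get_model("toy").generation_time == 25  # catalog entry untouched

shared = get_model("toy")
shared.generation_time = 30
assert get_model("toy").generation_time == 30  # change leaks into later lookups
```

Until something like this exists in the library, wrapping the returned model in `copy.deepcopy()` on the caller's side might serve as a workaround, though whether a real stdpopsim model object deep-copies cleanly is an assumption not checked here.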

However, I think the YAML proposal (see #1256) is a much better way to go.

@grahamgower

Ah, ok. I guess parameters can be modified in this way, including modifying events in model.model.events, but these are definitely implementation details and this seems like an easy way to make mistakes.
