Separate task preprocessing from simulation execution #399

Closed
20 tasks done
jonrkarr opened this issue Sep 12, 2021 · 4 comments

@jonrkarr
Member

jonrkarr commented Sep 12, 2021

  • Refactor simulators
    • AMICI
    • BioNetGen
    • BoolNet
    • CBMpy
    • COBRApy
    • COPASI
    • GillesPy2
    • GINsim
    • libSBMLSim
    • MASSpy
    • OpenCOR
    • pyNeuroML/NEURON/NetPyNE
    • PySCeS
    • RBApy
    • Smoldyn
    • tellurium
    • XPP
  • Update integrated BioSimulators pipenv and Docker image
  • Update pipenv for BioSimulations combine-service used for low-latency online simulation

Notes on limitations

  • preprocess_sed_task should be re-run if any of these conditions are met:
    • the model structure needs to be changed (e.g., additional species or reactions)
    • the simulation algorithm or its parameters are changed
    • additional attributes (parameters, initial conditions) need to be changed -- it's best to declare all attributes that might need to be changed in the initial call to preprocess_sed_task
    • additional variables need to be recorded -- it's best to declare all variables that might need to be recorded in the initial call to preprocess_sed_task
  • Because some simulators' in-memory representations of models diverge from the associated model languages, some changes that can be applied to model specifications cannot easily be applied to the in-memory representations
    • The SBML-fbc representation of FBA models diverges a little from how simulation tools represent models. In particular, SBML-fbc uses a small number of parameters to represent flux bounds. In contrast, simulation tools flatten these into separate parameters for the upper and lower bound of each reaction. The shared SBML-fbc parameters can be changed at the model specification (XML) level, but are difficult to change at the simulator level because simulators don't retain knowledge of them. Due to this divergence, we support two different mechanisms for changing FBA models:
      • exec_sed_task: supports model changes on the simulator representation of models. This should work well for Vivarium. Presently, this is limited to changing flux bounds.
      • Execution of SED-ML files and COMBINE archives: supports model changes on the XML representation of models. This supports the full set of possible changes: change attributes and add/remove/replace XML nodes
    • The Smoldyn software also diverges from Smoldyn simulation configurations. For example, the Smoldyn software does not retain information about parameter values.
      • As a result, parameters can only be edited during task preprocessing when simulation configuration files are read
      • In contrast, molecule counts can be set repeatedly as part of task execution
  • Some simulation tools don't represent or provide ways to set initial levels
  • For some simulation tools, repeated executions of exec_sed_task require re-parsing models
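
To make the FBA point above concrete, here is a minimal sketch of how flattening loses the link to the shared flux-bound parameters. Everything in it (flatten_bounds, the parameter ids, the reaction ids) is hypothetical and only illustrates the idea; it is not how COBRApy, CBMpy, or any BioSimulators package stores models.

```python
def flatten_bounds(shared_parameters, reactions):
    """Resolve shared bound parameters into per-reaction numeric bounds.

    Hypothetical helper: the result keeps only numbers, so it no longer
    knows which shared parameter each bound came from.
    """
    return {
        reaction_id: {
            "lower": shared_parameters[refs["lower"]],
            "upper": shared_parameters[refs["upper"]],
        }
        for reaction_id, refs in reactions.items()
    }


# Specification level: a few shared parameters referenced by many reactions.
shared_parameters = {"lb_default": -1000.0, "ub_default": 1000.0, "zero": 0.0}
reactions = {
    "R1": {"lower": "zero", "upper": "ub_default"},
    "R2": {"lower": "lb_default", "upper": "ub_default"},
}

# Simulator level: one pair of numeric bounds per reaction.
flat = flatten_bounds(shared_parameters, reactions)

# A simulator-level change touches one bound of one reaction.
flat["R1"]["upper"] = 10.0

# A specification-level change to a shared parameter is only visible
# after flattening again; the earlier flat copy is unaffected.
shared_parameters["ub_default"] = 500.0
reflattened = flatten_bounds(shared_parameters, reactions)
```

In this sketch, changing ub_default has no effect on flat, which is why the shared parameters are easier to change at the XML level than on the simulator representation.
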
@jonrkarr
Member Author

@eagmon, the progress on factoring out unnecessary computations for repeated execution is summarized above.

The preprocessed information is sufficient to change values of parameters and initial conditions. Presently, more substantial changes such as adding/removing/replacing species/reactions would require re-preprocessing models.
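
As a rough illustration of when preprocessed information can be reused, here is a sketch of the check a caller might make before repeating a run. TaskSignature and needs_repreprocessing are hypothetical names invented for this example; the simulator packages do not necessarily expose anything like them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSignature:
    """Hypothetical summary of what a preprocessing call was set up for."""
    model_structure: frozenset     # e.g., ids of species and reactions
    algorithm: tuple               # (algorithm id, sorted parameter items)
    changeable_targets: frozenset  # attributes declared as changeable
    recorded_variables: frozenset  # variables declared as outputs


def needs_repreprocessing(cached, requested):
    """Return the reasons why the cached preprocessing cannot be reused."""
    reasons = []
    if requested.model_structure != cached.model_structure:
        reasons.append("model structure changed")
    if requested.algorithm != cached.algorithm:
        reasons.append("algorithm or algorithm parameters changed")
    if not requested.changeable_targets <= cached.changeable_targets:
        reasons.append("additional attributes need to be changed")
    if not requested.recorded_variables <= cached.recorded_variables:
        reasons.append("additional variables need to be recorded")
    return reasons


cached = TaskSignature(
    model_structure=frozenset({"A", "B", "r1"}),
    algorithm=("algo-1", (("rel_tol", 1e-6),)),
    changeable_targets=frozenset({"k1", "A0"}),
    recorded_variables=frozenset({"A", "B"}),
)

# Changing only a declared parameter and recording a declared variable: reuse.
reuse = TaskSignature(
    cached.model_structure, cached.algorithm,
    frozenset({"k1"}), frozenset({"A"}),
)

# Adding a species and recording a new variable: preprocess again.
rebuild = TaskSignature(
    cached.model_structure | {"C"}, cached.algorithm,
    frozenset({"k1"}), frozenset({"A", "C"}),
)

reuse_reasons = needs_repreprocessing(cached, reuse)
rebuild_reasons = needs_repreprocessing(cached, rebuild)
```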

For SBML and CellML, this follows their SED-ML conventions of using XML XPaths to address model components. Once this refactoring is done, we can work on a second, simpler way of addressing model components by their SBML/CellML ids. At least to start, this would be restricted to changing values of parameters and initial conditions. Adding/removing/replacing components would only be supported at the XML level where there's already a convention for describing such changes.
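
For comparison, here is a small sketch of the two addressing styles using only Python's standard library. The toy model, set_by_xpath, and set_by_id are made up for illustration. SED-ML targets are normally absolute XPaths ending in an attribute step; ElementTree supports only a subset of XPath, so this sketch uses a path relative to the root element and passes the attribute name separately.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Toy SBML document used only for this example.
MODEL_XML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfParameters>
      <parameter id="k1" value="0.1" constant="true"/>
      <parameter id="k2" value="2.0" constant="true"/>
    </listOfParameters>
  </model>
</sbml>
"""


def set_by_xpath(root, element_path, attribute, value):
    """Set an attribute on the single element matched by a relative path."""
    matches = root.findall(element_path, SBML_NS)
    if len(matches) != 1:
        raise ValueError(f"expected 1 match for {element_path!r}, got {len(matches)}")
    matches[0].set(attribute, str(value))


def set_by_id(root, component_id, attribute, value):
    """Set an attribute on the single element whose id equals component_id."""
    matches = [el for el in root.iter() if el.get("id") == component_id]
    if len(matches) != 1:
        raise ValueError(f"expected 1 match for id {component_id!r}, got {len(matches)}")
    matches[0].set(attribute, str(value))


root = ET.fromstring(MODEL_XML.strip())

# XPath-style addressing, in the spirit of SED-ML targets.
set_by_xpath(
    root,
    "sbml:model/sbml:listOfParameters/sbml:parameter[@id='k1']",
    "value",
    0.5,
)

# Id-based addressing: shorter, but only suited to changing attribute values.
set_by_id(root, "k2", "value", 3.0)

values = {
    p.get("id"): p.get("value")
    for p in root.iter("{http://www.sbml.org/sbml/level3/version1/core}parameter")
}
```

The id-based form is easier to write, while the XPath form can also locate where to add, remove, or replace nodes, which matches the split described above.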

@eagmon
Contributor

eagmon commented Sep 12, 2021

@jonrkarr -- Looks like good progress. I know from our work on biosimulators-tellurium that we used exec_sed_task and preprocess_sed_task methods -- are these same methods available for all simulators with ✅ ? I know biosimulators-cobrapy did not previously have those module attributes.

@jonrkarr
Member Author

Until recently, each simulator API had a single method, exec_sed_task. Each API now has two methods:

  • exec_sed_task
  • preprocess_sed_task

preprocess_sed_task returns a data structure which essentially represents parsed models and a map between our standard representation of models and simulations (SED-ML/KiSAO) and each simulator's internal representation. This data structure is unique to each simulation tool.

exec_sed_task has an optional argument preprocessed_task for this preprocessed information. If the argument isn't provided, then exec_sed_task has to build this map. Providing this argument avoids any computation common to multiple repeated executions of a single model (typically with different parameters and/or initial conditions).

I've implemented and pushed half of the preprocess_sed_task methods. The others are still just skeletons. I'm hoping to finish that in the next few days.

For constraint-based simulations, there's opportunity to go further to hot start optimizations with some solvers such as CPLEX and Gurobi. This would require changes to the FBA packages, COBRApy and CBMpy.

@jonrkarr jonrkarr self-assigned this Sep 18, 2021
@jonrkarr
Member Author

The updated Docker image is released. The entrypoint now opens an IPython shell in the Pipenv environment with all of the simulation tools.

docker pull ghcr.io/biosimulators/biosimulators:0.0.2
docker run -it --rm ghcr.io/biosimulators/biosimulators:0.0.2

The only two standardized tools that aren't included are:

  • OpenCOR: Installation is complicated and requires Python 3.7. This group is working toward a more composable simulation library with the core simulation functionality separated from the GUI.
  • VCell: No Python API available. The developers are thinking about creating a Python API. They have an old API that could be a good starting point.

The updated simulation tools are deployed on the main RunBioSimulations simulation service. They will be updated soon on the low-latency/low-performance service.

More documentation (e.g., Jupyter notebook) is still coming.


2 participants