Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spatial: Address SBML Editor comments #381

Closed
29 tasks done
luciansmith opened this issue Jul 22, 2022 · 23 comments
Closed
29 tasks done

Spatial: Address SBML Editor comments #381

luciansmith opened this issue Jul 22, 2022 · 23 comments
Assignees

Comments

@luciansmith
Copy link
Member

luciansmith commented Jul 22, 2022

  • SampledFields should probably be introduced before the GeometryDefinitions, i.e. move the list up in Geometry and define the SampledFields before the GeometryDefinitions. Especially the SampledFieldGeometry do not make any sense before the SampledFields when reading the specification.
  • p5L10 cite the latest SBML publication in addition, i.e., Keating2020 in addition to Hucka2003. The Keating paper presents SBML level 3 which is the basis for spatial
  • p5L17 “store or exchange these geometries” -> “store or exchange geometries” (unclear what these refers to exactly; sentence reads clearer without these)
  • p5l25, reference the latest biomodel publication Rahuman2020 (see https://www.ebi.ac.uk/biomodels/citation)
  • p5L26 “a open access” -> “an open access”
  • p5L41 “Additional considerations when considering”; A lot of considering in this sentence, reformulate.
  • p6L3 the URL https://sbml.svn.sf.net/svnroot/sbml/trunk/specifications/sbml-level-3/version-1/spatial/proposal is incorrect. Please reference the github specifications repository. Also update the webpage to reference the latest document: https://sbml.org/documents/specifications/level-3/version-1/spatial/
  • p7L4 “compartments with duplicate species and reactions” -> “compartments with duplicate species and reaction information”. Technically by localizing a species in a different compartment it is not a “duplicate/copy” any more, but a new species with a different localization. It just shares most information with the original.
  • p7L5 “This approach hard-codes the numerical methods” -> “This approach hard-codes the numerical methods and discretization schema”
  • p9L39ff Remove the following paragraph “Other CoordinateKind types are held in reserve for future versions of this specification, and may include “spheri- calRadius”, “sphericalAzimuth”, “sphericalElevation”, “cylindricalRadius”, “cylindricalAzimuth”, “cy- lindricalHeight”, “polarRadius”, and “polarAzimuth”.” Not sure this helps here or adds additional confusion. I would remove this to not overload the reader with information.
  • p10L25ff “ Other GeometryKind types are held in reserve for future versions of this specification, and may include “cylindrical”, “spherical”, and “polar”. Attributes of type GeometryKind cannot take on any other values. The meaning of these values is discussed in the context of the Geometry class’s definition in Section 3.15 on page 19.” Remove this, see comment above.
  • p11L9; color is off in ““union”, “intersection”, and “difference”.”
  • p12L20f I don’t understand the second part of the following sentence “The mathematical value of a CompartmentMapping is its unitSize attribute, and can be bound to a Parameter by using a SpatialSymbolReference.” Probably this needs clarification. Not sure how this can be bound to a parameter if the unitSize is a double.
  • p13L12 “If connected to a Parameter via a SpatialSymbolReference, an InitialAssignment may override the value of the unitSize attribute.” same as comment above. I don’t understand where this mapping should happen? The UnitSize is a double and cannot be a reference ?!
  • p13L32 The second part of the following sentence is unclear: “the Species is spatially distributed in a possibly nonhomogeneous manner within the Domains of the same type as the mapped DomainType.” What is the mapped DomainType of the species? Is this the mapped domainType of the compartment in which the species is located? If yes this should be clearly stated.
  • p15L4ff The description in 3.10.2 applies for 3D geometries; probably not true for 2D and 1D geometries. There is probably some additional clarification required.
  • p15L17 typo: “no other diffusuion coefficients” -> “diffusion”
  • p23L38 typo “The domainTpe attribute” -> “domainType”
  • p29 UML diagram Figure 13; “analyticVolume” -> “sampledVolume”
  • p39L14 “It is suggested, but not required, that if the data is uncompressed, that the grouped points be separated from each other with the use of a semicolon.”; remove the use of semicolon completely. This adds additional confusion and possibilities for errors.
  • p42L2 typo “The tokengeometryDefinition attribute”
  • p43L20ff “If the Geometry defines a 10 cm by 10 cm square, and a SampledField is a 10x5 array, the “[0,0]” entry in the array will correspond to the point “0 cm, 0 cm” in the Geometry, and the “[10,5]” entry in the array will correspond to the point “10 cm, 10 cm” in the Geometry.” => The indices are incorrect. If this is a 0-index array the points are [0,0] and [9,4], not [10,5] in a 10x5 array.
  • p44L6 “A value of “nearestNeighbor” means that the nearest lattice point value is to be returned.” => nearest makes only sense if a norm is defined (this should be most likely L2 Norm in cartesian coordinates). The norm should be stated.
  • p44 Section 3.50.2 I can imagine that NaN values appear in the arrays, e.g. if these come from image analysis (above/below threshold values). How should the interpolation deal with such cases? Are no NaN values allowed?
  • p44 Section 3.50.5 The samplesLength should always be the array length. It should not change meaning on compression. Same for pointIndexLength in section 3.46.5
  • p44L41 “It is suggested, but not required, that if the data is uncompressed, that the grouped points be separated from each other with the use of a semicolon. If the data is compressed, a semicolon is…”; I would just get rid of the semicolon. This adds additional confusion and possibilities for problems between uncompressed and compressed data.
  • p45L33, I think the samplesLength is incorrect; 515923=69207, but the example states 1255; Same error on p30L32
  • p46 Examples None of the links to the examples is working. These are broken with the website migration, e.g., http://sbml.org/examples/spatial/version-1/analytic_3d.xml
  • p48 L10 “None of these objects are defined to be in the SId namespace of the Model”; The same is true with the UnitSId namespace. It would be a good idea to include this in the validation rule.
@luciansmith luciansmith self-assigned this Jul 22, 2022
luciansmith added a commit that referenced this issue Jul 23, 2022
#381
* Re-order sections
* Cite 2020 SBML paper
* a open -> an open
luciansmith added a commit that referenced this issue Jul 23, 2022
@luciansmith
Copy link
Member Author

Note: barring any contrary feedback from the group, my plan is to address all of these points in the next draft.

@luciansmith
Copy link
Member Author

One objection from me:

Section 3.50.5 The samplesLength should always be the array length. It should not change meaning on compression. Same for pointIndexLength in section 3.46.5

I don't think we should make this change for a few reasons:

  • I think the array length is kind of pointless to begin with, but @fbergmann wanted it.
  • Frank wanted it to change the way it does, so OK.
  • Changing it now would break existing code.
  • For any change that would break something, it should have a compelling use case to change it to, but nobody is actually asking for this.

luciansmith added a commit that referenced this issue Jul 25, 2022
@luciansmith
Copy link
Member Author

@matthiaskoenig : You wrote:

p12L20f I don’t understand the second part of the following sentence “The mathematical value of a CompartmentMapping is its unitSize attribute, and can be bound to a Parameter by using a SpatialSymbolReference.” Probably this needs clarification. Not sure how this can be bound to a parameter if the unitSize is a double.

This is the same language that's used for all Spatial elements with mathematical meaning. I guess the issue was the it seemed like the SpatialSymbolReference was being bound to an attribute? What about this:

"The mathematical value of a CompartmentMapping is its unitSize attribute, and can be bound to a Parameter by using a SpatialSymbolReference to the CompartmentMapping id."

This is the general 'spatial ids with mathematical meaning can be connected to core math/elements through the SpatialSymbolReference construct'.

Is this reasonable, or is there a better way to phrase it?

@luciansmith
Copy link
Member Author

@fbergmann : I think this is correct, but can you confirm? This is the 'isSpatial' attribute of a (core) Species.

"The isSpatial attribute is of data type boolean. If it is set to true, the Species is spatially distributed in a possibly nonhomogeneous manner within the Domains of the DomainType of the Compartment in which the Species resides."

@luciansmith
Copy link
Member Author

luciansmith commented Jul 25, 2022

@matthiaskoenig : You wrote:

p39L14 “It is suggested, but not required, that if the data is uncompressed, that the grouped points be separated from each other with the use of a semicolon.”; remove the use of semicolon completely. This adds additional confusion and possibilities for errors.

Dropping allowing semicolons entirely is not really reasonable at this point, and I can confirm that they are invaluable when visually inspecting these lists. They just act like whitespace; there's no real way that they can cause errors.

I reduced the suggestion intensity, though, replacing the above with the statement:

"A semicolon may be used in uncompressed data to visually distinguish grouped values."

Will that work?

@luciansmith
Copy link
Member Author

@jcschaff @fbergmann and anyone else who would know: Diffusion coefficients are defined in terms of 3D geometries, with the following types:

  • anisotropic: only one dimension
  • tensor: two dimensions
  • isotropic: all dimensions.

For 2D geometries, 'tensor' and 'isotropic' would presumably be the same. Do we want to require isotropic to be used instead of tensor? Or allow both?

For 1D geometries, presumably 'tensor' is illegal, and isotropic and anisotropic are identical. Do we require one over the other? Or allow both?

luciansmith added a commit that referenced this issue Jul 25, 2022
@fbergmann
Copy link
Member

  • I think the array length is kind of pointless to begin with, but @fbergmann wanted it.

it was needed as long as we had compression in. If we still have compression in, i think we need it.

@fbergmann
Copy link
Member

@fbergmann : I think this is correct, but can you confirm? This is the 'isSpatial' attribute of a (core) Species.

yes, that works for me

@luciansmith
Copy link
Member Author

@fbergmann But the array length changes with whether or not compression is on. Right? Or is Matthias's suggestion what was supposed to happen in the first place, and the arrayLength should be the same whether or not compression is on?

@luciansmith
Copy link
Member Author

@matthiaskoenig : You wrote:

p44L6 “A value of “nearestNeighbor” means that the nearest lattice point value is to be returned.” => nearest makes only sense if a norm is defined (this should be most likely L2 Norm in cartesian coordinates). The norm should be stated.

This is beyond my knowledge, so I can't write a fix, but if you want to submit a fix, I'm happy to put it in.

However, I don't know what's confusing about 'return the nearest value'. Surely that's about as straightforward as you can get, barring points that are precisely equidistant between lattice points, and resolving that can be arbitrary? Or is that what the 'norm' covers?

@fbergmann
Copy link
Member

@fbergmann But the array length changes with whether or not compression is on. Right? Or is Matthias's suggestion what was supposed to happen in the first place, and the arrayLength should be the same whether or not compression is on?

ok, well in my implementations i used the length to pre-allocate the arrays and fill them after, which was more efficient. After reading, if the array was compressed, i'd uncompress it. So yes, changing it now would break things.

@luciansmith
Copy link
Member Author

@fbergmann Got it! @matthiaskoenig, does that make sense?

luciansmith added a commit that referenced this issue Jul 26, 2022
luciansmith added a commit that referenced this issue Jul 26, 2022
As per #381:  lead with what this *is* instead of what it isn't, and then also mention the UnitSId namespace.
@luciansmith
Copy link
Member Author

@fbergmann @jcschaff @matthiaskoenig: My inclination with Matthias's NaN question is to ignore it, since we ignore NaNs everywhere else in all other SBML specifications. It's a possible double value, it will mess with your math, if you don't want it to mess with your math, then don't use them.

Is there anything about Spatial in particular that would be worth calling out in the spec? Do either of your tools deal with NaNs in a special way?

@fbergmann
Copy link
Member

Is there anything about Spatial in particular that would be worth calling out in the spec? Do either of your tools deal with NaNs in a special way?

I agree with lucian here, computations with NaNs will not yield useful results in spatial simulations. But that is no different from analysis of other parts of SBML.

@luciansmith
Copy link
Member Author

@fbergmann @jcschaff @matthiaskoenig: I've done something for everything I know about, leaving only the following issues:

@matthiaskoenig: Do you have any issues with the changes as discussed above? (including 'nothing in the spec about NaNs, since none of the other specs talk about NaNs either')

@matthiaskoenig: I need to know what you would like for the 'norm' stuff, about nearest neighbors. New wording? More questions?

@fbergmann @jcschaff: I still need something about anisotropic/tensor/isotropic for 2D/1D geometries.

Thanks, everyone!

@matthiaskoenig
Copy link

@luciansmith Thanks for addressing the issues.

  • I am good with nothing about NaNs
  • I am good with leaving semicolons in
  • I am good with the length stuff due to compression ;)

I think the norm issue just requires a half sentence. "The nearest neighbor are calculated using L2 norm". This is what most people are doing and is a reasonable default. It should just be stated explicitly somewhere. The reason why this is important is that if you use a different norm you will possibly find different nearest neighbors and your interpolation will look different. As a consequence the results are not reproducible anymore.

luciansmith added a commit that referenced this issue Aug 8, 2022
Per #381:  Say that we calculate the nearest neighbor using the L2 norm:

"A value of “nearestNeighbor” means that the nearest lattice point value is to be returned, calculated using the L2 norm (or Euclidean distance)."
@luciansmith
Copy link
Member Author

OK, norm bit added with:

"A value of “nearestNeighbor” means that the nearest lattice point value is to be returned, calculated using the L2 norm (or Euclidean distance)."

This only leaves the 'what does isotropic mean in 2D' issue. I'm certainly happy to make up something, but I feel like we should hear from someone who's actually done some modeling with this ;-) I'll email the list!

luciansmith added a commit that referenced this issue Aug 15, 2022
@luciansmith
Copy link
Member Author

OK! A bit of new text for the isotropic/etc. thing:

For a two-dimensional \Geometry, the following are allowed:
\begin{itemize}
  \item A single isotropic diffusion coefficient.
  \item A single anisotropic diffusion coefficient, defined for the two axes (equivalent to a single isotropic diffusion coefficient).
  \item Up to two tensor diffusion coefficients, one per axis.
\end{itemize}

For a one-dimensional \Geometry, the following are allowed:
\begin{itemize}
  \item A single isotropic diffusion coefficient.
  \item A single tensor diffusion coefficient, defined for the single axis (equivalent to a single isotropic diffusion coefficient).
\end{itemize}

Anisotropic diffusion coefficients may not be defined for one-dimensional geometries, as they apply in two dimensions.

@luciansmith
Copy link
Member Author

With that, that's all issues addressed. Thanks, @matthiaskoenig ! I'll give this a week or so to gather any other comments, then re-submit to the SBML Editors.

@fbergmann
Copy link
Member

OK! A bit of new text for the isotropic/etc. thing:

I think you have tensor an anisotropic mixed up .. on 2d you have have 2 anisotropic ones, and 1 tensor ...

and on 1d you can have 1 anisotropic one but no tensor.

I also tried to write this up here https://sourceforge.net/p/sbml/mailman/message/37691260/

@matthiaskoenig
Copy link

Great.

The validation rule spatial-23404 must probably include parameters.
I.e. instead of

The value of the attribute spatial:variable of a DiffusionCoefficient object must be the

identifier of an existing Species object defined in the enclosing Model object.
it should be

The value of the attribute spatial:variable of a DiffusionCoefficient object must be the
identifier of an existing Species or Parameter object defined in the enclosing Model object.

Same for rule spatial-23457

Any Species may only have a single ...
=>
Any Species or Parameter may only have a single ...

and rules spatial-23504. Same in spatial-23551

In rule spatial-23551 species:variable -> spatial:variable

@luciansmith
Copy link
Member Author

Oh, looks like I got it wrong for the 3D explanation here, too. Will fix!

And you're right about the validation rules, too--the text in 3.10.1 says 'species or parameter' so the rules should, too.

@luciansmith
Copy link
Member Author

Fixed in the spec with 8b316fd and in libsbml with sbmlteam/libsbml#267

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants