
Document the fact that DH JSON is a bare list and not compatible with LinkML tools as is #390

Closed · turbomam opened this issue Apr 10, 2023 · 2 comments · Fixed by #399

turbomam (Contributor) commented Apr 10, 2023

DH is welcome to add the DH JSON -> LinkML JSON (and vice versa) converters that I wrote

see
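
Those converters aren't reproduced here, but the core of the DH-to-LinkML direction is just wrapping the bare row list under the schema's multivalued index slot. A minimal sketch (function and file names are illustrative, not the actual converter code; it assumes you already know which slot name the target container class uses, e.g. "entries" in the example further down this thread):

import json

def dh_to_linkml(dh_rows, index_slot):
    # Wrap DataHarmonizer's bare list of row objects under the container
    # class's multivalued index slot so LinkML tooling can validate it.
    return {index_slot: dh_rows}

def linkml_to_dh(container, index_slot):
    # Inverse direction: pull the bare row list back out of the container.
    return container[index_slot]

if __name__ == "__main__":
    with open("dh_export.json") as f:      # bare list exported by DH
        rows = json.load(f)
    with open("linkml_data.json", "w") as f:
        json.dump(dh_to_linkml(rows, "entries"), f, indent=2)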

ddooley (Collaborator) commented Apr 11, 2023

Looking back on this, I think DH should input/output LinkML (JSON-LD) native JSON directly via the browser, so we need to understand the JavaScript required to do so. The existing "File > Save as > .json" option could be renamed to "File > Save as > flat .json", and we could add a "File > Save as > LinkML .json" option for the pure version. This avoids having to use command-line Python tools as an intermediary step.

@pkalita-lbl for comment.

(The LinkML data inlining options will come into play here later when we add 1-many data relations.)

pkalita-lbl (Collaborator) commented
Let me see if I understand Mark's concern correctly. If I have a schema that implements the typical LinkML container object pattern:

id: http://example.org/test
name: test
imports:
  - linkml:types
prefixes:
  linkml: https://w3id.org/linkml/

slots:
  s1:
    range: string
  s2:
    range: string
  entries:
    range: Entry
    multivalued: true

classes:
  Entry:
    slots:
      - s1
      - s2
  EntrySet:
    tree_root: true
    slots:
      - entries

I could point DataHarmonizer to the Entry class and it would show me an interface with two columns (for s1 and s2). I could enter some data and then export that data to JSON through the interface. It would look something like:

[
  {
    "s1": "row 1 col 1",
    "s2": "row 1 col 2"
  },
  {
    "s1": "row 2 col 1",
    "s2": "row 2 col 2"
  }
]

The issue is that I can't validate that file as-is, either with linkml-validate or with a generic JSON Schema validator plus the JSON Schema derived from the LinkML schema. That's because LinkML doesn't really have a concept of an array at the root level, hence the container object pattern.
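
To make that concrete, here is a small sketch using the Python jsonschema package. The schema dict below is hand-written to approximate what would be derived for EntrySet; it is not the exact generator output:

from jsonschema import Draft7Validator

entry_set_schema = {
    "type": "object",
    "properties": {
        "entries": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "s1": {"type": "string"},
                    "s2": {"type": "string"},
                },
                "additionalProperties": False,
            },
        }
    },
    "additionalProperties": False,
}

bare_list = [{"s1": "row 1 col 1", "s2": "row 1 col 2"}]

# The exported bare list is rejected because the schema expects an object
# at the root, not an array.
errors = list(Draft7Validator(entry_set_schema).iter_errors(bare_list))
print(errors[0].message)   # "... is not of type 'object'"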

So what Mark is saying is that if DataHarmonizer could somehow produce JSON that instead looks like:

{
  "entries": [
    {
      "s1": "row 1 col 1",
      "s2": "row 1 col 2"
    },
    {
      "s1": "row 2 col 1",
      "s2": "row 2 col 2"
    }
  ]
}

Now we have an object at the root level. That object corresponds to the EntrySet class in the schema and could be validated as such.
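
With that shape, checking it from the command line would look roughly like this (file names are placeholders, and the flags assume a reasonably recent linkml release):

linkml-validate --schema test.yaml --target-class EntrySet data.json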

I don't have an exact proposal for how to resolve the situation, but it will probably involve a combination of logic to guess at the so-called container class and index slot (presumably via teaching DataHarmonizer to understand the tree_root metaslot), as well as ways to specify them manually (see also: https://linkml.io/linkml/data/csvs.html).
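
On the Python side that guess is easy to express with linkml-runtime's SchemaView; a sketch of the lookup logic (not a proposal for the JavaScript side, and the file name is a placeholder):

from linkml_runtime import SchemaView

def find_container(schema_path, row_class):
    # Guess the container class (the one flagged tree_root: true) and the
    # multivalued index slot whose range is the class shown in the grid.
    sv = SchemaView(schema_path)
    container = next((c for c in sv.all_classes().values() if c.tree_root), None)
    if container is None:
        return None, None
    for slot in sv.class_induced_slots(container.name):
        if slot.multivalued and slot.range == row_class:
            return container.name, slot.name
    return container.name, None

print(find_container("test.yaml", "Entry"))   # ('EntrySet', 'entries')

The manual-override path would simply bypass this lookup and take the container class and index slot from configuration instead.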
