Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement id parsing in conjuction with $ref fixes #171

Merged
merged 1 commit into from
Dec 30, 2017

Conversation

johandorland
Copy link
Collaborator

This commit adds support for the id keyword and rewrites part of the code that deals with schema references. In conjuction with xeipuuv/gojsonreference#8 this fixes issues #134, #143, #168, #170, which were all related to schema references in one way or another.

@sqs
Copy link

sqs commented Dec 22, 2017

Thank you! I am not the maintainer, but I tried this out and it works for me.

It solves the problem I encountered where I wanted to avoid accessing the network to resolve referenced JSON schemas and instead provide them as strings. There may be an even better way to do it, but with this PR, I was able to use code like (simplifed):

type jsonLoader struct {
	gojsonschema.JSONLoader
}

func (l jsonLoader) LoaderFactory() gojsonschema.JSONLoaderFactory {
	return &jsonLoaderFactory{}
}

type jsonLoaderFactory struct {
	schemas map[string]string
}

func (f jsonLoaderFactory) New(source string) gojsonschema.JSONLoader {
	switch source {
	case "https://sourcegraph.com/v1/settings.schema.json":
		return gojsonschema.NewStringLoader(schema.SettingsSchemaJSON)
	case "https://sourcegraph.com/v1/site.schema.json":
		return gojsonschema.NewStringLoader(schema.SiteSchemaJSON)
	}
	return nil
}

And then use it in Validate like so:

	return gojsonschema.Validate(
		jsonLoader{gojsonschema.NewBytesLoader(schema)},
		gojsonschema.NewBytesLoader(input),
	)

The code above works with this PR and not on master.

@xeipuuv xeipuuv merged commit b7d5174 into xeipuuv:master Dec 30, 2017
mtrmac added a commit to mtrmac/image-spec that referenced this pull request Jan 10, 2018
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema-json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@johandorland johandorland deleted the idrefparsing branch January 16, 2018 20:50
mtrmac added a commit to mtrmac/image-spec that referenced this pull request Feb 6, 2018
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema-json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/image-spec that referenced this pull request Feb 6, 2018
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added a commit to mtrmac/image-spec that referenced this pull request Feb 6, 2018
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
dattgoswami9lk5g added a commit to dattgoswami9lk5g/bremlinr that referenced this pull request Jun 6, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
7c00d pushed a commit to 7c00d/J1nHyeockKim that referenced this pull request Jun 6, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
7c00d added a commit to 7c00d/J1nHyeockKim that referenced this pull request Jun 6, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
laventuraw added a commit to laventuraw/Kihara-tony0 that referenced this pull request Jun 6, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
sudo-bmitch pushed a commit to sudo-bmitch/image-spec that referenced this pull request Sep 25, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
tomalopbsr0tt added a commit to tomalopbsr0tt/fabiojosej that referenced this pull request Oct 6, 2022
After updating gojsonschema to include
xeipuuv/gojsonschema#171 , tests fail with
> unable to validate: Could not read schema from HTTP, response status is 404 Not Found

Before that gojsonschema change, "$ref" links were interpreted by taking
the current schema source file's URI as a base, and treating "$ref"
as relative to this.

For example, starting with the [file://]/image-manifest-schema.json
URI, as used by Validator.Validate (based on the "specs" map), the
>  "$ref": "content-descriptor.json"
reference used to evaluate to file:///content-descriptor.json.
gojsonschema.jsonReferenceLoader would then load these file:///*.json
URIs via _escFS.

After the gojsonschema change, "$ref" links are evaluated relative to
a URI base specified by the "id" attribute inside the schema source,
regardless of the "external" URI passed to the gojsonschema.JSONLoader.

This is consistent with
http://json-schema.org/latest/json-schema-core.html#rfc.section.8 and
http://json-schema.org/latest/json-schema-core.html#rfc.section.9.2
(apart from the "id" vs. "$id" attribute name).

In the same example, [file://]/image-manifest-schema.json URI contains
>  "id": "https://opencontainers.org/schema/image/manifest",
so the same
>  "$ref": "content-descriptor.json"
now evaluates to
"https://opencontainers.org/schema/image/content-descriptor.json",
which is not found by gojsonschema.jsonReferenceLoader (it uses
_escFS only for file:/// URIs), resulting in the 404 quoted above.

This is a minimal fix, making the schema files available to
gojsonschema at the https:// URIs, while continuing to read them from
_escFS.

Because gojsonschema.jsonReferenceLoader can only use the provided fs
for file:/// URIs, we are forced to implement our own
gojsonschema.JSONLoaderFactory and gojsonschema.JSONLoader; something
like this might be more generally useful and should therefore instead
be provided by the gojsonschema library.

This particular JSONLoader{Factory,} implementation, though, is
image-spec specific because it locally works around various
inconsistencies in the image-spec JSON schemas, and thus is not suitable
for gojsonschema as is.

Namely, the specs/*.json schema files use URIs with two URI path prefixes,
https://opencontainers.org/schema/{,image/}
in the top-level "id" attributes, and the nested "id" attributes along
with "$ref" references use _several more_ URI path prefixes, e.g.
>       "id": "https://opencontainers.org/schema/image/manifest/annotations",
>      "$ref": "defs-descriptor.json#/definitions/annotations"
in image-manifest-schema.json specifies the
https://opencontainers.org/schema/image/manifest/defs-descriptor.json
URI.

In fact, defs-descriptor.json references use all of the following URIs:
> https://opencontainers.org/schema/defs-descriptor.json
> https://opencontainers.org/schema/image/defs-descriptor.json
> https://opencontainers.org/schema/image/descriptor/defs-descriptor.json
> https://opencontainers.org/schema/image/index/defs-descriptor.json
> https://opencontainers.org/schema/image/manifest/defs-descriptor.json

So, this commit introduces a loader which preserves the original _escFS
layout by recognizing and stripping all of these prefixes, and using
the same /*.json paths for _escFS lookups as before; this is clearly
unsuitable for gojsonschema inclusion.

Finally, the reason this commit uses such a fairly hacky loader is that merely
changing the _escFS structure is still not sufficient to get consistent
schema: the schema/*.json paths in this repository, and the "$ref" values,
do not match the "id" values inside the schemas at all.  E.g.
image-manifest-schema.json refers to
https://opencontainers.org/schema/image/manifest/content-descriptor.json ,
while content-descriptor.json identifies itself as
https://opencontainers.org/schema/descriptor , matching neither the path prefix
nor the file name.

Overall, it is completely unclear to me which of the URIs is the canonical URI
of the "content descriptor" schema, and the owner of the URI namespace
needs to decide on the canonical schema URIs.  Only afterwards can the
code be cleanly modified to match the specification; until then, this
commit at least keeps the tests passing, and the validator usable
by external callers who want to use the public
image-spec/schema.ValidateMediaType*.Validate() API.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants