OAI Identifier uses outdated validation for domain name #126
Comments
(I am cc-ing @zimeon and @phonedude) Well, indeed, it looks like something went a tad wrong with the definition of oai-identifier and, AFAIK, this is the first time the problem was brought up. My interpretation: The constructs related to domain names in the oai-identifier syntax definition of Section 2.1 of the OAI Identifier Format guideline build upon Section 3.2.2 of RFC2396, specifically these construction rules:
in which we can see that digits are allowed in every position of a domain name component, except in the toplabel (TLD, e.g. org, com), whose first position must be a letter. But by using the following constructs:
the OAI Identifier Format guideline forbids digits in the first position of all components of a domain name. I can't imagine this was the intention, and, if it was, I do not recall what the motivation could have been. Possible solutions:
Yes, I have no memory of this being by design. So I can only assume it was an oversight.
Looks like an error to me. I also have no memory of an intention here. I agree that quietly adjusting the schema is relatively easy. However, this change would make it not match the guideline, which is weird, so maybe we should edit the 20-year-old guideline too?
I believe this reflects the evolution of the Internet. RFC2396, published in August 1998, has since been updated by later releases such as RFC3986 (January 2005); see its Section 3.2.
@zimeon, I think you’re right that the guideline should be updated too. I convinced myself when noticing that there’s a Document History section in which details of changes can be conveyed. I’m thinking that the only changes needed are in the construction rules (and document version, of course). I would prefer sticking to the reference to RFC2396 instead of more recent RFCs re URI syntax, because, after all, 2396 was the law of the land when oai-identifier was spec-ed and we’re merely correcting the spec, not creating a new one.
This is done, thanks @ACz-UniBi!
OAI identifiers are usually generated using the system's domain name as the repository identifier. This leads, for instance, to identifiers like oai:dspace-cris.4science.cloud:e9ed438e-c7f7-4a18-95e5-3f635ea65fee
Unfortunately, the oai-identifier.xsd schema (http://www.openarchives.org/OAI/2.0/oai-identifier.xsd), cached by the guidelines here
https://github.com/openaire/guidelines-cris-managers/blob/master/schemas/cached/oai-identifier.xsd#L36-L42
does not allow a digit as the first character of a domain name component. This makes the identifier above invalid, even though the domain dspace-cris.4science.cloud is perfectly valid.
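To make the mismatch concrete, here is a minimal Python sketch that checks the repository identifier of the example above against two patterns. The first follows the hostname rules of RFC 2396 Section 3.2.2 (digits allowed at the start of any label except the toplabel). The second is a hypothetical letter-first pattern that only approximates the behaviour described in this issue; it is not copied from oai-identifier.xsd. All function and variable names are illustrative assumptions.

```python
import re

# RFC 2396, Section 3.2.2:
#   domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
#   toplabel    = alpha    | alpha    *( alphanum | "-" ) alphanum
#   hostname    = *( domainlabel "." ) toplabel [ "." ]
DOMAINLABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
RFC2396_HOSTNAME = re.compile(rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?")

# Hypothetical stricter rule (an assumption, NOT the actual schema pattern):
# every label, not only the toplabel, must start with a letter.
LETTER_FIRST_LABEL = r"[A-Za-z][A-Za-z0-9-]*"
LETTER_FIRST_HOSTNAME = re.compile(
    rf"{LETTER_FIRST_LABEL}(?:\.{LETTER_FIRST_LABEL})+"
)


def is_rfc2396_hostname(name: str) -> bool:
    """True if the whole string matches the RFC 2396 hostname rule."""
    return RFC2396_HOSTNAME.fullmatch(name) is not None


def is_letter_first_hostname(name: str) -> bool:
    """True if every dot-separated label starts with a letter."""
    return LETTER_FIRST_HOSTNAME.fullmatch(name) is not None


def check(oai_identifier: str) -> tuple[bool, bool]:
    """Return (valid per RFC 2396, valid per letter-first rule)
    for the repository part of an 'oai:<repository>:<local-id>' string."""
    _scheme, repository, _local_id = oai_identifier.split(":", 2)
    return is_rfc2396_hostname(repository), is_letter_first_hostname(repository)


if __name__ == "__main__":
    example = "oai:dspace-cris.4science.cloud:e9ed438e-c7f7-4a18-95e5-3f635ea65fee"
    print(check(example))  # (True, False)
```

The label 4science is what trips the letter-first pattern, while the RFC 2396 rule accepts it because only the toplabel (cloud) has to start with a letter.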