Missing ORCID ID for Author, with 2 Different Surname Spellings #118

Open
anthonymstohr opened this issue Oct 2, 2018 · 7 comments
Comments

@anthonymstohr

anthonymstohr commented Oct 2, 2018

Issue:

The lead author for this ChemKED file has her name spelled differently on her paper and her ORCID record (Mariam J. Al Rashidi vs. Mariam El Rachidi). Neither spelling works when trying to validate the file.

Input:

cyclopentane_rashidi_RCM_phi_0.5.zip

Expected behavior:

The ChemKED file should validate successfully.

Actual behavior, including any error messages:

pyked.ChemKED("/home/astohr/Code/ChemKED-database/cyclopentane/cyclopentane_rashidi_RCM_phi_0.5.yaml")
/home/astohr/anaconda2/envs/pyteck/lib/python3.6/site-packages/pyked/validation.py:412: UserWarning: ORCID 0000-0001-7392-6777 missing for Mariam J. Al Rashidi
  warn('ORCID ' + orcid + ' missing for ' + author_match['name'])
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-2-ffdabfc2133a> in <module>()
----> 1 pyked.ChemKED("/home/astohr/Code/ChemKED-database/cyclopentane/cyclopentane_rashidi_RCM_phi_0.5.yaml")

This result only appears if using the author's name as written in the paper. Attempting to use the author's name as listed on her ORCID record, or including her ORCID ID in the ChemKED file, returns the same error as above, but without the user warning.

PyKED/ChemKED version: 4.1, Python version 3.6.6

@bryanwweber
Member

Oh brother, names are hard. Thank you for reporting this. We'll have to think about how to resolve this. For now, you can disable validation of the file by passing skip_validation=True to the ChemKED constructor when you create the ChemKED instance.
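
For reference, the workaround described above might look like the sketch below. The `skip_validation=True` keyword comes from the comment above; the wrapper name `load_without_validation` and the file path in the usage line are made up for illustration, and the import is deferred so the snippet loads even where PyKED isn't installed.

```python
def load_without_validation(path):
    """Load a ChemKED file without running PyKED's validation step."""
    import pyked  # deferred import: only needed when the function is called

    # skip_validation=True turns off validation for this file, which
    # includes the author name/ORCID check that fails in this issue.
    return pyked.ChemKED(path, skip_validation=True)


# Hypothetical usage; adjust the path to your local copy of the database:
# ck = load_without_validation("cyclopentane_rashidi_RCM_phi_0.5.yaml")
```

Keep in mind that this presumably skips every validation check, not just the author name one, so other problems in the file may go unnoticed.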

Side note: Can you copy-paste from the terminal output rather than putting the image in the issue? Thanks!

@kyleniemeyer
Member

If I understand correctly, the author's name differs between the ORCID record and the paper? That's a challenge, and a mix-up a bit beyond our control.

If I recall correctly, depending on whether an author ORCID is provided, there are two behaviors:

  • Without an author ORCID, PyKED compares the name to that in the CrossRef record for the article.
  • With an author ORCID, PyKED checks that the author is listed for the article, and compares the name to that in the ORCID record.
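
The two behaviors above can be sketched roughly as follows. This is only an illustration of the logic as described, not PyKED's actual implementation: the function `check_author`, its parameters, and the problem messages are all hypothetical, and "listed for the article" is assumed here to mean that the ORCID appears among the article's CrossRef authors.

```python
def check_author(author, article_names, article_orcids, orcid_record_names):
    """Return a list of problems for one author entry (empty list means OK).

    author             -- dict with 'name' and, optionally, 'ORCID'
    article_names      -- author names from the article's CrossRef record
    article_orcids     -- ORCID iDs attached to the article's CrossRef record
    orcid_record_names -- dict mapping ORCID iD to the name on the ORCID record
                          (a stand-in for a real ORCID lookup)
    """
    problems = []
    name = author["name"]
    orcid = author.get("ORCID")

    if orcid is None:
        # No ORCID supplied: compare the name to the CrossRef record.
        if name not in article_names:
            problems.append("name not found in CrossRef record")
    else:
        # ORCID supplied: the author must be listed for the article ...
        if orcid not in article_orcids:
            problems.append("ORCID not listed for article")
        # ... and the name must match the one on the ORCID record.
        if name != orcid_record_names.get(orcid):
            problems.append("name does not match ORCID record")
    return problems


# Data taken from this thread.
article_names = ["Mariam J. Al Rashidi"]
article_orcids = {"0000-0001-7392-6777"}
orcid_record_names = {"0000-0001-7392-6777": "Mariam El Rachidi"}

paper_spelling = {"name": "Mariam J. Al Rashidi", "ORCID": "0000-0001-7392-6777"}
print(check_author(paper_spelling, article_names, article_orcids, orcid_record_names))
```

With the paper's spelling plus the ORCID, the sketch reports a name mismatch against the ORCID record, which is the conflict at the heart of this issue. It doesn't claim to reproduce PyKED's real results for every combination tried in this thread.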

When the names differ between the article and ORCID, I'm not really sure what to do. (Names being hard to check and not unique is exactly the problem ORCID is supposed to solve, but not everyone has one, and most historical articles don't have them associated.)

I did a quick check of the CrossRef record for the article, and this is what I get for the author in question:

{'ORCID': 'http://orcid.org/0000-0001-7392-6777',
  'authenticated-orcid': False,
  'given': 'Mariam J.',
  'family': 'Al Rashidi',
  'sequence': 'first',
  'affiliation': []}
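
To enter the name exactly as CrossRef has it, the `given` and `family` fields need to be joined, and the ORCID URL trimmed down to the bare identifier form that appears in the warning message. A small sketch using the record above; the helper names `full_name` and `bare_orcid` are hypothetical and not part of PyKED:

```python
# The CrossRef author entry quoted above, copied verbatim.
record = {
    'ORCID': 'http://orcid.org/0000-0001-7392-6777',
    'authenticated-orcid': False,
    'given': 'Mariam J.',
    'family': 'Al Rashidi',
    'sequence': 'first',
    'affiliation': [],
}


def full_name(entry):
    """Join the given and family names, skipping whichever is missing."""
    parts = (entry.get('given'), entry.get('family'))
    return ' '.join(part for part in parts if part)


def bare_orcid(entry):
    """Strip the URL prefix from the ORCID field; None if there is no ORCID."""
    url = entry.get('ORCID')
    if not url:
        return None
    return url.rstrip('/').rsplit('/', 1)[-1]


print(full_name(record))   # Mariam J. Al Rashidi
print(bare_orcid(record))  # 0000-0001-7392-6777
```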

@anthonymstohr can you confirm that when you have the author name entered as given there, with the ORCID, that you get an error?

@anthonymstohr
Author

Thank you @bryanwweber

@kyleniemeyer Using the author name as given there, with the ORCID ID, gives an error. With the ORCID ID, the user warning does not appear as it does when one does not include her ORCID ID.

@kyleniemeyer
Member

Ah, I see. So while the CrossRef record includes the ORCID, the name in that record does not match the name at ORCID. Very strange.

@anthonymstohr if you remove the ORCID, and enter the name as given just above from the CrossRef record, does that work without error? If not, we may need to reduce some of the ORCID checking, or somehow account for these weird cases.

@anthonymstohr
Author

@kyleniemeyer
That also gives an error, just without the note included in the error message above. Using her name that is associated with the ORCID (from their records) also gives an error, without the user warning.

Tried a number of combinations of including the ORCID and various names (dropping the initial, Al vs. El, Rashidi vs. Rachidi) but none seem to work.

@bryanwweber
Member

@anthonymstohr I would expect that if you use the name as written in the CrossRef API output (i.e., the output that @kyleniemeyer showed), and don't include the ORCID, then it should work, and should just give you a warning that this person has an ORCID and we recommend including it. Can you confirm that's the behavior you get?

This appears to be a transliteration problem, where a journal has spelled the name one way, but the person prefers to spell it another way on their ORCID record. I'm really not sure how we solve this problem. Maybe we just need to make all of this name checking raise warnings instead of exceptions? That way, at least something like this isn't a showstopper, and I expect this will actually be a fairly common problem...
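
One way the "warnings instead of exceptions" idea could look is sketched below. This is not PyKED code: the function `compare_names`, its `strict` flag, and the message text are assumptions made up for illustration, built only on the standard-library `warnings` module.

```python
import warnings


def compare_names(given_name, reference_name, strict=False):
    """Return True if the names match.

    On a mismatch, emit a UserWarning and return False, or raise
    ValueError instead when strict=True.
    """
    if given_name == reference_name:
        return True

    message = (
        f"Author name {given_name!r} does not match {reference_name!r}"
    )
    if strict:
        raise ValueError(message)
    warnings.warn(message, UserWarning)
    return False


# Non-strict mode: the mismatch is reported but loading could continue.
compare_names("Mariam J. Al Rashidi", "Mariam El Rachidi")
```

Keeping a `strict` switch would let a database maintainer still opt in to hard failures, while a mismatch like the one in this issue no longer blocks loading by default.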

@anthonymstohr
Author

Sorry for the delay; removing the ORCID and using the name shown above does indeed work. Thank you for your assistance.
