Should we include the title in the reference section? #55

bryanwweber · 2017-06-20T14:06:40Z

Title question raised by Mike Burke's group at Columbia.

bryanwweber · 2017-06-20T14:08:10Z

My thought is that we shouldn't, for two reasons:

1. It doesn't add anything that we don't already have from the DOI.
2. Validating that the title is correct is likely to be error-prone because of encoding issues, with many edge cases around conversion of non-ASCII characters in the response from the DOI server.

@kyleniemeyer any thoughts?

kyleniemeyer · 2017-06-20T14:59:21Z

Yes, I agree. I don't see much benefit to including it. Though, this might lead to the question of whether we need *anything* beyond DOI... Also, what if the reference doesn't have a DOI? In that case it might be advisable to add a title field, but none of the reference info will be checked anyway.

bryanwweber · 2017-06-20T16:24:46Z

> Though, this might lead to the question of whether we need anything beyond DOI...

I think we need a full set of reference information, like what would be published in a journal. Some (most?) journals don't include the title of the article in the references section. Also, including the authors field for the reference feels like giving credit where it's due.

If the reference doesn't have a DOI... like for a report or something, maybe we should have a URL field? If the data isn't publicly available somehow, I don't think we should include it in the database at all. In either case, it still feels like the title field is redundant.

kyleniemeyer · 2017-06-20T18:00:35Z

I agree that we should probably only accept files when the reference is publicly available somewhere—I don't want to exclude conference papers that don't get turned into journal papers, though.

Thinking about this is leading to a chicken-and-egg problem in my head: ideally we want people (including us, or you at least) to create ChemKED files when they put a paper together, and perhaps include that as supplementary material with the submission. In that case, what do they put in the reference block? Just authors and a note about being under review? Perhaps the file-version should be 1.0alpha or something?

bryanwweber · 2017-06-20T18:47:39Z

> I agree that we should probably only accept files when the reference is publicly available somewhere—I don't want to exclude conference papers that don't get turned into journal papers, though.

Does this include papers presented at, e.g., the US National Combustion Meetings, where the proceedings aren't published online? I'm inclined not to allow submissions of data from such meetings, because there's no way for someone who didn't attend to verify the data, and the data hasn't been peer-reviewed, which, for all its faults, is still the minimum standard of acceptability.

I'm working on some files now for a paper; I'm putting the journal, year, and authors. Once it's in press, I'll add the DOI and submit it to the database. I'm not sure if I'll put the files in the supplementary material... If I do, I'll leave out the DOI (because I won't know it, I don't think), and I'll bump the file-version to 1 when I add the DOI and submit to the database. Then I'll bump it to 2 when I get a volume/issue/page.

kyleniemeyer · 2017-06-20T18:59:46Z

I think that if it came from a conference paper, at minimum the conference paper would need to be available on (e.g.) Figshare or something. I agree that we should prefer peer-reviewed data, but I also don't want to 100% exclude something potentially useful that didn't get published for some reason... not sure.

> I'm working on some files now for a paper; I'm putting the journal, year, and authors. Once it's in press, I'll add the DOI and submit it to the database. I'm not sure if I'll put the files in the supplementary material... If I do, I'll leave out the DOI (because I won't know it, I don't think), and I'll bump the file-version to 1 when I add the DOI and submit to the database. Then I'll bump it to 2 when I get a volume/issue/page.

I definitely think we should encourage people to include the files as supplementary material, so that they are attached to the source paper. Not sure if you will have the DOI when it comes time to upload final materials for the paper, though.

bryanwweber · 2017-06-20T20:09:03Z

OK, perhaps the criterion is that it has to have a permanent identifier of some sort. But this discussion has gotten way off track (sorry, I got us off track 😃), and we should probably move the bits about the acceptability of data (or not) over to the ChemKED-database repo (and also write a wiki entry there on how to submit new data).

I think we agreed that title is not worth adding to the schema. If that's correct, feel free to close the issue (I just wanted to document the discussion for future reference).

kyleniemeyer · 2017-06-20T20:11:28Z

Yes, I agree we don't need to add it.

bryanwweber · 2017-06-22T19:14:22Z

From Mike Burke via email to Bryan:

> With regard to the title, the value I see for having a title is that I can recognize what dataset it is by simply looking at the title rather than having to look up the paper based on the DOI. Could it simply be an optional item to specify? In my view, if one already specifies file authors, journal, etc., there seems little reason why a title would not be included.

bryanwweber · 2017-06-22T19:37:01Z

That's a reasonable use case. My concern is that validating that the title is correct (by comparing with the value from a DOI lookup) is bound to have many edge cases - for instance, some journals use HTML in their titles in the DOI service, while others don't. Having to code for all of these cases seems like it will lead to many false warnings.

I'm insisting that we validate the title because we are trying, to the best of our ability, to check that everything specified in the data file is correct according to some external standard. For instance, we also check the ORCID values for authors, if provided, to ensure the spelling of their names is correct, and we check the volume, issue, year, journal, and authors from a DOI lookup.

I'll look into testing this, picking say 100 random DOIs and seeing how accurate a relatively simple comparison will be. Reopening so I don't forget to do this.
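As a rough illustration of what a "relatively simple comparison" could look like, here is a minimal sketch that normalizes both titles before comparing them. It is not the project's actual validation code: the function names `normalize_title` and `titles_match` are hypothetical, and the choice of normalization steps (stripping HTML-like tags, unescaping entities, Unicode NFKC normalization, whitespace collapsing, case folding) is an assumption about which edge cases matter.

```python
import html
import re
import unicodedata

# Matches HTML-like tags such as <i>, </sub>, <span class="x">.
# Requiring a letter right after "<" avoids eating text like "T < 1000 K".
_TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")


def normalize_title(title: str) -> str:
    """Reduce a title to a form suitable for a tolerant comparison."""
    text = _TAG_RE.sub("", title)                # drop markup, keep its contents
    text = html.unescape(text)                   # e.g. "&amp;" becomes "&"
    text = unicodedata.normalize("NFKC", text)   # e.g. subscript two becomes "2"
    text = " ".join(text.split())                # collapse runs of whitespace
    return text.casefold()                       # ignore capitalization


def titles_match(yaml_title: str, doi_title: str) -> bool:
    """Return True if the two titles agree after normalization."""
    return normalize_title(yaml_title) == normalize_title(doi_title)


if __name__ == "__main__":
    print(titles_match("CO₂ oxidation", "CO<sub>2</sub>  Oxidation"))
```

A comparison like this trades strictness for fewer false warnings; whether that trade is acceptable depends on how many of the sampled DOIs still disagree after normalization.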

bryanwweber · 2017-10-15T19:16:40Z

OK, as I suspected, there are a number of differences in title formatting and such. However, it's not that difficult to print out a useful diff between the returned title and the title from the YAML, so I think this is workable. We might need to wait until #78 is resolved so that the diff can be shown to the user in a useful way.
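One way such a diff could be produced is sketched below, using only Python's standard `difflib` module. This is an assumption-laden illustration rather than the code from the test branch: the function name `title_diff` and the labels passed as `fromfile` and `tofile` are hypothetical, and the diff is computed word by word so that a single changed word stands out.

```python
import difflib


def title_diff(yaml_title: str, doi_title: str) -> str:
    """Return a word-level unified diff, or an empty string if the titles agree."""
    if yaml_title == doi_title:
        return ""
    diff = difflib.unified_diff(
        yaml_title.split(),           # treat each word as one "line"
        doi_title.split(),
        fromfile="title in YAML file",
        tofile="title from DOI lookup",
        lineterm="",                  # words carry no trailing newline
        n=2,                          # two words of context around a change
    )
    return "\n".join(diff)


if __name__ == "__main__":
    print(title_diff("Autoignition of n-butanol", "Autoignition of n-Butanol"))
```

Words prefixed with `-` appear only in the YAML title and words prefixed with `+` appear only in the looked-up title, which may be enough for a user to decide whether a warning is a real mismatch or a formatting difference.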

bryanwweber added enhancement question labels Jun 20, 2017

kyleniemeyer closed this as completed Jun 20, 2017

bryanwweber mentioned this issue Jun 20, 2017

Add wiki page on how to submit new data and what data are acceptable pr-omethe-us/ChemKED-database#4

Closed

bryanwweber reopened this Jun 22, 2017

bryanwweber mentioned this issue Jul 25, 2017

How to handle references without DOIs? #69

Open

bryanwweber mentioned this issue Oct 15, 2017

Not for merging - Add titles test code #88

Closed
