Cache citation data in a file #164

nichtich · 2018-10-25T07:55:03Z

I thought about using citation-js and Wikidata for reference management from the command line. Given a list of Wikidata IDs, citation-js can look them up and convert them to CSL-JSON, e.g.:

echo Q163335 > citekeys
echo Q3290152 >> citekeys
citation-js -o references -i citekeys

When I add another key, I don't want citation-js to download known items again. This can be done with some command-line magic:

echo Q3020388 >> citekeys
{ cat citekeys; jq -r '.[].id' references.json; } | sort | uniq -u | citation-js >> references.json
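One caveat with the `uniq -u` approach: it prints every line that occurs exactly once in the combined list, so an id that is in references.json but has since been removed from citekeys would be fetched again, and a new key listed twice in citekeys would be skipped. A minimal sketch of a stricter set difference, assuming `jq`, `comm`, and `mktemp` are available; the function name `missing_keys` is made up for illustration and is not part of citation-js:

```shell
#!/bin/sh
# missing_keys KEYFILE CACHE
# Print the keys in KEYFILE that have no item with the same "id" in CACHE.
# CACHE may hold one or more JSON arrays of CSL-JSON items, or not exist yet.
missing_keys() {
  keyfile=$1
  cache=$2
  cached=$(mktemp) || return 1
  if [ -s "$cache" ]; then
    # One id per line, sorted and de-duplicated so comm can compare them
    jq -r '.[].id' "$cache" | sort -u > "$cached"
  fi
  # comm -23 keeps only the lines unique to the first (sorted) input
  sort -u "$keyfile" | comm -23 - "$cached"
  rm -f "$cached"
}
```

The output could then be piped to citation-js in place of the `sort | uniq -u` stage, e.g. `missing_keys citekeys references.json | citation-js >> references.json`.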

One more step is needed to combine the sequence of JSON arrays in references.json into a single array (or implement #163):

jq -s '[.[][]]' references.json > tmp; mv tmp references.json
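If the same key ever ends up in the file twice, the merge step could also drop duplicates. The following is only a sketch, assuming every item carries an `id` field and that keeping one arbitrary copy per id is acceptable; `merge_cache` is a hypothetical helper name, not a citation-js feature:

```shell
#!/bin/sh
# merge_cache CACHE
# Rewrite CACHE, which may contain several concatenated JSON arrays,
# as a single JSON array with at most one item per "id".
merge_cache() {
  cache=$1
  tmp=$(mktemp) || return 1
  # -s slurps all top-level arrays into one outer array,
  # add concatenates them (null for an empty file, hence "// []"),
  # unique_by(.id) keeps one item per id and sorts the result by id.
  jq -s 'add // [] | unique_by(.id)' "$cache" > "$tmp" && mv "$tmp" "$cache"
}
```

Note that `unique_by` reorders the items by id, so the original order of the cache is not preserved.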

Would it make sense to include this functionality in citation-js? For example:

citation --cache references.json < citekeys


larsgw · 2018-10-25T08:26:23Z

I think merging two parsed CSL-JSON files is fine (I'd propose using the original input, optionally saved in _graph, for diffing). Merging two inputs where one or both are unparsed, without parsing them first, is harder, at least in the general case. Implementing a special case for URLs or Wikidata IDs is possible, but it goes against my goal of keeping things modular. However, perhaps the CLI should be exempt from that modularization...

nichtich · 2018-10-25T13:08:00Z

I'm only interested in CSL-JSON because that's all needed to create citations and bibliographies. If settings have changed and the original input is needed, it is better to rebuild the full cache. Items should be identified by their citation key which is in the id field for items converted from Wikidata. I have not tried other importers, but heuristics for merging the same publication from different sources should be out of scope for citation-js. Getting the same record from Wikidata via QID and from crossref via DOI would be two records in the cache.

larsgw · 2018-10-25T13:42:32Z

> Getting the same record from Wikidata via QID and from crossref via DOI would be two records in the cache.

Agreed.

> Items should be identified by their citation key which is in the id field for items converted from Wikidata.

Thing is, not every item with the same id has to have the same origin, and not every item with the same origin has to have the same ID. Sure, the Wikidata ID is in the id field now, but that might change, or someone might have a BibTeX file, maybe even exported from Citation.js, with a Wikidata ID in the label field. Because the original input is already saved in _graph (or should be, see #165), that seems like a better way to distinguish. That's what I meant, anyway.

> I'm only interested in CSL-JSON because that's all needed to create citations and bibliographies.

But then the CLI magic would still be needed, because otherwise Citation.js would have to parse the entire citekeys file again, right? I'll work on #163 too, btw.

larsgw added the api (Requests for improvement to APIs) and enhancement (Feature requests) labels on Oct 25, 2018
