Cache citation data in a file #164

Open
nichtich opened this issue Oct 25, 2018 · 3 comments
Labels
api Requests for improvement to APIs enhancement Feature requests

Comments

@nichtich

nichtich commented Oct 25, 2018

I thought about using citation-js and Wikidata for reference management from the command line. Given a list of Wikidata ids, citation-js can look them up and convert them to CSL-JSON, e.g.:

echo Q163335 > citekeys
echo Q3290152 >> citekeys
citation-js -o references -i citekeys

When I add another key, I don't want citation-js to download known items again. This can be done with some command-line magic:

echo Q3020388 >> citekeys
{ cat citekeys; jq -r '.[].id' references.json; } | sort | uniq -u | citation-js >> references.json
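
One caveat with `sort | uniq -u` in the pipeline above: it prints every line that occurs exactly once in the combined stream. An id that is in references.json but no longer in citekeys would therefore also be passed to citation-js, and a new key listed twice in citekeys would be skipped. A minimal sketch of a one-directional difference using `comm`, assuming the cached ids have already been written one per line to a file (the function name `missing_keys` is made up for illustration):

```shell
# Print keys that are listed in the first file but absent from the second.
# Both files hold one key per line; order and duplicates do not matter.
missing_keys() {
  wanted_sorted=$(mktemp)
  cached_sorted=$(mktemp)
  LC_ALL=C sort -u "$1" > "$wanted_sorted"
  LC_ALL=C sort -u "$2" > "$cached_sorted"
  # comm -23 keeps only the lines unique to the first (sorted) input
  LC_ALL=C comm -23 "$wanted_sorted" "$cached_sorted"
  rm -f "$wanted_sorted" "$cached_sorted"
}
```

The cached ids could come from `jq -r '.[].id' references.json`, and the output of `missing_keys` could then be piped to citation-js in place of the `sort | uniq -u` stage.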

One more step is needed to combine the JSON arrays that are now concatenated in references.json into a single array (or implement #163):

jq -s '[.[][]]' references.json > tmp; mv tmp references.json

Would it make sense to include this functionality in citation-js?:

citation --cache references.json < citekeys
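
Until such an option exists, the behaviour could be approximated by a small wrapper. The following is only a sketch under stated assumptions: the function name `cite_cache` and the `FETCH` variable are made up for illustration, the cache is assumed to be a single JSON array of CSL-JSON items identified by `id`, and the converter is assumed to read keys on stdin and write a JSON array to stdout, as in the pipeline above.

```shell
# Usage: cite_cache CACHE_FILE < citekeys
# FETCH is the command that turns keys on stdin into a CSL-JSON array on stdout.
FETCH=${FETCH:-citation-js}

cite_cache() {
  cache=$1
  [ -s "$cache" ] || echo '[]' > "$cache"   # start from an empty array
  work=$(mktemp -d)
  LC_ALL=C sort -u > "$work/wanted"         # requested keys from stdin
  jq -r '.[].id' "$cache" | LC_ALL=C sort -u > "$work/cached"
  # keys that were requested but are not in the cache yet
  LC_ALL=C comm -23 "$work/wanted" "$work/cached" > "$work/missing"
  if [ -s "$work/missing" ]; then
    $FETCH < "$work/missing" > "$work/new.json" &&
    # concatenate both arrays; items end up sorted by id, and on a
    # duplicate id the cached item (listed first) is kept
    jq -s 'add | unique_by(.id)' "$cache" "$work/new.json" > "$work/merged.json" &&
    mv "$work/merged.json" "$cache"
  fi
  status=$?
  rm -rf "$work"
  return $status
}
```

With this, `cite_cache references.json < citekeys` would only contact the remote source for keys that are not cached, and the cache file is left untouched if fetching or merging fails.
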
@larsgw
Owner

larsgw commented Oct 25, 2018

Merging two parsed CSL-JSON files is fine (I'd propose using the original input, optionally saved in _graph, for diffing), I think. Merging two inputs where one or both are unparsed, without parsing, is less so, in a general case anyway. Implementing a special case for URLs or Wikidata IDs is possible, but goes against my ideas of trying to make things modular. However, perhaps the CLI should be exempt from that modularization...

@larsgw larsgw added api Requests for improvement to APIs enhancement Feature requests labels Oct 25, 2018
@nichtich
Author

I'm only interested in CSL-JSON because that's all needed to create citations and bibliographies. If settings have changed and the original input is needed, it is better to rebuild the full cache. Items should be identified by their citation key which is in the id field for items converted from Wikidata. I have not tried other importers, but heuristics to merge the same publication from different sources should be out of scope for citation-js. Getting the same record from Wikidata via QID and from crossref via DOI would be two records in the cache.

@larsgw
Owner

larsgw commented Oct 25, 2018

> Getting the same record from Wikidata via QID and from crossref via DOI would be two records in the cache.

Agreed.

> Items should be identified by their citation key which is in the id field for items converted from Wikidata.

Thing is, not every item with the same id has to have the same origin, and not every item with the same origin has to have the same ID. Sure, the Wikidata ID is in the id field now, but that might change, or someone might have a BibTeX file, maybe even exported from Citation.js, with a Wikidata ID in the label field. Because the original input is already saved in _graph (or should be, see #165), that seems like a better way to distinguish. That's what I meant, anyway.

> I'm only interested in CSL-JSON because that's all needed to create citations and bibliographies.

But then the CLI magic would still be needed, because otherwise Citation.js would have to parse the entire citekeys file again, right? I'll work on #163 too, btw.
