Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Citations #20

Open
jacobwindsor opened this issue Apr 26, 2017 · 17 comments
Open

Citations #20

jacobwindsor opened this issue Apr 26, 2017 · 17 comments

Comments

@jacobwindsor
Copy link
Owner

jacobwindsor commented Apr 26, 2017

Something that would be extremely useful for these interactive descriptions is Wikipedia style citations. They should look something like this [1] with a reference list at the bottom.

This should be fairly easily achievable with a Showdown extension in a similar vein to Kaavio Showdown and @larsgw's citation.js. P.S: @larsgw, I'm putting this issue here but ultimately it is an extension to WikiPathways.

The only issue with this method is that the descriptions do not belong to the GPML, they just reference some GPML via a WikiPathways ID. The GPML can have some references and it's quite likely that many references will be shared among the GPML and the description. @egonw any thoughts?

@egonw
Copy link
Collaborator

egonw commented Apr 27, 2017

Not entirely following you... where does the GPML come in?

@larsgw
Copy link

larsgw commented Apr 27, 2017

I looked into Showdown extensions, but unfortunately they don't support async yet (see showdownjs/showdown#322). Currently, Citation.js doesn't support async either (and loads resources synchronously), but I'm planning to change that (see larsgw/citation.js#29). I could write an extension with sync, or I could look into PR'ing Showdown, and I'll probably do both.

@jacobwindsor
Copy link
Owner Author

@larsgw why do you need to do it asynchronously?

@larsgw
Copy link

larsgw commented Apr 28, 2017

e.g. to fetch metadata from an article where only the wikidata ID is given in the markdown. I currently use sync-request, but synchronous requests are strongly discouraged (for good reasons), so I would like to make it work asynchronously.

@jacobwindsor
Copy link
Owner Author

@egonw I mean the the references that may occur in the Biopax element. I'd imagine there would be a lot of overlap between the references in the description and in there. It's not a big problem I suppose but it's a little confusing to have two bibliographies - one for the references in the GPML and one for the references in the description.

@jacobwindsor
Copy link
Owner Author

jacobwindsor commented Apr 28, 2017

@larsgw ah I see. I can imagine that's going to be a large PR in Showdown. You don't really need an async extension though - see this workaround in markdown-it.

With Kaavio Showdown, I see it more like this.

  1. Create a custom method using citation.js to get the required citation information from a string like ^[some citation]. This method will replace it with the required data for Showdown so you get something like cite[author, year, title]
  2. Add a custom extension to Showdown to replace the cite[author, year, title] with the Wikipedia style reference, adding everything to the bibliography at the end.

The custom method could be used before calling makeHTML() or makeHTML could just be overrided with the custom logic.

It's not as nice as having an async extension in Showdown itself, but hey it works :)

@larsgw
Copy link

larsgw commented Apr 28, 2017

I was already planning to make a small tool replacing annotated HTML elements with references, something like this (cjs is the DOM abbr I use for Citation.js):

<div class="cjs-replace" data-input="..." data-option-output-style="..." etc></div>
<script src="path/to/said/tool.js"></script>

That would come in handy when used in web pages: simply put in one HTML element per reference, excluding script element, and you're done.

Using that with a showdown plugin that replaces ^[some citation] and/or cite[author, year, title] would work, but I'll have to look into not making synchronous use too complex for what it is. Perhaps something like this:

  • input is in the forms ^[<DIRECT CITE INPUT>] or ^[<AUTHOR>, <YEAR>, <TITLE>, ...]; the latter would first be converted to a Cite object in the next step
  • the to-be-written markdown plugin would convert sync to direct output, and async to the HTML tags
  • if you use the HTML tool with sync output (i.e. already complete), nothing would happen; if you use it with async output (i.e. annotated HTML), it would replace it with complete output; if you have async output, but you don't use the tool, it would simply output a link to the identifier (<a href="https://wikidata/wiki/Q1" ...>Q1</a>)

As a note: the last bullet point would require the annotated HTML to already be that link, as there is no client side code to change that, but that shouldn't be a problem, and it would provide a (somewhat) nice thing to look at while waiting for https://wikidata.org/entity/Q1.json to load.

Providing output options would be difficult, but the only real option you would be missing out on is the citation style. Maybe I should just come up with a standard one, that everyone agrees with. Jokes aside, it is an issue. The only solutions that come to mind are document-wide declaration (un-markdown-y, and not self-contained ^[...] statements) or repeating the option everywhere (annoying as well), so I'll look into this later.

Anyway, that's more related to the general concept of the plugin, not how it would work asynchronously. The setup above still stands, and this way, it would degrade gracefully, and both tools would be able to be used independently.

@jacobwindsor
Copy link
Owner Author

Awesome!

I'm not sure I'm 100% following you but is this correct?

  1. User inputs markdown like ^[<DIRECT CITE INPUT>] or ^[<AUTHOR>, <YEAR>, <TITLE>, ...]
  2. This is then parsed to a cjs element
  3. If the user uses a direct cite input, CJS gets the relevent information async
  4. If the user uses author, year, title input then CJS just renders the correct citation

If this is the case then I see one issue with it. In most cases the dev (including me) would like to know when all of the citations are finished rendering. With this case, I assume the CJS elements are separate CJS instances. You can of course display something like [loading citation...] while the citation loads but that's not so nice.

I don't really understand what you mean by output options.

P.S I'd use a <cjs author="..." year="..." title="..." wikidata="..."></cjs> custom element. As a note to the loading you could then give it a required manager attribute which will register the element to the manager, taking control of loading and possibly other things. I'm not sure if that's good practise but it's the first thing I thought of

@larsgw
Copy link

larsgw commented Apr 29, 2017

It's more like this (sorry for the overkill):

cjs-tool-showdown

<async?> is about if asynchronous operations need to be done, not if it is available. In the diagram, unavailability of async is assumed.

A callback in the HTML tool would be possible (find all elements to be changed; make a boolean array with a false value for each element; when one is loaded, switch its boolean and check if all are now true). This can of course, be built into the tool itself.

With output options I mean the output options for Cite, i.e. language; 'real' or string representation of the following; HTML, plain text or JSON; last but not least, output style, which can be CSL-JSON, BibTeX, 'BibTeX JSON' (parsed BibTeX) or formatted citations with a provided citation style. Only the citation style and possibly the language are relevant in this case, I think.

I'd love to use custom elements, but it seems a lot of trouble with browser support, polyfills, possibly polyfills for the polyfills, licences for polyfills and everything related. I'll have a look at it.

@jacobwindsor
Copy link
Owner Author

jacobwindsor commented Apr 29, 2017

Not overkill at all, it's good to see a diagram sometimes.

I'm wondering why you're bothering with sync operations at all. It would simplify the API a lot if everything returned a promise/observable regardless of whether an HTTP fetch had to be called. If i'm barking down the wrong tree then please say.

I don't know exactly what your API is like but say you have a cite method like so:

cjs.cite(options: {wikidata?, doi?, author?, year?,...}): Promise<any> {
    let data;
    if (options.wikidata || options.doi) {
        // Do the fetch stuff here
        data = {<some object with Cite data>}
    else {
        // Just build the data from author, year, title etc.
        data = {<some object with Cite data>}
    }
    return new Promise((resolve,reject) => resolve(data));
});

It's much easier to work with when you know the response is only one type. The client doesn't have to care about whether data is being fetched via HTTP or not. It always returns a promise no matter what.

Custom elements are really nice, we actually use them in Pvjs. If you follow the guide from Google I linked to it's pretty straightforward. The available polyfills are pretty good and you don't need to worry about super old browsers IMO.

With custom elements, this is how I see it. The custom element uses the aforementioned cite method (or whichever method is in your API). Until the promise is resolved there will be some kind of loading logic. When the promise has resolved, just set the inner HTML of the element to the resolved data.

Including Markdown you get this flow (sorry but I'm too lazy to make a nice diagram):

^[direct cite] OR ^[author, year, title]
                      |
                      |
                      V
<cjs bib="<selector for bibliography>" doi="<if given>" wikidata="<if given>" title="<if given>" author="<if given>" year="<if given>"></cjs>
                      |
                      |
                      V
Custom element uses the cjs cite method to either grab the data from another web service or just build it up from the author, year, and title.
                       |
                       |
                       V
<cjs doi="<if given>" wikidata="<if given>" title="<if given>" author="<if given>" year="<if given>">
    <p>[<a href="#linktoref">1</a>]</p>
    // Also fill in the bibliography at the end. With the bibliography selector
</cjs>

Let me know your thoughts on this flow. I think it's more straightforward but you know how it works at the moment so I could easily have overlooked things.

I'm not sure what the HTML Tool refers to, is it the tool that create the <div class="cjs"...></div>?

@jacobwindsor
Copy link
Owner Author

You can see the very minimal amount of polyfills used in Pvjs here. The setup for the custom element is here.

I should note that it hasn't been cross browser tested yet.

@larsgw
Copy link

larsgw commented Apr 30, 2017

First version done. HTML handling isn't as nice as what it is going to be, but that's partly because there's no option to amend HTML in both citeproc-js and Citation.js, or at least not the way I'm looking for. Links don't work yet, and sorting and shown text isn't really finished either, but I'm working on it.

It should work simply by requireing the module. No values need to be assigned (which I don't like, but that's how the boilerplate worked; I'll have a look at it later). Example code (not es6, just quick code):

var Showdown = require('showdown')

require('citation-js-showdown') // registers extension

var converter = new Showdown.Converter({ extensions: ['citation.js'] })

var text = 'test ^[Q1,Q27795847] test'
var html = converter.makeHtml(text)

console.log(html)

Output is like this:

<p>test <sup>[1]</sup> <sup>[2]</sup> test</p>
<div class="csl-bib-body">
  <div data-csl-entry-id="1" class="csl-entry">universe. (n.d.).</div>
<br />
  <div data-csl-entry-id="2" class="csl-entry">Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., … Fiehn, O. SPLASH, a hashed identifier for mass spectra. <i>Nature Biotechnology</i>, <i>34</i>, 1099–1101. https://doi.org/10.1038/NBT.3689</div>
<p></div></p>

The weird HTML probably has to do with interferences with other rules, I'll have a look at that too later. As you can see, the IDs aren't shown in the references, but they do with a simple CSS rule:

.csl-entry:before {
  content: '[' attr(data-csl-entry-id) ']: '
}

Again, first version. I already had to change Wikidata parsing (larsgw/citation.js#35) to push the new version with the new exposed parsing methods which I needed for this code. Fixing all this will need changes to Citation.js that aren't even officially on the roadmap yet, so that may take a while, depending on how much other stuff I have to do.

@larsgw
Copy link

larsgw commented Apr 30, 2017

I forgot to mention: Citation.js doesn't actually support async yet, so I didn't have to go through all that trouble in this version. Custom elements would be a part of the HTML tool (the one that replaces the <cjs> with citations), so that's why that's not included yet.

@jacobwindsor
Copy link
Owner Author

jacobwindsor commented May 4, 2017

Brilliant work!

My only comment is to unit test it. Given that it's so easy to unit test (just provide an md and html file) it's well worth it. I found quite some unexpected bugs by introducing them.

Re: weird HTML - if it is to do with interference with other rules then I would just change the syntax. Something like cite[input] is fine too.

@jacobwindsor
Copy link
Owner Author

In terms of integrating this into MetabMaster - I would like to give it a try at least. I'm a bit cautious about using it because of the sync requests in Citation.js. I would like to creating a citation-dev branch and possibly serve that via staging.metabmaster.jcbwndsr.com .

If you have time than please feel free to create a PR. I'm currently writing my thesis so not sure I'll have time for a little while. I would like to include this feature in there though and it is definitely something that should be included in WikiPathways if and when that happens

@larsgw
Copy link

larsgw commented May 4, 2017

Re: Unit testing - definitely. Publishing as v1.0 was an accident. I'm aware of the problems (well, at least some of them).
Re: Re: weird HTML - I think it's the output <div>s, <br>s and \ns conflicting, I'll look at that too.
Re: sync - I'm working on the todo-list at Citation.js to try to get to async as fast as possible, but it'll take a while, especially now holiday is almost over.
Re: PR - I'll take a look at it

@larsgw
Copy link

larsgw commented May 5, 2017

What issues are you having? I'm working on tests and improvements, so it would be nice to know if I missed something. The part conflicting is the divs outputted by a third-party library (in string format), so it's hard to do something about that.

EDIT: The actual conflict was caused by showdown, and is now fixed. I raised an issue: showdownjs/showdown#387.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants