Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Citeproc RS support #2220

Merged
merged 11 commits into from Dec 15, 2021
Merged

Citeproc RS support #2220

merged 11 commits into from Dec 15, 2021

Conversation

adomasven
Copy link
Member

@adomasven adomasven commented Oct 11, 2021

Adds experimental citeproc-rs support. Disabled behind a hidden cite.useCiteprocRs pref. @cormacrelf only published canary builds on NPM, so this isn't ideal in terms of NPM dependencies. But he has recently said that he'd be publishing a stable release. Also seems like my NPM got updated and this added this whole mess of a v2 package-lock file thing, which seems to be a transitory stage to a new format. Idk if we want that or not.

Related #1845

Comment on lines +46 to +51
// where citeproc_rs_wasm_include.js attaches some classes to Zotero.CiteprocRs.
// This works fine until we call resetDB() in the test runner (or any other time we
// could call ZoteroContext.reinit(). A usual require() call returns exported symbols
// from some Gecko script cache. Unfortunately that means that the necessary citeproc
// classes do not get reattached to the Zotero object and the Citeproc Wasm driver breaks
// when trying to create its own objects (e.g. Zotero.CiteprocRs.WasmResult).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything I can do to help avoid this? It's been a while since I came up with that solution, maybe worth a revisit if this is what's come of it. What happens in a reinit/resetDB?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the real solution would have been to provide a citeproc-rs commonjs build that was a little bit more backwards compatible with older runtimes, such that the custom Zotero build with the hacky attaching wouldn't have to be necessary. When I was loading citeproc-rs initially the modifications required to the include script were rather minimal and I think would have avoided this situation too. On the other hand this works in the meantime and we'll be moving to a newer version of Firefox in the intermediate future which will hopefully allow us to use the standard wasm bindings builds, so it will solve itself naturally.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does running the test suite detect the issue reliably? If so I can try the old way with your old patches back in, and see if it goes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add await resetDB() before https://github.com/adomasven/zotero/blob/9bc9f22fbcf1f04d04df9c0aa3313bb0bfc0aafa/test/tests/citeprocRsBridgeTest.js#L54 and then run runtest.sh -d 5 citeprocRsBridge which will break the test if you replace the loadSubScript() line above with the commented out require('citeproc_rs_wasm_include');

chrome/content/zotero/xpcom/integration.js Show resolved Hide resolved
Comment on lines +791 to +792
// Don't try to parse author names. We parse them in itemToCSLJSON
citeproc.opt.development_extensions.parse_names = false;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually have a name parser as well, do you need an option to disable it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the person who added this change is gone from Zotero and there's no note for why citeproc name parsing was disabled, as there doesn't seem to be any obvious reason or conflicts that this would create. I also couldn't find anything on citeproc-js github when the option was added, nor in our issues. I don't think it's necessary now, unless perhaps @dstillman knows something here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't know. As long as the tests in 1cbd7f7 still pass, I guess we're good until someone tells us otherwise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess it's about CSL-JSON output and uses beyond our built-in citation processing, right? Does the CSL spec call for specific name-parsing behavior, or was that just something that citeproc-js implemented? I assume the goal here was just generating CSL-JSON for consumers that might not have complex name parsers built-in.

Does disabling the name parser actually make a difference if the CSL-JSON is already pre-parsed?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the CSL spec call for

No

Does disabling make a difference

There is a situation in which it would -- someone uses the "quoted name" method to avoid the name being parsed. The quotes are stripped, the name is slotted in the correct field, and then it would be parsed again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's not part of the CSL spec I'm not sure the processor should do it by default. Perhaps it should out of convention, but it should be clearly stated in the docs and yes, there should be an option to disable it.

scripts/babel-worker.js Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
@dstillman
Copy link
Member

Once we update this with the line-spacing fix, I think we can go ahead and merge this.

@dstillman
Copy link
Member

Should we merge this?

@adomasven
Copy link
Member Author

I will update citeproc-rs once more with the latest changes and then I don't see a reason not to.

@adomasven
Copy link
Member Author

Well, actually, we should wait for @cormacrelf to merge zotero/citeproc-rs#142 first

@cormacrelf
Copy link

Nearly there guys, just beating the GH actions into shape. I'm merging and then cutting 0.2.0 I think.

@cormacrelf
Copy link

Alright, 0.2.0 is out on NPM!

Release notes here with the 3 action items up top. https://github.com/zotero/citeproc-rs/releases/tag/wasm-v0.2.0.

@adomasven
Copy link
Member Author

Now ready to merge

@stakats
Copy link
Member

stakats commented Dec 15, 2021 via email

@dstillman dstillman merged commit 44b6cd0 into zotero:master Dec 15, 2021
@adomasven adomasven deleted the citeproc-rs branch October 18, 2022 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants