[pull] master from internetarchive:master #564

Open
wants to merge 5,815 commits into
base: master

Conversation

pull[bot]

@pull pull bot commented Apr 16, 2021

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Apr 16, 2021
@mekarpeles mekarpeles force-pushed the master branch 4 times, most recently from 5049550 to 0ce6f82 Compare October 6, 2023 18:16
@RayBB RayBB force-pushed the master branch 2 times, most recently from c0ed3b5 to e383c1d Compare October 6, 2023 19:33
jimchamp and others added 23 commits April 23, 2024 22:26
The Amazon Products API used by the affiliate server can return product
information for Amazon-specific ASINs that start with `B`. This commit
makes changes sufficient to allow `/isbn` to support "ISBNs" (i.e.
Amazon-specific ASINs) that start with `B`.

At a high level, validation has been modified throughout the pipeline
to allow `B` ASINs, from `/isbn` through to the validation for importing
items from Amazon.
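
As a rough illustration of the kind of validation change described above, the sketch below accepts either an ISBN 10-shaped string or a `B*` ASIN. The function names and regular expressions are hypothetical and are not taken from the Open Library code; in particular, the checks are shape-only and do not verify ISBN check digits.

```python
import re

# Assumption: a B* ASIN is ten characters, starting with "B", followed by
# nine uppercase alphanumerics. These patterns are illustrative only.
B_ASIN_RE = re.compile(r"^B[0-9A-Z]{9}$")
ISBN_10_RE = re.compile(r"^[0-9]{9}[0-9X]$")


def is_b_asin(value: str) -> bool:
    """Return True if the value has the shape of a B* ASIN."""
    return bool(B_ASIN_RE.match(value.strip().upper()))


def is_valid_identifier(value: str) -> bool:
    """Accept either an ISBN 10-shaped string or a B* ASIN (shape check only)."""
    cleaned = value.strip().upper()
    return bool(ISBN_10_RE.match(cleaned) or B_ASIN_RE.match(cleaned))
```

A real pipeline would presumably apply a check like this at each stage mentioned above, so an identifier accepted at `/isbn` is not rejected later at import time.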
Sometimes promise items with non-ISBN ASINs (e.g. ASINs that start with
`B`) don't have very complete metadata. This commit causes such
promise items to look to the affiliate server to supplement their data.

When promise items are processed by `scripts/promise_batch_imports.py`,
any promise items with such non-ISBN ASINs will make a request to the
affiliate server ("BookWorm") to `stage` the items for import.

Then, when the promise item eventually hits `load()`, it will check the
`import_item` table for a matching record. If a match is found, that
metadata is added to *empty* fields in the promise item--no promise item
metadata is overwritten.
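
The "fill only empty fields" rule described above can be sketched as follows. This is a minimal illustration, assuming both records are plain dictionaries; `supplement_empty_fields` is a hypothetical helper, not the function used in `load()`.

```python
def supplement_empty_fields(promise_item: dict, staged: dict) -> dict:
    """Return a copy of promise_item with empty fields filled from staged.

    A field counts as empty if it is missing or falsy (None, "", [], {}).
    Fields that already hold a value are never overwritten.
    """
    merged = dict(promise_item)
    for field, value in staged.items():
        if not merged.get(field) and value:
            merged[field] = value
    return merged
```

For example, a promise item with a title but an empty `authors` list keeps its title and picks up the authors from the staged record.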
When visiting `/api/books.json?bibkeys=bibkey&high_priority=true`, if
a bibkey is a non-ISBN ASIN (i.e. one starting with `B`), then the code
will check whether there is `staged` metadata for a matching `B*` bibkey,
and if so, it will trigger an import/reimport, which will either:
1. create a new edition, work, etc., based on BookWorm metadata; or
2. match the existing edition, etc., and supplement the metadata with
   BookWorm metadata for any empty fields in the original record.

See #9030.
The upshot of this commit is that it's now easier to import functions,
such as `do_import()` from `manage-imports.py`.

However, it will also break cron. At the very least the cron to "Add new
scans of yesterday to import queue" on `ol-home0` will need updating
with something like:
`30 4 * * * PYTHONPATH=/openlibrary $PYTHON /openlibrary/scripts..etc`
…tion

Add permissions for labeling issues
Allow `issue_comment_bot.py` to run without updating issues or publishing Slack digest
These classes are being moved prior to code modification so code
modifications are easier to see in the `diff`.

In the next commit, functions defined before `PrioritizedISBN` will be
modified to rely on `PrioritizedISBN`, and `PrioritizedISBN` itself will
be modified and renamed.
This commit adds a URL parameter to BookWorm's `/isbn` endpoint such
that adding `?stage_import=false` will stop the result from being staged
for import. This setting is likely only useful together with
`high_priority=true`: with staging disabled, the result, if any, is only
available from the endpoint's response, and the endpoint returns it only
when `high_priority=true`.

Note: the exception to the above is if the result is already cached.
Then the endpoint will return the result as well.
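
To make the query string concrete, here is a small sketch that builds such a request URL. The helper name and the base URL are placeholders; only the `/isbn/<identifier>` path and the `high_priority` and `stage_import` parameters come from the commit messages.

```python
from urllib.parse import urlencode


def build_isbn_url(
    base_url: str,
    identifier: str,
    high_priority: bool = False,
    stage_import: bool = True,
) -> str:
    """Build a BookWorm-style /isbn URL (hypothetical helper)."""
    params = {
        # Booleans are serialized as lowercase "true"/"false".
        "high_priority": str(high_priority).lower(),
        "stage_import": str(stage_import).lower(),
    }
    return f"{base_url.rstrip('/')}/isbn/{identifier}?{urlencode(params)}"
```

Calling `build_isbn_url("http://bookworm.example", "B06XYHVXVJ", high_priority=True, stage_import=False)` produces a URL ending in `/isbn/B06XYHVXVJ?high_priority=true&stage_import=false`.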
This takes advantage of
`/isbn/<asin>?high_priority=true&stage_import=false` on BookWorm to
return metadata without staging anything for import so that the new
metadata can supplement the import record associated with the `B*` ASIN
(e.g. BWB promise item).

The strategy this uses is quite slow, as it goes through the existing
`_get_amazon_metadata`, which is not `async`, and therefore each request
for `B*` metadata takes at least a second. This may need updating in
the future.

An alternative would be to stage the BookWorm records for import here
in `scripts/promise_batch_imports`, and then update them in `load()`.
This commit (hopefully) clears up some of the nomenclature around ASINs,
ISBN 10s, and ISBN 13s.

The solution is to populate whichever of `b_asin`, `isbn_10`, and
`isbn_13` are available, where `b_asin` is a `B*` ASIN from Amazon.

Then, a variable `key` is introduced, which is used for querying Amazon
via BookWorm, and it is either the value of `isbn_10`, or `b_asin` if
`isbn_10` doesn't exist.
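
The selection rule for `key` can be sketched as below. The `Identifiers` dataclass is a hypothetical container used only for illustration; the field names `b_asin`, `isbn_10`, and `isbn_13` follow the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identifiers:
    """Whichever identifiers are available for a record (illustrative)."""

    b_asin: Optional[str] = None
    isbn_10: Optional[str] = None
    isbn_13: Optional[str] = None

    @property
    def key(self) -> Optional[str]:
        """Value used to query Amazon via BookWorm.

        Prefers isbn_10 and falls back to b_asin; isbn_13 is not used as a key.
        """
        return self.isbn_10 or self.b_asin
```

Note that under this sketch a record with only an `isbn_13` has no `key`; how the real code handles that case is not stated in the commit message.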
This commit moves the metadata supplementing of `B*` ASIN records back
to `load()`.

This makes the code slightly cleaner, centralizes the import logic,
and avoids the need to either do slow, non-async HTTP GETs to BookWorm
in `promise_batch_imports` to supplement each record before staging, or
to substantially modify code to use async requests.
…er-look-up-non-isbn-10-asins

Augment non-ISBN ASIN BWB records with BookWorm data
Co-authored-by: Drini Cami <cdrini@gmail.com>
jimchamp and others added 30 commits May 31, 2024 15:41
…nt-changes

Hide user preference changes in /recentchanges
…modal

Added a QR icon in Share Modal that opens to QR code on click
fix typo on Tagalog, tgi should be tgl
* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.4 → v0.4.5](astral-sh/ruff-pre-commit@v0.4.4...v0.4.5)
- [github.com/codespell-project/codespell: v2.2.6 → v2.3.0](codespell-project/codespell@v2.2.6...v2.3.0)
* fix socioeconomic
* fix ruff type annotations
* ignore thirdparty

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: RayBB <RayBB@users.noreply.github.com>
Re-balance JS for smaller all.js (444 kb -> 254 kb!)
* Added Followers count to /stats
* Updates .pot
---------
Co-authored-by: anrawool <anrawool@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.5 → v0.4.7](astral-sh/ruff-pre-commit@v0.4.5...v0.4.7)
- [github.com/pre-commit/mirrors-eslint: v9.3.0 → v9.4.0](pre-commit/mirrors-eslint@v9.3.0...v9.4.0)
There have been a few changes to the shape of their data
…nfig

[pre-commit.ci] pre-commit autoupdate
Add access key for standard ebooks OPDS feed
Change new My Lists showcase style rules
Fetch `/collections` pages from DB, not cache