Add url_post template parameter for remote cdx api #587

Open: wants to merge 2 commits into base: main

kaij (Contributor) commented Oct 30, 2020

Description

This change adds a new parameter, url_post, to the template used to configure a remote index source. The url_post parameter contains the original URL with the POST data appended as a query parameter, if available. The POST data is taken from the __wb_post_data parameter of the final URL key.
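
As a rough sketch of the idea described above (not the code in this PR), the following shows one way url_post could be derived. The function name build_url_post and the exact urlkey layout (a SURT-style key with __wb_post_data in its query string) are assumptions made for illustration only.

```python
from urllib.parse import parse_qsl, quote

# Name of the query parameter that carries the POST data in the url key.
POST_PARAM = "__wb_post_data"


def build_url_post(url, urlkey):
    """Return `url` with the POST data found in `urlkey` appended.

    Hypothetical helper: assumes `urlkey` looks like
    'com,example)/api?__wb_post_data=...'. If the key carries no
    POST data, the original url is returned unchanged.
    """
    # Everything after the first '?' in the key is treated as its query.
    _, _, query = urlkey.partition("?")
    for name, value in parse_qsl(query, keep_blank_values=True):
        if name == POST_PARAM:
            # Append with '&' if the url already has a query, else '?'.
            sep = "&" if "?" in url else "?"
            return url + sep + POST_PARAM + "=" + quote(value, safe="")
    return url


print(build_url_post("http://example.com/api",
                     "com,example)/api?__wb_post_data=abc123"))
```

For GET captures, where the key has no __wb_post_data, this sketch returns the url as-is, so url_post could be used in place of url in a template.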

Motivation and Context

This resolves a problem with passing POST parameters to OutbackCDX and fixes the replay issue described in #585. OutbackCDX has since been extended to index POST parameters, as described in nla/outbackcdx#91. A sample config.yaml for OutbackCDX looks like this:

api_url: http://localhost:9596/nb-webarchive?closest={closest}&sort=closest&url={url_post}

I first tried using the existing alt_url. However, the alt_url is based on the original (unfiltered) url and it is not available for all calls. In order not to break existing solutions, I propose to use this new parameter.
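
To illustrate how a template like the api_url above might be expanded into a request URL, here is a minimal sketch. The helper expand_api_url is hypothetical, and percent-encoding the values before substitution is an assumption; the actual template handling in pywb may differ.

```python
from urllib.parse import quote

# The sample api_url from the config above.
API_URL = ("http://localhost:9596/nb-webarchive"
           "?closest={closest}&sort=closest&url={url_post}")


def expand_api_url(template, closest, url_post):
    """Fill the template placeholders with percent-encoded values.

    Hypothetical helper for illustration; encoding every reserved
    character keeps the embedded url from being split at '&' or '='.
    """
    return template.format(
        closest=quote(closest, safe=""),
        url_post=quote(url_post, safe=""),
    )


print(expand_api_url(API_URL, "20201030000000",
                     "http://example.com/api?__wb_post_data=abc123"))
```

Encoding the whole url_post value keeps its own query string, including __wb_post_data, inside the single url parameter sent to OutbackCDX.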

Types of changes

  • Replay fix (fixes a replay specific issue)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added or updated tests to cover my changes.
  • All new and existing tests passed.

Still missing: tests (work in progress), documentation updates. Feedback and discussion welcome!

url_post is the original url with the POST data extracted
from the urlkey and appended as query parameter. Use with
outbackcdx instead of url parameter.

ikreymer (Member) commented Dec 8, 2020

Thanks for adding this! Since OutbackCDX now supports __wb_post_data, I wonder if it would make sense to simply use that parameter to keep it simpler to maintain? For example, maybe JSON data could be prefixed with __wb_post_data=json:...?

And yes, it would be very helpful to have some tests. Let me know if you have any questions about where to add them!

codecov bot commented Dec 31, 2020

Codecov Report

Merging #587 (853eedc) into master (7b51101) will decrease coverage by 0.57%.
The diff coverage is 53.33%.

@@            Coverage Diff             @@
##           master     #587      +/-   ##
==========================================
- Coverage   87.69%   87.11%   -0.58%     
==========================================
  Files          64       64              
  Lines        8096     8113      +17     
  Branches     1445     1447       +2     
==========================================
- Hits         7100     7068      -32     
- Misses        640      675      +35     
- Partials      356      370      +14     
Impacted Files                          Coverage Δ
pywb/warcserver/index/indexsource.py    80.74% <53.33%> (-0.92%) ⬇️
pywb/utils/geventserver.py              68.42% <0.00%> (-7.90%) ⬇️
pywb/rewrite/rewriteinputreq.py         71.87% <0.00%> (-6.25%) ⬇️
pywb/apps/frontendapp.py                82.85% <0.00%> (-5.15%) ⬇️
pywb/apps/rewriterapp.py                83.22% <0.00%> (-4.68%) ⬇️
pywb/utils/loaders.py                   91.24% <0.00%> (-1.46%) ⬇️
pywb/recorder/multifilewarcwriter.py    76.83% <0.00%> (-1.13%) ⬇️
pywb/rewrite/url_rewriter.py            87.00% <0.00%> (-1.00%) ⬇️
pywb/warcserver/index/aggregator.py     91.47% <0.00%> (+1.55%) ⬆️
pywb/apps/static_handler.py             88.09% <0.00%> (+2.38%) ⬆️
... and 1 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7b51101...853eedc.
