TooManyRedirects with PyPI json API #214

Open
jayvdb opened this issue Jan 24, 2020 · 0 comments

I've been working with a cache of PyPI JSON records for a while, and have two resources which now cause TooManyRedirects because of PyPI package name normalisation.

One is https://pypi.org/project/django-coverage-plugin/

The JSON is at

https://pypi.org/pypi/django-coverage-plugin/json

However, I occasionally use the name django_coverage_plugin, which redirects to https://pypi.org/pypi/django-coverage-plugin/json , and for which I have a cache entry eb708b277cfec19dff1c796663031b09f5fc8ba511d43b56dad8fcc5 created today:

cc=4,��response��body��<html>
 <head>
  <title>301 Moved Permanently</title>
 </head>
 <body>
  <h1>301 Moved Permanently</h1>
  The resource has been moved to /pypi/django-coverage-plugin/json; you should be redirected automatically.


 </body>
</html>�headers��Connection�keep-alive�Content-Length�230�Access-Control-Allow-Headers�MContent-Type, If-Match, If-Modified-Since, If-None-Match, If-Unmodified-Since�Access-Control-Allow-Methods�GET�Access-Control-Allow-Origin�*�Access-Control-Expose-Headers�X-PyPI-Last-Serial�Access-Control-Max-Age�86400�Cache-Control�max-age=900, public�Content-Security-Policy�Wbase-uri 'self'; block-all-mixed-content; connect-src 'self' https://api.github.com/repos/ *.fastly-insights.com sentry.io https://api.pwnedpasswords.com https://2p66nmmycsj3.statuspage.io; default-src 'none'; font-src 'self' fonts.gstatic.com; form-action 'self'; frame-ancestors 'none'; frame-src 'none'; img-src 'self' https://warehouse-camo.cmh1.psfhosted.org/ www.google-analytics.com *.fastly-insights.com; script-src 'self' www.googletagmanager.com www.google-analytics.com *.fastly-insights.com https://cdn.ravenjs.com; style-src 'self' fonts.googleapis.com; worker-src *.fastly-insights.com�Content-Type�text/html; charset=UTF-8�Location�1https://pypi.org/pypi/django-coverage-plugin/json�Referrer-Policy�origin-when-cross-origin�Server�nginx/1.13.9�Accept-Ranges�bytes�Date�Fri, 17 Jan 2020 03:56:29 GMT�X-Served-By�%cache-iad2133-IAD, cache-sin18040-SIN�X-Cache�HIT, MISS�X-Cache-Hits�1, 0�X-Timer�S1579233390.755222,VS0,VE228�Vary�Accept-Encoding�Strict-Transport-Security�,max-age=31536000; includeSubDomains; preload�X-Frame-Options�deny�X-XSS-Protection�1; mode=block�X-Content-Type-Options�nosniff�!X-Permitted-Cross-Domain-Policies�none�status�-�version
�reason�Moved Permanently�strict�decode_content¤vary��Accept-Encoding�gzip, deflate

But I also have an older cache entry 17e4bde404ebd1e71cb2a45d038b6c02900991906af7a0956d110822

cc=4,��response��body��<html>
 <head>
  <title>301 Moved Permanently</title>
 </head>
 <body>
  <h1>301 Moved Permanently</h1>
  The resource has been moved to /pypi/django_coverage_plugin/json; you should be redirected automatically.


 </body>
</html>�headers��Connection�keep-alive�Content-Length�230�Access-Control-Allow-Headers�MContent-Type, If-Match, If-Modified-Since, If-None-Match, If-Unmodified-Since�Access-Control-Allow-Methods�GET�Access-Control-Allow-Origin�*�Access-Control-Expose-Headers�X-PyPI-Last-Serial�Access-Control-Max-Age�86400�Cache-Control�max-age=900, public�Content-Security-Policy�8base-uri 'self'; block-all-mixed-content; connect-src 'self' https://api.github.com/repos/ *.fastly-insights.com sentry.io https://2p66nmmycsj3.statuspage.io; default-src 'none'; font-src 'self' fonts.gstatic.com; form-action 'self'; frame-ancestors 'none'; frame-src 'none'; img-src 'self' https://warehouse-camo.cmh1.psfhosted.org/ www.google-analytics.com *.fastly-insights.com; script-src 'self' www.googletagmanager.com www.google-analytics.com *.fastly-insights.com https://cdn.ravenjs.com; style-src 'self' fonts.googleapis.com; worker-src *.fastly-insights.com�Content-Type�text/html; charset=UTF-8�Location�1https://pypi.org/pypi/django_coverage_plugin/json�Referrer-Policy�origin-when-cross-origin�Server�nginx/1.13.9�Accept-Ranges�bytes�Date�Sat, 11 Jan 2020 12:42:12 GMT�X-Served-By�%cache-iad2137-IAD, cache-sin18050-SIN�X-Cache�HIT, MISS�X-Cache-Hits�1, 0�X-Timer�S1578746532.880195,VS0,VE228�Vary�Accept-Encoding�Strict-Transport-Security�,max-age=31536000; includeSubDomains; preload�X-Frame-Options�deny�X-XSS-Protection�1; mode=block�X-Content-Type-Options�nosniff�!X-Permitted-Cross-Domain-Policies�none�status�-�version
                                                                                                                                                                 �reason�Moved Permanently�strict�decode_content¤vary��Accept-Encoding�gzip, deflate

The other package I encountered this with is jupyter-console: an old record for jupyter-console redirects to jupyter_console, and a new jupyter_console entry redirects back to jupyter-console.

Maybe the backend is changing whether it uses the normalised name or the literal name in the JSON path, or it depends on the version of the uploader, or maybe it was gamma rays. I haven't found any consistency yet.

https://pypi.org/pypi/setuptools_scm/json redirects _ to - but
https://pypi.org/pypi/backports.ssl_match_hostname/json does not.
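
For reference, the normalisation in question can be sketched with the rule from PEP 503 (runs of `-`, `_` and `.` collapse to a single `-`, then lowercase). The helper names `normalize` and `json_url` below are hypothetical, and whether PyPI always serves the JSON document under the normalised name without a further redirect is an assumption, not something the observations above confirm. A client that builds its URLs this way would at least request both spellings under a single cache key:

```python
import re


def normalize(name: str) -> str:
    """Normalise a project name following the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def json_url(name: str) -> str:
    """Build a JSON API URL from the normalised name (hypothetical helper).

    Assumption: the server answers on the normalised spelling. The issue
    above shows this is not obviously consistent, so treat it as a guess.
    """
    return f"https://pypi.org/pypi/{normalize(name)}/json"


# Both spellings from this report map to the same URL, so a cache keyed on
# that URL would hold one entry instead of two that point at each other.
print(json_url("django_coverage_plugin"))
print(json_url("django-coverage-plugin"))
```

This only sidesteps the loop on the client side; it does not explain why the two cached redirects disagree.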

This is likely to be a problem for pip if it uses the JSON API.

Anyway, as these cycles occur between cache entries which are not being refreshed, repeating the loop multiple times is a bit silly (the default requests redirect maximum is 30). It seems appropriate to detect the cycle early and possibly invalidate the cache entries so the server can resolve the problem, or at least re-affirm that the problem still exists. This could also be solved by invalidating any redirect cache entry that is older than any of the redirect cache entries encountered whilst handling the current request.
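
To make the two ideas concrete, here is a rough sketch over a plain dict standing in for the cache. It is not the library's actual cache format or API: `CachedRedirect`, `find_cycle` and `first_stale_entry` are hypothetical names, and the timestamps are just the whole-second parts of the X-Timer values from the two entries quoted above, used as illustrative write times.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CachedRedirect:
    """Hypothetical stand-in for a cached 301 response."""
    location: str     # value of the cached Location header
    stored_at: float  # when the entry was written (epoch seconds)


def find_cycle(cache: Dict[str, CachedRedirect], start: str) -> List[str]:
    """Return the URLs forming a redirect loop reachable from start, or []."""
    path: List[str] = []
    url = start
    while url in cache:
        if url in path:
            # Everything from the first visit of url onwards is the loop.
            return path[path.index(url):]
        path.append(url)
        url = cache[url].location
    return []


def first_stale_entry(cache: Dict[str, CachedRedirect], start: str) -> Optional[str]:
    """Return the first cached redirect older than one seen earlier in the chain."""
    newest = float("-inf")
    seen = set()
    url = start
    while url in cache:
        entry = cache[url]
        if entry.stored_at < newest:
            return url   # older than something already followed: refetch it
        if url in seen:
            return None  # loop with no strictly older entry; give up
        seen.add(url)
        newest = max(newest, entry.stored_at)
        url = entry.location
    return None


UNDERSCORE = "https://pypi.org/pypi/django_coverage_plugin/json"
HYPHEN = "https://pypi.org/pypi/django-coverage-plugin/json"

# The two entries from this report: the newer one redirects _ to -,
# the older one redirects - back to _.
cache = {
    UNDERSCORE: CachedRedirect(location=HYPHEN, stored_at=1579233390.0),
    HYPHEN: CachedRedirect(location=UNDERSCORE, stored_at=1578746532.0),
}

loop = find_cycle(cache, UNDERSCORE)
stale = first_stale_entry(cache, UNDERSCORE)
```

With the entries above, `find_cycle` reports both URLs after a single pass instead of 30 hops, and `first_stale_entry` picks the older hyphenated entry whichever spelling the request starts from, so only that one would need to be refetched.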
