Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async and sync compatible wrapper #1036

Draft
wants to merge 50 commits into
base: main
Choose a base branch
from

Conversation

vishes-shell
Copy link

@vishes-shell vishes-shell commented Mar 4, 2020

I've rebuild wrapper to support async and sync interface with support of httpx.

There is two gitlab clients: Gitlab and AsyncGitlab, initerfaces are the same, except most of the methods would return awaitable object in AsyncGitlab.

Related #1025

vishes-shell and others added 30 commits February 22, 2020 15:35
Make all functions async
Use httpx instead of httpx
Make decorators async to correctly wrap async methods
Change _http_auth on direct httpx.AsyncClient.auth property
Fix some errors that are met on the way to async
This possibly is temporary solution because httpx raises TypeError when
tries to init DataField with bool value
on_http_error should work with sync functions that return coroutines and
handle errors when coroutines will be awaited
Also provide async and sync interface to interact with GitlabList
Provide base of gitlab client along with implementations of sync and
async interface
Basic principle is that we won't use constructor to create object,
instead classmethod would be used that return either object or corotine
that returns object
@vishes-shell
Copy link
Author

vishes-shell commented Mar 5, 2020

Here is some sort of notable changes:

  1. response_content no longer support chunk_size since httpx does not support this. (Allow setting chunk_size for Response.iter_bytes() etc... encode/httpx#394)
  2. gitlab clients are moved to client.py
  3. 3 implementations of gitlab clients: gitlab.BaseGitlab abstract implementation that share common methods and store all docsrings, gitlab.Gitlab is sync implementation with httpx.Client, gitlab.AsyncGitlab is async implementation with httpx.AsyncClient.
  4. on_http_error make nested since we can get coroutine and need to catch same error in awaiting response
  5. RESTObject must be created (with create()), not initialized, since arguments could be coroutines, and need to be awaited, and __init__ does not support that interface
  6. Everything should be returned, since it can be couroutin
  7. every postprocess of response should be a function and be decorated with gitlab.utils.awaitable_postprocess, as it's done with _update_attrs and other edge cases
  8. unit tests are run by pytest and run against async and sync client implementations
  9. since unit tests are run with async gitlab, tests are required to be async, but fetching result is done through fixture that knows which client is used and return value if that's sync version or return await value if that's async version.
  10. GitlabList support two versions of iteration
  11. requests are completely removed

@nejch
Copy link
Member

nejch commented Mar 8, 2020

1. `response_content` no longer support `chunk_size` since `httpx` does not support
    this. ([encode/httpx#394](https://github.com/encode/httpx/issues/394))

I just noticed something: chunk_size is a configurable argument in several public methods (for blobs/snippet content, export download, etc) so it's possible people are using it and this would be a breaking change, right? Since you're replacing it with aiter_bytes and using chunks there, is there any way to preserve backward compatibility (I haven't checked the implementation)? Or wait for the upstream issue to resolve?

3. 3 implementations of gitlab clients: `gitlab.BaseGitlab` abstract implementation that share common methods and store all docsrings, `gitlab.Gitlab` is sync implementation with `httpx.Client`, `gitlab.AsyncGitlab` is async implementation with `httpx.AsyncClient`.
9. since unit tests are run with async gitlab, tests are required to be async, but fetching result is done through fixture that knows which client is used and `return value` if that's _sync_ version or `return await value` if that's `async` version.

That's really cool :)

@max-wittig
Copy link
Member

@vishes-shell Thanks for all the work that you're doing!

People could have been using chunk_size, by providing a different requests instance, right?
(Which is also not supported anymore then?)

But we have nothing about this officially in the docs, so I would say we can solve this by mentioning it in the release notes. We anyway need to bump the version to 3.X.

@nejch
Copy link
Member

nejch commented Mar 8, 2020

@max-wittig I tried earlier and at least in some cases it can even be just:

project = gl.projects.get(1)
export = project.exports.create({})
dl = export.download(chunk_size=512)

From what I see in the current solution by @vishes-shell this would still work and wouldn't break anything, the argument would just be ignored by response_content(). But if anyone relies on chunk_size for some specific reason, it might break their behavior. I'm not sure of all the use cases but it seems useful especially for project exports: https://stackoverflow.com/questions/46205586/why-to-use-iter-content-and-chunk-size-in-python-requests/46205745

requests>=2.22.0
respx>=0.12.1,<0.13
pytest
pytest-asyncio
Copy link

@graingert graingert Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use the anyio pytest plugin so you test on both trio and asyncio?

the latest httpx depends on anyio which bundles this plugin

@darkdragon-001
Copy link

darkdragon-001 commented Sep 16, 2021

What's the current state?

@darkdragon-001
Copy link

darkdragon-001 commented Sep 16, 2021

When using your library, I get the following warning:

httpx-0.18.2-py3.9.egg/httpx/_client.py:1978: UserWarning: Unclosed <httpx.AsyncClient object at 0x7ffabf61e280>. See https://www.python-httpx.org/async/#opening-and-closing-clients for details.

Apart from that, it works like a charm 😍

@darkdragon-001
Copy link

Btw: chunk_size is supported by httpx now.

@graingert
Copy link

When using your library, I get the following warning:

httpx-0.18.2-py3.9.egg/httpx/_client.py:1978: UserWarning: Unclosed <httpx.AsyncClient object at 0x7ffabf61e280>. See https://www.python-httpx.org/async/#opening-and-closing-clients for details.

Apart from that, it works like a charm 😍

You'll need to use the client with async with AsyncGitlab() as gitlab: ...

@darkdragon-001
Copy link

I am using it since a few months without any problems. chunk_size is implemented by now as well. Any other blockers? Or would it be accepted when updated to implement the latest features in the main branch?

@lmilbaum
Copy link
Contributor

As part of onboarding myself into this project, I stumbled into this PR. Could someone provide a use case when an async wrapper is being used?

@darkdragon-001
Copy link

@lmilbaum We are evaluating the jobs from the last day via script such that our QA team has a good overview about possible regressions or problems with our hardware text boxes (e.g. when the same code runs on one runner but not the other one). As network is slow, waiting for the answer of the previous request before submitting the next one takes quite a lot of time. If we can submit all requests asynchronously and just wait for all of them to finish in parallel, we have reduced the run time of our script by a multitude. Unfortunately, this MR is outdated and newer features are not available yet...

@nejch
Copy link
Member

nejch commented Oct 13, 2022

The wrapper here https://github.com/pan-net-security/aio-gitlab/ has a few points similar to what @darkdragon-001 described above. But true, this PR would need quite a bit of rework, or be used as starting point for a new one, since the library has changed a bit since then. I personally have not yet had enough need for this to work on it.

Another option would be to first switch from requests to httpx for the sync client (with a breaking change, because we're currently allowing people to bring their own requests sessions) and smooth out any kinks, then add async later, to make the transition less painful.

@lmilbaum
Copy link
Contributor

@darkdragon-001 and @nejch thanks for your feedback. Now, I can understand better the need for an async capability.
Would it make sense to move this discussion to Discussions while closing this PR?
The first discussion point might be the migration from requests to httpx as suggested by @nejch.

@nejch
Copy link
Member

nejch commented Oct 14, 2022

@darkdragon-001 and @nejch thanks for your feedback. Now, I can understand better the need for an async capability. Would it make sense to move this discussion to Discussions while closing this PR? The first discussion point might be the migration from requests to httpx as suggested by @nejch.

@lmilbaum thanks for looking into this! I'd keep this PR open even if as draft, just so it stays on our radar as there's a lot of useful discussion and work done here. And just so we remember to credit @vishes-shell as author whether they want to still work on this or not.

I'd maybe open a new issue to plan the switch to httpx as that would be for v4.0.0. We could then get rid of types-requests and requests-toolbelt I assume, as httpx is fully typed and does multipart encoding well IIRC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants