Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thoughts on async implementation #72

Open
rafalkrupinski opened this issue Jan 21, 2024 · 7 comments
Open

thoughts on async implementation #72

rafalkrupinski opened this issue Jan 21, 2024 · 7 comments
Labels
question Further information is requested

Comments

@rafalkrupinski
Copy link
Contributor

rafalkrupinski commented Jan 21, 2024

Thank you @Colin-b and all contributors for this awesome package!

Just wanted to share some thoughts on possibility of async OAuth2 flow. I don't know all the corner cases this package handles, so please let me know whenever I write something naive.

  1. OAuth2 flows use a global cache with a (threading) lock. It's a problem for async code (Support for async httpx clients #48 (comment)), but also doesn't seem necessary. The token and lock could be simply instance variables. The only problem that that I can think of, is when user creates multiple instances with the same auth server and key. Is there a valid case for such usage?

  2. Actually the cache holds two locks, one for accessing cache, the other for refreshing the lock. Again this seem unnecessary:

    • when token is valid, the first concurrent call can acquire lock, check expiry and release it, while others wait for a short time.
    • when token is expired, the first concurrent call can acquire the lock, check expiry and refresh the token, while others have to wait until it finishes
  3. The OAuth2 flows use locks within .auth_flow() method, against the Auth documentation.

    If the authentication scheme does I/O such as disk access or network calls, or uses
    synchronization primitives such as locks, you should override .sync_auth_flow()
    and/or .async_auth_flow() instead of .auth_flow() to provide specialized
    implementations that will be used by Client and AsyncClient respectively.

    This is addressed in Support for async httpx clients #48 patch.
    Having two locks makes sense when storage and refresh were split. In this case acquiring the refresh lock could be pulled to .a/sync_auth_flow().

  4. .auth_flow(), via .request_new_token() uses own instance of httpx.Client. Again, it doesn't seem necessary. Httpx already provides a/sync-portable protocol for making HTTP requests from Auth instances: response = yield request.

    If the auth server needs different transport options than the target server, mounts can be used. And if authentication is needed by the authentication server (meta-auth, client_auth param), the yielded requests need to be processed by meta-auth first.

  5. Since all OAuth2 implementations in this package follow the pattern:

    1. try getting cached token
    2. if expired, fallback to self.request_new_grant()
    3. set header

    steps 1+2 run with a lock, and 3 is very lightweight, the entire flow can run with a single lock.

When the above are addressed, supporting both sync and async should be as simple as implementing this class:

class LockingRefreshingAuth(Auth):
    def __init__(self):
        self._sync_lock = threading.Lock()
        self._async_lock = anyio.Lock()

    def sync_auth_flow(self, request: Request) -> Generator[Request, Response, None]:
        if self.requires_request_body:
            request.read()

        with self._sync_lock:
            # apply the loop over `self.auth_flow()`
            flow = self.auth_flow(request)
            ...

    async def async_auth_flow(self, request: Request) -> AsyncGenerator[Request, Response]:
        # like above, but async

Implementations would still yield their auth requests instead of directly using Client, and yield the user request before returning.


I've forked the repo to work on a POC. If it makes sense, and it's welcome, I'll make a PR. Meanwhile, comments are more than welcome.


edit 1: I've just noticed that auth token refresh requests also need to have authentication headers, and I guess that's why a client is used. But I still think it's unnecessary, and simply another Auth instance can be used

edit 2: It makes sense to have two locks if one lock protects a single Auth instance and the other the global cache.

rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 23, 2024
rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 23, 2024
rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 23, 2024
rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 23, 2024
rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 25, 2024
rafalkrupinski pushed a commit to python-lapidary/httpx_auth_async that referenced this issue Jan 26, 2024
@Colin-b
Copy link
Owner

Colin-b commented Jan 28, 2024

Hello @rafalkrupinski
I am currently looking at finally introducing async coverage in the test suite. It requires some refactoring of the test suite, I'll keep you posted, once it's done I will consider reviewing your suggestions and the PR that is opened for quite some time already on this.

Thanks a lot for your patience

@rafalkrupinski
Copy link
Contributor Author

rafalkrupinski commented Jan 30, 2024

I've just pushed a commit that removes the inner Client from Auth classes - https://github.com/python-lapidary/httpx_auth_async/tree/feature/async. If you like the direction I can separate the flows to sync and async versions.

edit:
The implementation itroduces meta-auth - using Auth object to authenticate auth flow requests. In other words, when Auth object needs to make authenticated request to the auth server, another auth object is used.

Because of that, and because auth flows need to read responses (which may be either sync or async), there isn't that much of implementation to share between a/sync versions.

It's possible that I know OAuth2 too little, and a simpler implementation is possible, e.g. if only HTTP Basic authentication is ever used, it could be simply done in the OAuth class, without using meta-auth.

Cheers

@Colin-b
Copy link
Owner

Colin-b commented Feb 4, 2024

The test suite is now ensuring non regression on async flows as well. Feel free to submit small scale focused PRs and I will gladly review.

The client is a parameter that is necessary because some users might have rules that requires specific proxy settings, certificates or whatever and this is not the goal of this package to reproduce every feature httpx offers to cover those cases, so we delegate to the user, the responsibility to provide a properly configured client in case they need to.

@rafalkrupinski
Copy link
Contributor Author

Is it something mounts can't handle?

@Colin-b
Copy link
Owner

Colin-b commented Feb 4, 2024

I am not sure to follow, if the client has custom rules to achieve auth and an additional set of rules to reach the authenticated endpoint, how would you do it without providing N parameters to every auth class that would need to be forwarded to the inner client of the auth class? Let's say your certificate differs and your proxy differs (mounts can cover proxy on client side, but not certificate). And this is only an example I can think of, I am sure we would have to cover a whole lot of use cases, corporate networks are a nightmare ;)

@rafalkrupinski
Copy link
Contributor Author

if the client has custom rules to achieve auth and an additional set of rules to reach the authenticated endpoint, how would you do it without providing N parameters to every auth

How large is N? Cookies, headers, query params, body... perhaps a single parameter request_args: dict?
Proxy and cert should be handled by mounts...

mounts can cover proxy on client side, but not certificate

This doesn't work?

httpx.AsyncClient(mounts={'https://example.com', httpx.AsyncHTTPTransport(verify=ssl.SSLContext(...) or cert=...)})

And this is only an example I can think of, I am sure we would have to cover a whole lot of use cases, corporate networks are a nightmare ;)

Would be a great help if I knew some corner cases.

@Colin-b
Copy link
Owner

Colin-b commented Feb 10, 2024

You are right, I guess it's possible to do everything via mounts, though you would admit it's not as straightforward to provide for a client.
Anyway I will gladly review your proposal on async. If you could come up with the smallest possible enhancements that would help the review and increase the chances for it to get merged.

Thanks again !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants