Support for async Python #1295
Comments
We're interested in using Oso at the company where I work, but we're having a hard time figuring out how to ensure using Oso won't negatively impact latency (compared to hand-written authorization code). Our tech stack relies heavily on asynchronous Python:
We also heavily use dataloaders (via the We'd like to use Oso to perform resource-based authorization, but are struggling to figure out how to efficiently perform database calls required to determine if an action on a resource is authorized for the current user. Ideally we could define all of our authorization policies and their data dependencies in our Our requests frequently involve many resources. For example, a user might want to list 100 resources in a single GraphQL request. We would like to be able to use async directives in our
The best workaround we've been able to come up with is prefetching authorization information in an asynchronous context before the Oso authorization query and passing in the relevant authorization information so it can be processed by Oso. Our preference though would be for Oso to be able to automatically decide whether or not this information even needs to be retrieved (for example, admin users don't need a database call). I'm very new to Oso, so it's very possible that I'm missing an easier solution to this problem, but in my mind support for async Python methods would be a great step in the right direction for us 😄
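A minimal sketch of the prefetch workaround described above, for illustration only. It does not use the Oso API: `is_allowed` is a hypothetical plain-Python stand-in for the synchronous policy query, and `AuthContext`, `fetch_allowed_resource_ids`, and the sample data are all assumptions. It also shows the drawback mentioned: the fetch happens before the policy runs, so it cannot be skipped for admins.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthContext:
    """Everything the synchronous check needs, fetched up front."""
    user_id: str
    is_admin: bool
    allowed_resource_ids: frozenset = frozenset()


# Hypothetical stand-in for an async database call.
async def fetch_allowed_resource_ids(user_id: str) -> frozenset:
    await asyncio.sleep(0)  # simulate I/O
    fake_table = {"alice": frozenset({"model-1", "model-2"})}
    return fake_table.get(user_id, frozenset())


async def build_auth_context(user_id: str, is_admin: bool) -> AuthContext:
    # Prefetch in the async context, before any policy evaluation.
    # This runs even for admins, because the policy has not been consulted yet.
    allowed = await fetch_allowed_resource_ids(user_id)
    return AuthContext(user_id, is_admin, allowed)


# Hypothetical stand-in for the synchronous policy query.
def is_allowed(ctx: AuthContext, action: str, resource_id: str) -> bool:
    if ctx.is_admin:
        return True
    return action == "read" and resource_id in ctx.allowed_resource_ids


async def main() -> list:
    ctx = await build_auth_context("alice", is_admin=False)
    return [is_allowed(ctx, "read", rid) for rid in ("model-1", "model-3")]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The synchronous check never blocks on I/O, at the cost of fetching data that some decisions will not need.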
@connorbrinton Thanks for the write-up! That's super useful context to understand. From what you're describing, I'd say the workaround you describe (pre-fetching relevant data and making it available during the policy execution) is probably your best bet. I'm a bit worried that even with async support, the API still wouldn't enable the dataloader-like pattern you're describing. That might end up being a separate feature entirely (one that I quite like the sound of!). I do have a couple questions which might help me recommend a path:
I think having database call batching be separate from Oso's authorization logic definitely makes sense 👍 Part of the magic of dataloaders is that as long as Oso supports calling async methods, end-users can use dataloaders without any changes to Oso. I'm not terribly well-versed in
The simple implementation of the resource permission loader would be:

class ResourcePermissionLoader:
    async def user_can_access_resource(self, user, action, resource) -> bool:
        # Make a database call to check the user's permissions to act on the given resource
        ...
        return decision

The DataLoader-based implementation would be:

from typing import Iterable, Tuple

class ResourcePermissionLoader(Dataloader):
    async def batch_load_fn(self, keys: Iterable[Tuple[User, Action, Resource]]) -> Iterable[bool]:
        # Make a single database call to check permissions of all users to act on the corresponding resources
        ...
        return decisions

    async def user_can_access_resource(self, user, action, resource) -> bool:
        # Await the loader so the caller receives a bool rather than a pending result
        return await self.get((user, action, resource))

Both approaches work exactly the same from the perspective of Oso, but the DataLoader-based approach batches together all queries made in a single asynchronous tick, deduplicates them and makes a single database call to service all of the requests, reducing latency.
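To make the batching and deduplication concrete, here is a self-contained sketch built only on `asyncio`. `TinyBatchLoader` is a hypothetical, simplified stand-in written for this example; it is not the API of any dataloader library, and `check_permissions` fakes the database call.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Hashable, List


class TinyBatchLoader:
    """Batches and deduplicates keys requested before the scheduled dispatch runs."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Awaitable[List[bool]]]):
        self._batch_fn = batch_fn
        self._pending: Dict[Hashable, asyncio.Future] = {}

    def load(self, key: Hashable) -> asyncio.Future:
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            if not self._pending:
                # First key of a new batch: schedule exactly one dispatch.
                loop.call_soon(lambda: loop.create_task(self._dispatch()))
            self._pending[key] = loop.create_future()
        # Duplicate keys share the same future.
        return self._pending[key]

    async def _dispatch(self) -> None:
        pending, self._pending = self._pending, {}
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
            if len(values) != len(keys):
                raise ValueError("batch_fn must return one value per key")
            for key, value in zip(keys, values):
                pending[key].set_result(value)
        except Exception as exc:
            for fut in pending.values():
                if not fut.done():
                    fut.set_exception(exc)


batch_calls: List[List[Hashable]] = []  # records each simulated database round trip


# Hypothetical permission check: one call answers every key in the batch.
async def check_permissions(keys: List[Hashable]) -> List[bool]:
    batch_calls.append(keys)
    await asyncio.sleep(0)  # simulate I/O
    return [action == "read" for (_user, action, _resource) in keys]


async def user_can_access_resource(loader, user, action, resource) -> bool:
    return await loader.load((user, action, resource))


async def main() -> List[bool]:
    loader = TinyBatchLoader(check_permissions)
    return await asyncio.gather(
        user_can_access_resource(loader, "alice", "read", "model-1"),
        user_can_access_resource(loader, "alice", "write", "model-1"),
        user_can_access_resource(loader, "alice", "read", "model-1"),  # duplicate key
    )


if __name__ == "__main__":
    print(asyncio.run(main()), len(batch_calls))
```

Three concurrent checks reach `check_permissions` as one call with two distinct keys. The callers only ever await `user_can_access_resource`, which is why async method support alone would be enough for this pattern.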
Yup! Each resource represents a text classification model, which we selectively allow our clients to access based on whether it's generally available or client-specific.
For our text classification models, we currently perform batch authorization decisions using a dataloader that we call manually whenever an authorization decision is needed. All authorization queries are batched together and the dataloader examines the following criteria for each query:
(1) and (2) are provided to our app through special headers (similar to a JWT), so we don't need to do any kind of special lookup to access that information. If (3) is necessary, the dataloader will issue a single database query retrieving information for every authorization query at once. It then performs some computation to determine (4). (1) and (2) are used for authorization decisions on multiple resource types, but (3) and (4) are only used for text classification model access decisions. (3) is the bit of information that we're interested in retrieving asynchronously so we can batch together similar requests. Long-term though, I think I would be interested in being able to retrieve (1) and (2) asynchronously as well. Currently we store all user roles in authorization headers injected by an API gateway, but I could see the size of those headers getting out of control eventually, necessitating some kind of out-of-band lookup for large or less-important roles or attributes.
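A rough sketch of the flow described above: decide what can be decided from header data, and issue one database query only for the queries that need it. The actual criteria (1) to (4) are not spelled out in this thread, so every field, the admin shortcut, and the ownership rule below are hypothetical placeholders.

```python
import asyncio
from typing import Dict, List, NamedTuple, Set


class AuthQuery(NamedTuple):
    # Hypothetical fields standing in for the unspecified criteria.
    is_admin: bool   # assumed to arrive via headers
    client_id: str   # assumed to arrive via headers
    model_id: str


db_queries: List[Set[str]] = []  # records each simulated database round trip


# Hypothetical async lookup: which client (if any) each model is restricted to.
async def fetch_model_owners(model_ids: Set[str]) -> Dict[str, str]:
    db_queries.append(set(model_ids))
    await asyncio.sleep(0)  # simulate I/O
    owners = {"m-private": "client-a"}  # models absent here are generally available
    return {m: owners[m] for m in model_ids if m in owners}


async def batch_authorize(queries: List[AuthQuery]) -> List[bool]:
    # Queries from admins are decided from header data alone.
    needs_lookup = {q.model_id for q in queries if not q.is_admin}
    # Touch the database at most once, and only if some query needs it.
    owners = await fetch_model_owners(needs_lookup) if needs_lookup else {}

    def decide(q: AuthQuery) -> bool:
        if q.is_admin:
            return True
        owner = owners.get(q.model_id)
        return owner is None or owner == q.client_id

    return [decide(q) for q in queries]


if __name__ == "__main__":
    sample = [
        AuthQuery(True, "client-b", "m-private"),
        AuthQuery(False, "client-a", "m-private"),
        AuthQuery(False, "client-b", "m-private"),
        AuthQuery(False, "client-b", "m-public"),
    ]
    print(asyncio.run(batch_authorize(sample)))
```

A function shaped like `batch_authorize` could serve as the batch function of a dataloader, so the "skip the query when nobody needs it" logic lives in one place.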
@connorbrinton Sorry it took me a second to reply here -- that's super helpful context. If I'm understanding correctly, I think this type of thing could be accomplished using our data filtering feature. Instead of performing steps 1-4 in code inside of your dataloader, you could take the user, call
Then, in whichever resolver you're loading the 1...100 models, you can use a subquery to make sure that the model's ID is inside of the set returned from the data filtering query. In a way, this means you're "eagerly" performing authorization as the data is being loaded, instead of performing it later by filtering the loaded data. The specifics of this depend on how text classification models are related to organizations, and you might need some hacks to make this work in an async context, but the basic concept might still apply. What do you think? Curious about one thing:
In that scenario, where would you expect (1) and (2) to be loaded from? E.g. from the same database as (3)? Some other service over an HTTP request? I ask because we're looking into better "data loading" features for Oso policies, and this sounds super relevant to that development. Definitely jump into our Slack: https://join-slack.osohq.com/ -- we'd love to help out as you look into this more. More details on data filtering: https://docs.osohq.com/guides/data_filtering.html
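The "eager" authorization idea (restricting the load with a subquery of authorized IDs) can be sketched with the standard-library `sqlite3` module. This does not call Oso's data filtering API: `AUTHORIZED_IDS_SQL` is a hand-written, hypothetical stand-in for the query that data filtering would produce, and the `models` table and its columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE models (id INTEGER PRIMARY KEY, name TEXT, client_id TEXT);
    INSERT INTO models VALUES
        (1, 'sentiment', NULL),       -- generally available
        (2, 'intent-a', 'client-a'),  -- client-specific
        (3, 'intent-b', 'client-b');
    """
)

# Hypothetical stand-in for the authorized-IDs query.
AUTHORIZED_IDS_SQL = "SELECT id FROM models WHERE client_id IS NULL OR client_id = ?"


def load_authorized_models(conn, requested_ids, client_id):
    """Load the requested models, keeping only those inside the authorized set."""
    marks = ",".join("?" for _ in requested_ids)
    sql = (
        f"SELECT id, name FROM models WHERE id IN ({marks}) "
        f"AND id IN ({AUTHORIZED_IDS_SQL}) ORDER BY id"
    )
    # Parameters follow placeholder order: requested IDs first, then the client.
    return conn.execute(sql, [*requested_ids, client_id]).fetchall()


if __name__ == "__main__":
    print(load_authorized_models(conn, [1, 2, 3], "client-a"))
```

Unauthorized rows never leave the database, so there is nothing to filter after loading.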
Definitely interested in this! Oso is getting mature and we would love to integrate it in our FastAPI backend. Oso's lack of async Python support is the only thing holding us back. Any update on this?
@gj, any updates yet?
Hi folks, sorry for the silence. No update yet on this issue specifically. We're currently figuring out how to open source some of the work we've done on Oso Cloud. More updates to come on that in #1703.
Hey @gj,
Hey folks: we have deprecated this package so we won't be able to add this feature in the near-term. In the medium/long-term, however, we expect to have a suitable replacement, and would definitely be interested in supporting async Python there.
This is an external tracking issue to:
So please:
Thanks!
P.S.: For now, we do all our internal engineering issue tracking separately in Notion, so you won't necessarily see regular updates to the project status here even once we begin work.