Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic routes #35

Open
crazyscientist opened this issue Jan 10, 2022 · 5 comments
Open

Dynamic routes #35

crazyscientist opened this issue Jan 10, 2022 · 5 comments
Labels
bug Something isn't working

Comments

@crazyscientist
Copy link
Collaborator

Currently, the backend is creating the OpenAPI spec using dynamic routes, i.e. for each schema routes with appropriate query/response/... models are defined.

➕ This has the advantage that the routes for every schema are accurately documented and that clients can interact with the routes directly from the docs page.

➖ On the other hand, on a production environment with multiple worker processes when a schema gets created or updated, the related dynamic routes only get created/updated for one worker process.

It would be good to find a mitigation for this drawback.


If this is not possible or would require too much work, the question would become, whether it is acceptable ...

  • to define generic routes with generic models instead
  • to use additional validation against concrete models inside the view functions

This would imply that we will loose the detailed documentation for entity routes of specific schemas and the possibility to interactively work with the API from the spec document.

@der-gabe
Copy link
Member

On the other hand, on a production environment with multiple worker processes when a schema gets created or updated, the related dynamic routes only get created/updated for one worker process.

I don't understand this. What worker processes are these and what kinds of schemas do they create? Also, why would multiple processes create or modify the same schema(s)?

Are we talking about workers getting data from other sources like IBS and SCC to create schemas describing specific products in aimaas?

@crazyscientist
Copy link
Collaborator Author

In a nutshell, aimaas allows clients to define relational data structures on top of a relational data structure via the EAV model. "Schemas" are like tables with column definitions and "Entities" are like rows in a relational database.

Currently, API routes are registered for each schema and entity such that they are documented as exact as possible (i.e. models for describing/validating content of request and response) in the OpenAPI definition.

So, if a client submits an HTTP request to create or edit a schema, the OpenAPI definition needs to be updated, too. This is done on the webserver by adjusting the registered routes while processing the request.

And here problems might start: Production web servers (Apache, nginx, uvicorn, ...) typically run multiple processes/threads to serve the application. When one of these processes handles a request to create/update/delete a schema, it also triggers these changes to the route registrations. But, I think, this happens only in this process; other webserver processes would continue to serve the unchanged route information. This might cause the webserver to erroneously complain about missing/extraneous data or a newly created schema not being present.

@cherrywho
Copy link

I don't know the details of how webserver adjusts the registered routes, but dynamic routes does sound appealing, if the webserver workers can synchronized after each route registration.

If it is not applicable, then generic routes will be a good choice.

@crazyscientist crazyscientist added the bug Something isn't working label Oct 10, 2023
@crazyscientist
Copy link
Collaborator Author

Today I had to modify a POC for a data model in a staging instance. Changing schema definitions and adding/updating entities really becomes tedious as one needs to restart the server processes.

I have marked this issue as a bug, because a normal user cannot be expected to restart server processes.

@commanderprice
Copy link
Collaborator

@crazyscientist idk if it's still relevant, but I've got a couple of ideas that might work for you

1. Listening for schema changes, e.g. with Redis pub/sub

You could add some global listening task like

app = FastAPI()
r = redis.Redis(...)

def subscribe_to_changes():
    pubsub = r.pubsub()
    pubsub.subscribe('schema_changes')
    schemas = get_schemas()
    for message in pubsub.listen():
        if message['type'] == 'message' and message['data'] == b'update_routes':
            for schema in schemas:
                create_dynamic_router(schema, app)

thread = threading.Thread(target=subscribe_to_changes)
thread.start()

and when creating/updating schemas publish to 'schema_changes' channel. Downside of this approach is that now you have one more piece of infrastructure to maintain (even if it's not rocket science)

2. Keep last updated time somewhere in the database and check it from time to time

You already have a database, so you can store there a timestamp of last creation or update of schemas. Then in workers we can store last seen timestamp and run a task which would periodically check the timestamp and recreate schemas/endpoints if needed. Something like

def check_for_schema_updates():
    last_known = None
    while True:
        current = get_last_modified_time()
        if current and current != last_known:
            last_known = current
            schemas = get_schemas()
            for schema in schemas:
                create_dynamic_router(schema, app)
        time.sleep(10)  #or await asyncio.sleep(10) if doing async version

# it also can be a threading.Thread task
@app.on_event("startup")
async def startup_event():
    loop = asyncio.get_event_loop()
    # technically, this can be an async task that doesn't have to run in threadpool
    # but don't forget to store a reference to it somewhere to not have it
    # garbage collected (https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task)
    loop.run_in_executor(None, check_for_schema_updates)
    # also can use asyncio.to_thread

With this and previous approach you'll probably want to make sure that task is still running. For a task running in separate thread this could be something like this:

def check_for_schema_updates():
    ...

task_thread = threading.Thread(target=check_for_schema_updates)
task_thread.start()

@app.get("/check-thread")
def check_thread():
    if task_thread and task_thread.is_alive():
        return {"status": "Task is running"}
    return {"status": "Task is not running"}

Although, now you would have to come up with idea of how to restart worker(s) if task is no more active...

3. Rebuild docs on each access to docs endpoint

This was the first idea that came to my mind. I didn't like it much but at least it won't require any additional infrastructure or tasks management. So, the idea is the following - you can write a middleware which on each access to docs will reset OpenAPI schema with app.openapi_schema = None (just like it's currently done in create_dynamic_router()) or recreate it (app.openapi_schema = get_openapi(...), see usage example at definition of FastAPI.openapi()). First reason I don't like is that docs schema will be rebuilt each time it's accessed which adds some overhead. But maybe it's not too big and and it's not particularly important for docs endpoint.

However we still have problem with updating the actual endpoints/handlers - we can't recreate routers here because this way worker's endpoints will be stale until someone accesses docs and making the middleware work for all endpoints (instead of just /docs) doesn't sound like good solution either.

What can help is having generic route which you mentioned at the beginning of the issue. I'm not sure if are thinking about the same thing, but this could be something like @app.get('/entity/{schema_name}/...') where we resolve needed schema/entity type at runtime instead of pre-creating all handlers for each schema. The problem, though, is that now we can't rely on FastAPI's default way of creating OpenAPI schema, because it relies on app's handlers and we've just got rid of all dynamic ones. So, to make this work you'd also have to change the way of how the OpenAPI schema is generated ¯_(ツ)_/¯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants