Dynamic routes #35

crazyscientist · 2022-01-10T10:08:31Z

Currently, the backend is creating the OpenAPI spec using dynamic routes, i.e. for each schema routes with appropriate query/response/... models are defined.

➕ This has the advantage that the routes for every schema are accurately documented and that clients can interact with the routes directly from the docs page.

➖ On the other hand, on a production environment with multiple worker processes when a schema gets created or updated, the related dynamic routes only get created/updated for one worker process.

It would be good to find a mitigation for this drawback.

If this is not possible or would require too much work, the question would become, whether it is acceptable ...

to define generic routes with generic models instead
to use additional validation against concrete models inside the view functions

This would imply that we will loose the detailed documentation for entity routes of specific schemas and the possibility to interactively work with the API from the spec document.

der-gabe · 2022-01-13T09:43:07Z

On the other hand, on a production environment with multiple worker processes when a schema gets created or updated, the related dynamic routes only get created/updated for one worker process.

I don't understand this. What worker processes are these and what kinds of schemas do they create? Also, why would multiple processes create or modify the same schema(s)?

Are we talking about workers getting data from other sources like IBS and SCC to create schemas describing specific products in aimaas?

crazyscientist · 2022-01-13T10:17:26Z

In a nutshell, aimaas allows clients to define relational data structures on top of a relational data structure via the EAV model. "Schemas" are like tables with column definitions and "Entities" are like rows in a relational database.

Currently, API routes are registered for each schema and entity such that they are documented as exact as possible (i.e. models for describing/validating content of request and response) in the OpenAPI definition.

So, if a client submits an HTTP request to create or edit a schema, the OpenAPI definition needs to be updated, too. This is done on the webserver by adjusting the registered routes while processing the request.

And here problems might start: Production web servers (Apache, nginx, uvicorn, ...) typically run multiple processes/threads to serve the application. When one of these processes handles a request to create/update/delete a schema, it also triggers these changes to the route registrations. But, I think, this happens only in this process; other webserver processes would continue to serve the unchanged route information. This might cause the webserver to erroneously complain about missing/extraneous data or a newly created schema not being present.

cherrywho · 2022-01-20T09:50:17Z

I don't know the details of how webserver adjusts the registered routes, but dynamic routes does sound appealing, if the webserver workers can synchronized after each route registration.

If it is not applicable, then generic routes will be a good choice.

crazyscientist · 2023-10-10T08:39:05Z

Today I had to modify a POC for a data model in a staging instance. Changing schema definitions and adding/updating entities really becomes tedious as one needs to restart the server processes.

I have marked this issue as a bug, because a normal user cannot be expected to restart server processes.

commanderprice · 2024-11-10T16:27:24Z

@crazyscientist idk if it's still relevant, but I've got a couple of ideas that might work for you

1. Listening for schema changes, e.g. with Redis pub/sub

You could add some global listening task like

app = FastAPI()
r = redis.Redis(...)

def subscribe_to_changes():
    pubsub = r.pubsub()
    pubsub.subscribe('schema_changes')
    schemas = get_schemas()
    for message in pubsub.listen():
        if message['type'] == 'message' and message['data'] == b'update_routes':
            for schema in schemas:
                create_dynamic_router(schema, app)

thread = threading.Thread(target=subscribe_to_changes)
thread.start()

and when creating/updating schemas publish to 'schema_changes' channel. Downside of this approach is that now you have one more piece of infrastructure to maintain (even if it's not rocket science)

2. Keep last updated time somewhere in the database and check it from time to time

You already have a database, so you can store there a timestamp of last creation or update of schemas. Then in workers we can store last seen timestamp and run a task which would periodically check the timestamp and recreate schemas/endpoints if needed. Something like

def check_for_schema_updates():
    last_known = None
    while True:
        current = get_last_modified_time()
        if current and current != last_known:
            last_known = current
            schemas = get_schemas()
            for schema in schemas:
                create_dynamic_router(schema, app)
        time.sleep(10)  #or await asyncio.sleep(10) if doing async version

# it also can be a threading.Thread task
@app.on_event("startup")
async def startup_event():
    loop = asyncio.get_event_loop()
    # technically, this can be an async task that doesn't have to run in threadpool
    # but don't forget to store a reference to it somewhere to not have it
    # garbage collected (https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task)
    loop.run_in_executor(None, check_for_schema_updates)
    # also can use asyncio.to_thread

With this and previous approach you'll probably want to make sure that task is still running. For a task running in separate thread this could be something like this:

def check_for_schema_updates():
    ...

task_thread = threading.Thread(target=check_for_schema_updates)
task_thread.start()

@app.get("/check-thread")
def check_thread():
    if task_thread and task_thread.is_alive():
        return {"status": "Task is running"}
    return {"status": "Task is not running"}

Although, now you would have to come up with idea of how to restart worker(s) if task is no more active...

3. Rebuild docs on each access to docs endpoint

This was the first idea that came to my mind. I didn't like it much but at least it won't require any additional infrastructure or tasks management. So, the idea is the following - you can write a middleware which on each access to docs will reset OpenAPI schema with app.openapi_schema = None (just like it's currently done in create_dynamic_router()) or recreate it (app.openapi_schema = get_openapi(...), see usage example at definition of FastAPI.openapi()). First reason I don't like is that docs schema will be rebuilt each time it's accessed which adds some overhead. But maybe it's not too big and and it's not particularly important for docs endpoint.

However we still have problem with updating the actual endpoints/handlers - we can't recreate routers here because this way worker's endpoints will be stale until someone accesses docs and making the middleware work for all endpoints (instead of just /docs) doesn't sound like good solution either.

What can help is having generic route which you mentioned at the beginning of the issue. I'm not sure if are thinking about the same thing, but this could be something like @app.get('/entity/{schema_name}/...') where we resolve needed schema/entity type at runtime instead of pre-creating all handlers for each schema. The problem, though, is that now we can't rely on FastAPI's default way of creating OpenAPI schema, because it relies on app's handlers and we've just got rid of all dynamic ones. So, to make this work you'd also have to change the way of how the OpenAPI schema is generated ¯_(ツ)_/¯

crazyscientist added the bug Something isn't working label Oct 10, 2023

crazyscientist added this to the ALP improvements milestone Oct 16, 2023

This was referenced Oct 18, 2023

Support recovery of deleted entities (fixes #88) #156

Merged

Split up ambigous DELETE route #162

Open

crazyscientist mentioned this issue Nov 29, 2023

Allow filtering of entities by attributes of type "reference" #170

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamic routes #35

Dynamic routes #35

crazyscientist commented Jan 10, 2022

der-gabe commented Jan 13, 2022

crazyscientist commented Jan 13, 2022

cherrywho commented Jan 20, 2022

crazyscientist commented Oct 10, 2023

commanderprice commented Nov 10, 2024

Dynamic routes #35

Dynamic routes #35

Comments

crazyscientist commented Jan 10, 2022

der-gabe commented Jan 13, 2022

crazyscientist commented Jan 13, 2022

cherrywho commented Jan 20, 2022

crazyscientist commented Oct 10, 2023

commanderprice commented Nov 10, 2024

1. Listening for schema changes, e.g. with Redis pub/sub

2. Keep last updated time somewhere in the database and check it from time to time

3. Rebuild docs on each access to docs endpoint