-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic routes #35
Comments
I don't understand this. What worker processes are these and what kinds of schemas do they create? Also, why would multiple processes create or modify the same schema(s)? Are we talking about workers getting data from other sources like IBS and SCC to create schemas describing specific products in aimaas? |
In a nutshell, aimaas allows clients to define relational data structures on top of a relational data structure via the EAV model. "Schemas" are like tables with column definitions and "Entities" are like rows in a relational database. Currently, API routes are registered for each schema and entity such that they are documented as exact as possible (i.e. models for describing/validating content of request and response) in the OpenAPI definition. So, if a client submits an HTTP request to create or edit a schema, the OpenAPI definition needs to be updated, too. This is done on the webserver by adjusting the registered routes while processing the request. And here problems might start: Production web servers (Apache, nginx, uvicorn, ...) typically run multiple processes/threads to serve the application. When one of these processes handles a request to create/update/delete a schema, it also triggers these changes to the route registrations. But, I think, this happens only in this process; other webserver processes would continue to serve the unchanged route information. This might cause the webserver to erroneously complain about missing/extraneous data or a newly created schema not being present. |
I don't know the details of how webserver adjusts the registered routes, but dynamic routes does sound appealing, if the webserver workers can synchronized after each route registration. If it is not applicable, then generic routes will be a good choice. |
Today I had to modify a POC for a data model in a staging instance. Changing schema definitions and adding/updating entities really becomes tedious as one needs to restart the server processes. I have marked this issue as a bug, because a normal user cannot be expected to restart server processes. |
@crazyscientist idk if it's still relevant, but I've got a couple of ideas that might work for you 1. Listening for schema changes, e.g. with Redis pub/subYou could add some global listening task like app = FastAPI()
r = redis.Redis(...)
def subscribe_to_changes():
pubsub = r.pubsub()
pubsub.subscribe('schema_changes')
schemas = get_schemas()
for message in pubsub.listen():
if message['type'] == 'message' and message['data'] == b'update_routes':
for schema in schemas:
create_dynamic_router(schema, app)
thread = threading.Thread(target=subscribe_to_changes)
thread.start() and when creating/updating schemas publish to 'schema_changes' channel. Downside of this approach is that now you have one more piece of infrastructure to maintain (even if it's not rocket science) 2. Keep last updated time somewhere in the database and check it from time to timeYou already have a database, so you can store there a timestamp of last creation or update of schemas. Then in workers we can store last seen timestamp and run a task which would periodically check the timestamp and recreate schemas/endpoints if needed. Something like def check_for_schema_updates():
last_known = None
while True:
current = get_last_modified_time()
if current and current != last_known:
last_known = current
schemas = get_schemas()
for schema in schemas:
create_dynamic_router(schema, app)
time.sleep(10) #or await asyncio.sleep(10) if doing async version
# it also can be a threading.Thread task
@app.on_event("startup")
async def startup_event():
loop = asyncio.get_event_loop()
# technically, this can be an async task that doesn't have to run in threadpool
# but don't forget to store a reference to it somewhere to not have it
# garbage collected (https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task)
loop.run_in_executor(None, check_for_schema_updates)
# also can use asyncio.to_thread With this and previous approach you'll probably want to make sure that task is still running. For a task running in separate thread this could be something like this: def check_for_schema_updates():
...
task_thread = threading.Thread(target=check_for_schema_updates)
task_thread.start()
@app.get("/check-thread")
def check_thread():
if task_thread and task_thread.is_alive():
return {"status": "Task is running"}
return {"status": "Task is not running"} Although, now you would have to come up with idea of how to restart worker(s) if task is no more active... 3. Rebuild docs on each access to docs endpointThis was the first idea that came to my mind. I didn't like it much but at least it won't require any additional infrastructure or tasks management. So, the idea is the following - you can write a middleware which on each access to docs will reset OpenAPI schema with However we still have problem with updating the actual endpoints/handlers - we can't recreate routers here because this way worker's endpoints will be stale until someone accesses docs and making the middleware work for all endpoints (instead of just /docs) doesn't sound like good solution either. What can help is having generic route which you mentioned at the beginning of the issue. I'm not sure if are thinking about the same thing, but this could be something like |
Currently, the backend is creating the OpenAPI spec using dynamic routes, i.e. for each schema routes with appropriate query/response/... models are defined.
➕ This has the advantage that the routes for every schema are accurately documented and that clients can interact with the routes directly from the docs page.
➖ On the other hand, on a production environment with multiple worker processes when a schema gets created or updated, the related dynamic routes only get created/updated for one worker process.
It would be good to find a mitigation for this drawback.
If this is not possible or would require too much work, the question would become, whether it is acceptable ...
This would imply that we will loose the detailed documentation for entity routes of specific schemas and the possibility to interactively work with the API from the spec document.
The text was updated successfully, but these errors were encountered: