Feat/cluster mode #181
base: main
hhzguo
commented
May 19, 2025
- Update the README.doc
- Made changes to cluster_client.py after Dongmao's API refactoring
Fix badges since previous URLs are deprecated
1. Add TCP blocking APIs 2. Simplify RDMA APIs 3. rewrite all unittest and benchmark
Pull Request Overview
This PR introduces cluster mode support by refactoring the server to register with Consul, exposing new management endpoints, updating the configuration class, providing a cluster client example, and documenting cluster mode setup.
- Extend server with health and service/config endpoints and CLI flags for cluster mode
- Add `host` to `ServerConfig` and export new cluster manager classes
- Provide an example client and update README with cluster mode instructions
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| infinistore/server.py | Add health/config routes, cluster-mode CLI flags, and Consul registration logic |
| infinistore/lib.py | Include host attribute in ServerConfig |
| infinistore/example/cluster_client.py | New example demonstrating cluster-based RDMA calls |
| infinistore/__init__.py | Export ConsulClusterMgr and NoClusterMgr |
| README.md | Document how to run the server in cluster mode |
Comments suppressed due to low confidence (1)
infinistore/server.py:268
- The variable `loop` is not defined in this scope. Use `asyncio.get_event_loop()` or capture the running event loop to schedule the refresh task.
loop.create_task(cluster_mgr.refresh_task())
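A minimal sketch of the fix this comment points at, assuming the call site is already inside a coroutine: `asyncio.get_running_loop()` returns the active loop, so no separate `loop` variable has to be in scope. `DemoClusterMgr`, its `refresh_task()` body, and `serve()` are hypothetical stand-ins for illustration, not the PR's actual `cluster_mgr` or server code.

```python
import asyncio


class DemoClusterMgr:
    """Hypothetical stand-in for the PR's cluster manager."""

    def __init__(self) -> None:
        self.refresh_count = 0

    async def refresh_task(self, interval: float = 0.01) -> None:
        # Refresh cluster state periodically until the task is cancelled.
        while True:
            self.refresh_count += 1
            await asyncio.sleep(interval)


async def serve(mgr: DemoClusterMgr, run_for: float = 0.05) -> int:
    # Inside a coroutine the running loop can be looked up directly,
    # so an undefined module-level `loop` is never needed.
    loop = asyncio.get_running_loop()
    task = loop.create_task(mgr.refresh_task())
    try:
        await asyncio.sleep(run_for)  # placeholder for the real server work
    finally:
        task.cancel()  # stop the background refresh on shutdown
        try:
            await task
        except asyncio.CancelledError:
            pass
    return mgr.refresh_count


if __name__ == "__main__":
    print(asyncio.run(serve(DemoClusterMgr())))
```

Keeping a reference to the task and cancelling it on shutdown avoids a pending-task warning when the loop closes.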
infinistore/server.py
Outdated
        "link_type": config.link_type,
        "dev_name": config.dev_name,
    }
    return JSONResponse(status_code=200, content=json.dumps(service_conf))
Copilot
AI
May 20, 2025
Passing a JSON string to JSONResponse.content will double-encode the payload. Use content=service_conf or return the dict directly so FastAPI handles serialization correctly.
Suggested change:
-     return JSONResponse(status_code=200, content=json.dumps(service_conf))
+     return service_conf
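The double-encoding effect can be reproduced with the standard library alone. The sketch below assumes the response class serializes its `content` with one `json.dumps` call; `render()` is a hypothetical stand-in for that step, and the dictionary values are illustrative, not taken from the PR.

```python
import json

# Illustrative values only; the real ones come from the server config.
service_conf = {"link_type": "Ethernet", "dev_name": "mlx5_0"}


def render(content) -> str:
    # Stand-in for the single serialization pass a JSON response performs.
    return json.dumps(content)


double = render(json.dumps(service_conf))  # content was already a JSON string
single = render(service_conf)              # content is the dict itself

# A client decoding the double-encoded body gets a str, not a dict,
# and has to decode a second time to recover the mapping.
print(type(json.loads(double)).__name__)
print(type(json.loads(single)).__name__)
```

With the dict passed through directly, one `json.loads` on the client side is enough.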
infinistore/server.py
Outdated
@app.get("/health")
async def health():
    Logger.info(f"Health check received at {config.host}:{config.manage_port}...")
    return "Healthy", 200
Copilot
AI
May 20, 2025
Returning a tuple ("Healthy", 200) is not the recommended pattern in FastAPI. Return a Response or JSONResponse with an explicit status_code to ensure the status is applied.
Suggested change:
-     return "Healthy", 200
+     return Response(content="Healthy", status_code=200)
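To see why the tuple form is risky, consider a framework that JSON-encodes whatever the handler returns, which appears to be what FastAPI does by default. In that case the `200` ends up inside the body instead of being used as the status code. The sketch below only models the encoding step with the standard library; `health_flask_style()` is a hypothetical handler, not the PR's code.

```python
import json


def health_flask_style():
    # Flask-style return value: (body, status). A framework that simply
    # JSON-encodes the return value treats this as a two-element array.
    return "Healthy", 200


body = json.dumps(health_flask_style())
print(body)  # ["Healthy", 200]
```

Returning an explicit response object with `status_code` set, as in the suggestion, removes the ambiguity.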
README.md
Outdated
* ```docker run -d \```
* ``` --name=consul \```
* ``` --network=host \```
* ``` -e CONSUL_BIND_INTERFACE=eth0 \```
* ``` -e CONSUL_CLIENT_ADDR=0.0.0.0 \```
* ``` -p 8500:8500 \```
* ``` hashicorp/consul```
Copilot
AI
May 20, 2025
The code fences are opened and closed on each line, breaking the block. Use a single fenced block (```shell) around all related commands for better readability.
Suggested change:
- * ```docker run -d \```
- * ``` --name=consul \```
- * ``` --network=host \```
- * ``` -e CONSUL_BIND_INTERFACE=eth0 \```
- * ``` -e CONSUL_CLIENT_ADDR=0.0.0.0 \```
- * ``` -p 8500:8500 \```
- * ``` hashicorp/consul```
+ ```shell
+ docker run -d \
+   --name=consul \
+   --network=host \
+   -e CONSUL_BIND_INTERFACE=eth0 \
+   -e CONSUL_CLIENT_ADDR=0.0.0.0 \
+   -p 8500:8500 \
+   hashicorp/consul
+ ```
before_sync = time.time()
print(f"sync elapse time is {time.time() - before_sync}")
Copilot
AI
May 20, 2025
[nitpick] This prints a 'sync' elapsed time without any synchronization step in between, which may confuse readers. Consider removing or clarifying this measurement.
before_sync = time.time()
print(f"sync elapse time is {time.time() - before_sync}")
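If the measurement is kept rather than removed, one way to make it meaningful is to wrap the actual blocking call in a timer. The sketch below is an assumption about what was intended, not the example client's code: `timed()` and `fake_sync()` are hypothetical names, and `fake_sync()` merely sleeps in place of the real synchronization step.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    # Record the wall-clock duration of the enclosed block under `label`.
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def fake_sync(delay: float = 0.02) -> None:
    """Hypothetical stand-in for the blocking sync call being measured."""
    time.sleep(delay)


timings = {}
with timed("sync", timings):
    fake_sync()  # the work being timed sits between start and stop

print(f"sync elapsed time is {timings['sync']:.4f}s")
```

`time.perf_counter()` is used here because it is monotonic and intended for measuring short intervals, unlike `time.time()`, which can jump if the system clock is adjusted.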