Commit 3ca3000

Support requests between APIs within the cluster (#1503)
1 parent c8da085 commit 3ca3000

2 files changed: +22 -1 lines changed

Diff for: docs/deployments/realtime-api/predictors.md

+17
@@ -532,3 +532,20 @@ def predict(self, payload):
             content=data, media_type="text/plain")
         return response
 ```
+
+## Chaining APIs
+
+It is possible to make requests from one API to another within a Cortex cluster. All running APIs are accessible from within the predictor at `http://api-<api_name>:8888/predict`, where `<api_name>` is the name of the API you are making a request to.
+
+For example, if there is an api named `text-generator` running in the cluster, you could make a request to it from a different API by using:
+
+```python
+import requests
+
+class PythonPredictor:
+    def predict(self, payload):
+        response = requests.post("http://api-text-generator:8888/predict", json={"text": "machine learning is"})
+        # ...
+```
+
+Note that the autoscaling configuration (i.e. `target_replica_concurrency`) for the API that is making the request should be modified with the understanding that requests will still be considered "in-flight" with the first API as the request is being fulfilled in the second API (during which it will also be considered "in-flight" with the second API). See more details in the [autoscaling docs](autoscaling.md).
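The chaining pattern documented in this hunk could be wrapped in a small helper. The following is a minimal sketch, not part of the commit: `api_url`, `call_api`, the `timeout` default, and the injectable `post` argument are all hypothetical names introduced for illustration. Only the URL scheme `http://api-<api_name>:8888/predict` comes from the added docs.

```python
from typing import Any, Callable, Optional

CORTEX_API_PORT = 8888  # port quoted in the docs text added by this commit


def api_url(api_name: str) -> str:
    """Build the in-cluster prediction URL for a named API."""
    return f"http://api-{api_name}:{CORTEX_API_PORT}/predict"


def call_api(
    api_name: str,
    payload: dict,
    timeout: float = 10.0,  # hypothetical default, not taken from Cortex
    post: Optional[Callable[..., Any]] = None,
) -> Any:
    """POST a JSON payload to another API in the cluster and return the response.

    `post` is injectable so the helper can be exercised without a cluster;
    when omitted it falls back to requests.post.
    """
    if post is None:
        import requests  # imported lazily so the module loads without requests

        post = requests.post
    response = post(api_url(api_name), json=payload, timeout=timeout)
    response.raise_for_status()  # surface downstream failures to the caller
    return response
```

Inside a predictor this might be used as `call_api("text-generator", {"text": "machine learning is"})`. Because the calling API's request stays in flight while it waits on the second API, an explicit timeout puts an upper bound on that wait.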

Diff for: pkg/workloads/cortex/serve/serve.py

+5 -1
@@ -21,6 +21,7 @@
 from concurrent.futures import ThreadPoolExecutor
 import threading
 import math
+import uuid
 import asyncio
 from typing import Any
 
@@ -121,7 +122,10 @@ async def register_request(request: Request, call_next):
     try:
         if is_prediction_request(request):
             if local_cache["provider"] != "local":
-                request_id = request.headers["x-request-id"]
+                if "x-request-id" in request.headers:
+                    request_id = request.headers["x-request-id"]
+                else:
+                    request_id = uuid.uuid1()
                 file_id = f"/mnt/requests/{request_id}"
                 open(file_id, "a").close()
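The fallback introduced in this hunk can be read in isolation as the sketch below. It is an illustration rather than the code in `serve.py`: `resolve_request_id` and `request_file_path` are hypothetical helper names, and a plain mapping stands in for the request's headers object. The commit itself does not state why the header may be absent; presumably requests sent directly between APIs do not pass through whatever normally sets `x-request-id`.

```python
import uuid
from typing import Mapping


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Return the caller-supplied x-request-id, or a generated one if absent."""
    if "x-request-id" in headers:
        return headers["x-request-id"]
    # Same fallback as the patch: a time-based UUID, converted to str here
    # so that both branches return the same type.
    return str(uuid.uuid1())


def request_file_path(headers: Mapping[str, str], root: str = "/mnt/requests") -> str:
    """Build the marker-file path that the middleware creates for a request."""
    return f"{root}/{resolve_request_id(headers)}"
```

The patch stores the `uuid.UUID` object directly and relies on the f-string to format it, so the resulting path is the same either way. Note that a plain `dict` matches header names case-sensitively, so this sketch expects the lowercase key.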
