How to chain RemoteRunnable clients to local llm server (hosted using langserve)? #28647

jianlins opened this issue Dec 10, 2024 · 1 comment
jianlins commented Dec 10, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

On the server side, I used HuggingFacePipeline to load a local model:

from fastapi import FastAPI
# from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface.llms import HuggingFacePipeline
from langchain_huggingface import ChatHuggingFace
from langserve import add_routes
import torch
import os
cache_dir = "./transforms_cache"
os.environ['TRANSFORMERS_CACHE'] = cache_dir
os.environ['HF_HOME']=cache_dir

transformers.utils.move_cache(new_cache_dir=cache_dir)

app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="Spin up a simple api server using Langchain's Runnable interfaces",
)


model_name = "allenai/Llama-3.1-Tulu-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name,cache_dir=cache_dir)
tulu_model = AutoModelForCausalLM.from_pretrained(model_name,cache_dir=cache_dir,
                                                  torch_dtype=torch.float16, device_map="auto",)

hf_pipeline = pipeline("text-generation", model=tulu_model, tokenizer=tokenizer, max_new_tokens=6400)
hf = HuggingFacePipeline(pipeline=hf_pipeline)
chat = ChatHuggingFace(llm=hf)

# Add the LLM to the server
add_routes(app, chat, path="/llm")


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8099)

On the client side, I use RemoteRunnable to connect. A simple invoke with an input string succeeds, but the same runnable fails when used inside an LLMChain:

from langserve import RemoteRunnable
from langchain import LLMChain, PromptTemplate
from langchain.chains import SimpleSequentialChain

# Create a RemoteRunnable that points to your deployed model endpoint
remote_llm = RemoteRunnable(url="http://0.0.0.0:8099/llm")

# Define prompt templates for each step of your chain
capital_prompt = PromptTemplate.from_template("What is the capital city of {country}?")
population_prompt = PromptTemplate.from_template("What is the population of {city}?")

# Create two LLMChains:
# 1. The first chain takes a country and returns the capital city.
chain1 = LLMChain(llm=remote_llm, prompt=capital_prompt)

# 2. The second chain takes the city name (returned by chain1) and returns the population.
chain2 = LLMChain(llm=remote_llm, prompt=population_prompt)

# Combine them into a SimpleSequentialChain:
# SimpleSequentialChain by default passes the output of the first chain
# as the input to the second chain.
overall_chain = SimpleSequentialChain(chains=[chain1, chain2], verbose=True)

# Run the combined chain:
result = overall_chain.run("France")

Error Message and Stack Trace (if applicable)

site-packages/langserve/client.py:448, in RemoteRunnable.batch(self, inputs, config, return_exceptions, **kwargs)
    439 def batch(
    440     self,
    441     inputs: List[Input],
        (...)
    445     **kwargs: Any,
    446 ) -> List[Output]:
    447     if kwargs:
--> 448         raise NotImplementedError(f"kwargs not implemented yet. Got {kwargs}")
    449     return self._batch_with_config(
    450         self._batch, inputs, config, return_exceptions=return_exceptions
    451     )

NotImplementedError: kwargs not implemented yet. Got {'stop': None}

Description

I am trying to use langserve to start a server and RemoteRunnable clients to communicate with it. This is helpful for retrying multiple times without worrying about client failures, because restarting a client is much faster than reloading an LLM. A simple llm.invoke through the RemoteRunnable works, but I cannot use it with any Chain classes, e.g. LLMChain, SimpleSequentialChain, or SequentialChain.
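
If I read the legacy chain code correctly, the stray kwarg comes from the chain itself rather than from my code: LLMChain appears to bind stop onto the runnable and then call .batch(), and RemoteRunnable.batch rejects any kwargs. A minimal sketch of what I think is happening (assuming that internal behaviour; the prompt text is just an example):

from langserve import RemoteRunnable

remote_llm = RemoteRunnable(url="http://0.0.0.0:8099/llm")

# A plain invoke works.
remote_llm.invoke("What is the capital city of France?")

# Binding stop=None (which LLMChain seems to do internally) and then batching
# forwards stop as a kwarg to RemoteRunnable.batch, which rejects all kwargs.
remote_llm.bind(stop=None).batch(["What is the capital city of France?"])
# -> NotImplementedError: kwargs not implemented yet. Got {'stop': None}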

System Info

System Information

OS: Linux
OS Version: #1 SMP Thu Jun 6 09:41:19 UTC 2024
Python Version: 3.10.16 | packaged by conda-forge | (main, Dec 5 2024, 14:16:10) [GCC 13.3.0]

Package Information

langchain_core: 0.3.22
langchain: 0.3.10
langchain_community: 0.3.10
langsmith: 0.1.147
langchain_huggingface: 0.1.2
langchain_openai: 0.2.11
langchain_text_splitters: 0.3.2
langgraph_sdk: 0.1.43
langserve: 0.3.0

Other Dependencies

aiohttp: 3.11.10
async-timeout: 4.0.3
dataclasses-json: 0.6.7
fastapi: 0.115.6
httpx: 0.28.1
httpx-sse: 0.4.0
huggingface-hub: 0.26.5
jsonpatch: 1.33
langsmith-pyo3: Installed. No version info available.
numpy: 1.26.4
openai: 1.57.0
orjson: 3.10.12
packaging: 24.2
pydantic: 2.10.3
pydantic-settings: 2.6.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
sentence-transformers: 3.3.1
SQLAlchemy: 2.0.36
sse-starlette: 1.8.2
tenacity: 9.0.0
tiktoken: 0.8.0
tokenizers: 0.21.0
transformers: 4.47.0
typing-extensions: 4.12.2

dosubot bot added the Ɑ: core (Related to langchain-core) label on Dec 10, 2024
keenborder786 (Contributor) commented

@jianlins You are using the deprecated LLMChain interface; the recommended way is to use LCEL. You can achieve the same result as follows:

from langserve import RemoteRunnable
from langchain_core.prompts import PromptTemplate


# Create a RemoteRunnable that points to your deployed model endpoint
remote_llm = RemoteRunnable(url="http://0.0.0.0:8099/llm")

# Define prompt templates for each step of your chain
capital_prompt = PromptTemplate.from_template("What is the capital city of {country}?")
population_prompt = PromptTemplate.from_template("What is the population of {city}?")
overall_chain = capital_prompt | remote_llm

# Run the combined chain:
result = overall_chain.invoke("France")
