diff --git a/sources/platform/integrations/ai/langchain.md b/sources/platform/integrations/ai/langchain.md
index b5a99f3350..4d92af35ee 100644
--- a/sources/platform/integrations/ai/langchain.md
+++ b/sources/platform/integrations/ai/langchain.md
@@ -20,11 +20,11 @@ but if you prefer to use JavaScript, you can follow the same steps in the [JavaS
 Before we start with the integration, we need to install all dependencies:
 
-`pip install apify-client langchain langchain_community openai tiktoken`
+`pip install apify-client langchain langchain_community langchain_openai openai tiktoken`
 
 After successful installation of all dependencies, we can start writing code.
 
-First, import `os`, `VectorstoreIndexCreator`, `ApifyWrapper`, and `Document` into your source code:
+First, import all required packages:
 
 ```python
 import os
@@ -32,6 +32,8 @@ import os
 from langchain.indexes import VectorstoreIndexCreator
 from langchain_community.utilities import ApifyWrapper
 from langchain_core.document_loaders.base import Document
+from langchain_openai import OpenAI
+from langchain_openai.embeddings import OpenAIEmbeddings
 ```
 
 Find your [Apify API token](https://console.apify.com/account/integrations) and [OpenAI API key](https://platform.openai.com/account/api-keys) and initialize these into environment variable:
@@ -57,22 +59,26 @@ loader = apify.call_actor(
 )
 ```
 
-_NOTE: The Actor call function can take some time as it loads the data from LangChain documentation website._
+:::note Crawling may take some time
+
+The Actor call may take some time as it crawls the LangChain documentation website.
+
+:::
 
 Initialize the vector index from the crawled documents:
 
 ```python
-index = VectorstoreIndexCreator().from_loaders([loader])
+index = VectorstoreIndexCreator(embedding=OpenAIEmbeddings()).from_loaders([loader])
 ```
 
 And finally, query the vector index:
 
 ```python
 query = "What is LangChain?"
-result = index.query_with_sources(query)
+result = index.query_with_sources(query, llm=OpenAI())
 
-print(result["answer"])
-print(result["sources"])
+print("answer:", result["answer"])
+print("source:", result["sources"])
 ```
 
 If you want to test the whole example, you can simply create a new file, `langchain_integration.py`, and copy the whole code into it.
@@ -83,25 +89,27 @@ import os
 
 from langchain.indexes import VectorstoreIndexCreator
 from langchain_community.utilities import ApifyWrapper
 from langchain_core.document_loaders.base import Document
+from langchain_openai import OpenAI
+from langchain_openai.embeddings import OpenAIEmbeddings
 
 os.environ["OPENAI_API_KEY"] = "Your OpenAI API key"
 os.environ["APIFY_API_TOKEN"] = "Your Apify API token"
 
 apify = ApifyWrapper()
+print("Call website content crawler ...")
 loader = apify.call_actor(
     actor_id="apify/website-content-crawler",
     run_input={"startUrls": [{"url": "https://python.langchain.com/docs/get_started/introduction"}], "maxCrawlPages": 10, "crawlerType": "cheerio"},
-    dataset_mapping_function=lambda item: Document(
-        page_content=item["text"] or "", metadata={"source": item["url"]}
-    ),
+    dataset_mapping_function=lambda item: Document(page_content=item["text"] or "", metadata={"source": item["url"]}),
 )
 
-index = VectorstoreIndexCreator().from_loaders([loader])
+print("Compute embeddings...")
+index = VectorstoreIndexCreator(embedding=OpenAIEmbeddings()).from_loaders([loader])
 
 query = "What is LangChain?"
-result = index.query_with_sources(query)
+result = index.query_with_sources(query, llm=OpenAI())
 
-print(result["answer"])
-print(result["sources"])
+print("answer:", result["answer"])
+print("source:", result["sources"])
 ```
 
 To run it, you can use the following command: `python langchain_integration.py`
@@ -109,9 +117,9 @@ To run it, you can use the following command: `python langchain_integration.py`
 After running the code, you should see the following output:
 
 ```text
-LangChain is a framework for developing applications powered by language models. It provides standard, extendable interfaces, external integrations, and end-to-end implementations for off-the-shelf use. It also integrates with other LLMs, systems, and products to create a vibrant and thriving ecosystem.
+answer: LangChain is a framework for developing applications powered by language models. It provides standard, extendable interfaces, external integrations, and end-to-end implementations for off-the-shelf use. It also integrates with other LLMs, systems, and products to create a vibrant and thriving ecosystem.
 
-https://python.langchain.com
+source: https://python.langchain.com
 ```
 
 LangChain is a standard interface through which you can interact with a variety of large language models (LLMs). It provides modules you can use to build language model applications. It also provides chains and agents with memory capabilities.
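
The one behavioral hinge in the patch's `call_actor` snippet is the `dataset_mapping_function`: each dataset item produced by the Website Content Crawler (a dict with `text` and `url` fields) is turned into a LangChain `Document`. A minimal sketch of that mapping, with a plain `dict` standing in for `Document` so it runs without LangChain installed (the stand-in and the `map_item` name are illustrative only):

```python
# Sketch of the mapping done by dataset_mapping_function in the patch.
# A plain dict stands in for langchain_core's Document class; the real
# code constructs Document(page_content=..., metadata=...).
def map_item(item: dict) -> dict:
    return {
        "page_content": item["text"] or "",   # fall back to "" when text is None
        "metadata": {"source": item["url"]},  # keep the crawled URL as the source
    }

# Example item shaped like Website Content Crawler output
item = {"text": "LangChain is a framework.", "url": "https://python.langchain.com"}
doc = map_item(item)
print(doc["metadata"]["source"])  # → https://python.langchain.com
```

The `or ""` guard matters because the crawler can emit items whose `text` is `None`, and `Document` requires a string `page_content`; the `metadata["source"]` field is what `query_with_sources` later surfaces as `result["sources"]`.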