diff --git a/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.ipynb b/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.ipynb new file mode 100644 index 000000000000..5f0f34c143d5 --- /dev/null +++ b/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.ipynb @@ -0,0 +1,226 @@ +{ + "cells": [ + { + "cell_type": "raw", + "metadata": { + "vscode": { + "languageId": "raw" + } + }, + "source": [ + "---\n", + "sidebar_label: CSV\n", + "sidebar_class_name: node-only\n", + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# CSVLoader\n", + "\n", + "```{=mdx}\n", + "\n", + ":::tip Compatibility\n", + "\n", + "Only available on Node.js.\n", + "\n", + ":::\n", + "\n", + "```\n", + "\n", + "This notebook provides a quick overview for getting started with `CSVLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `CSVLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html).\n", + "\n", + "This example goes over how to load data from CSV files. The second argument is the `column` name to extract from the CSV file. One document will be created for each row in the CSV file. When `column` is not specified, each row is converted into a key/value pair with each key/value pair outputted to a new line in the document's `pageContent`. When `column` is specified, one document is created for each row, and the value of the specified column is used as the document's `pageContent`.\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "| Class | Package | Compatibility | Local | [PY support](https://python.langchain.com/docs/integrations/document_loaders/csv)| \n", + "| :--- | :--- | :---: | :---: | :---: |\n", + "| [CSVLoader](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html) | [@langchain/community](https://api.js.langchain.com/modules/langchain_community_document_loaders_fs_csv.html) | Node-only | ✅ | ✅ |\n", + "\n", + "## Setup\n", + "\n", + "To access `CSVLoader` document loader you'll need to install the `@langchain/community` integration, along with the `d3-dsv@2` peer dependency.\n", + "\n", + "### Installation\n", + "\n", + "The LangChain CSVLoader integration lives in the `@langchain/community` integration package.\n", + "\n", + "```{=mdx}\n", + "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n", + "import Npm2Yarn from \"@theme/Npm2Yarn\";\n", + "\n", + "\n", + "\n", + "\n", + " @langchain/community d3-dsv@2\n", + "\n", + "\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and load documents:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "import { CSVLoader } from \"@langchain/community/document_loaders/fs/csv\"\n", + "\n", + "const exampleCsvPath = \"../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv\";\n", + "\n", + "const loader = new CSVLoader(exampleCsvPath)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Document {\n", + " pageContent: 'id|html: 1|\"Corruption discovered at the core of the Banking Clan!\"',\n", + " metadata: {\n", + " source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n", + " line: 1\n", + " },\n", + " id: undefined\n", + "}\n" + ] + } + ], + "source": [ + "const docs = await loader.load()\n", + "docs[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n", + " line: 1\n", + "}\n" + ] + } + ], + "source": [ + "console.log(docs[0].metadata)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Usage, extracting a single column\n", + "\n", + "Example CSV file:\n", + "\n", + "```csv\n", + "id|html\n", + "1|\"Corruption discovered at the core of the Banking Clan!\"\n", + "2|\"Reunited, Rush Clovis and Senator Amidala\"\n", + "3|\"discover the full extent of the deception.\"\n", + "4|\"Anakin Skywalker is sent to the rescue!\"\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Document {\n", + " pageContent: 'Corruption discovered at the core of the Banking Clan!',\n", + " metadata: {\n", + " source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n", + " line: 1\n", + " },\n", + " id: undefined\n", + "}\n" + ] + } + ], + "source": [ + "import { CSVLoader } from \"@langchain/community/document_loaders/fs/csv\";\n", + "\n", + "const singleColumnLoader = new CSVLoader(\n", + " exampleCsvPath,\n", + " {\n", + " column: \"html\",\n", + " separator:\"|\"\n", + " }\n", + ");\n", + "\n", + "const singleColumnDocs = await singleColumnLoader.load();\n", + "console.log(singleColumnDocs[0]);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all CSVLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "TypeScript", + "language": "typescript", + "name": "tslab" + }, + "language_info": { + "codemirror_mode": { + "mode": "typescript", + "name": "javascript", + "typescript": true + }, + "file_extension": ".ts", + "mimetype": "text/typescript", + "name": "typescript", + "version": "3.7.2" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.mdx b/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.mdx deleted file mode 100644 index e9adf18540a8..000000000000 --- a/docs/core_docs/docs/integrations/document_loaders/file_loaders/csv.mdx +++ /dev/null @@ -1,90 +0,0 @@ -# CSV files - -This example goes over how to load data from CSV files. The second argument is the `column` name to extract from the CSV file. One document will be created for each row in the CSV file. When `column` is not specified, each row is converted into a key/value pair with each key/value pair outputted to a new line in the document's `pageContent`. When `column` is specified, one document is created for each row, and the value of the specified column is used as the document's pageContent. - -## Setup - -```bash npm2yarn -npm install d3-dsv@2 -``` - -## Usage, extracting all columns - -Example CSV file: - -```csv -id,text -1,This is a sentence. -2,This is another sentence. -``` - -Example code: - -```typescript -import { CSVLoader } from "@langchain/community/document_loaders/fs/csv"; - -const loader = new CSVLoader("src/document_loaders/example_data/example.csv"); - -const docs = await loader.load(); -/* -[ - Document { - "metadata": { - "line": 1, - "source": "src/document_loaders/example_data/example.csv", - }, - "pageContent": "id: 1 -text: This is a sentence.", - }, - Document { - "metadata": { - "line": 2, - "source": "src/document_loaders/example_data/example.csv", - }, - "pageContent": "id: 2 -text: This is another sentence.", - }, -] -*/ -``` - -## Usage, extracting a single column - -Example CSV file: - -```csv -id,text -1,This is a sentence. -2,This is another sentence. -``` - -Example code: - -```typescript -import { CSVLoader } from "@langchain/community/document_loaders/fs/csv"; - -const loader = new CSVLoader( - "src/document_loaders/example_data/example.csv", - "text" -); - -const docs = await loader.load(); -/* -[ - Document { - "metadata": { - "line": 1, - "source": "src/document_loaders/example_data/example.csv", - }, - "pageContent": "This is a sentence.", - }, - Document { - "metadata": { - "line": 2, - "source": "src/document_loaders/example_data/example.csv", - }, - "pageContent": "This is another sentence.", - }, -] -*/ -```