Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs[minor]: Update CSV doc #6345

Merged
merged 2 commits into from
Aug 2, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
docs[minor]: Update CSV doc
bracesproul committed Aug 2, 2024

Verified

This commit was signed with the committer’s verified signature. The key has expired.
rcorre Ryan Roden-Corrent
commit 3eee0319629b64ac5ef55226f3811d3c62b72d82
Original file line number Diff line number Diff line change
@@ -0,0 +1,226 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {
"vscode": {
"languageId": "raw"
}
},
"source": [
"---\n",
"sidebar_label: CSV\n",
"sidebar_class_name: node-only\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CSVLoader\n",
"\n",
"```{=mdx}\n",
"\n",
":::tip Compatibility\n",
"\n",
"Only available on Node.js.\n",
"\n",
":::\n",
"\n",
"```\n",
"\n",
"This notebook provides a quick overview for getting started with `CSVLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `CSVLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html).\n",
"\n",
"This example goes over how to load data from CSV files. The second argument is the `column` name to extract from the CSV file. One document will be created for each row in the CSV file. When `column` is not specified, each row is converted into a key/value pair with each key/value pair outputted to a new line in the document's `pageContent`. When `column` is specified, one document is created for each row, and the value of the specified column is used as the document's `pageContent`.\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Compatibility | Local | [PY support](https://python.langchain.com/docs/integrations/document_loaders/csv)| \n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [CSVLoader](https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html) | [@langchain/community](https://api.js.langchain.com/modules/langchain_community_document_loaders_fs_csv.html) | Node-only | ✅ | ✅ |\n",
"\n",
"## Setup\n",
"\n",
"To access `CSVLoader` document loader you'll need to install the `@langchain/community` integration, along with the `d3-dsv@2` peer dependency.\n",
"\n",
"### Installation\n",
"\n",
"The LangChain CSVLoader integration lives in the `@langchain/community` integration package.\n",
"\n",
"```{=mdx}\n",
"import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n",
"import Npm2Yarn from \"@theme/Npm2Yarn\";\n",
"\n",
"<IntegrationInstallTooltip></IntegrationInstallTooltip>\n",
"\n",
"<Npm2Yarn>\n",
" @langchain/community d3-dsv@2\n",
"</Npm2Yarn>\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and load documents:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import { CSVLoader } from \"@langchain/community/document_loaders/fs/csv\"\n",
"\n",
"const exampleCsvPath = \"../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv\";\n",
"\n",
"const loader = new CSVLoader(exampleCsvPath)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document {\n",
" pageContent: 'id|html: 1|\"<i>Corruption discovered at the core of the Banking Clan!</i>\"',\n",
" metadata: {\n",
" source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n",
" line: 1\n",
" },\n",
" id: undefined\n",
"}\n"
]
}
],
"source": [
"const docs = await loader.load()\n",
"docs[0]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n",
" line: 1\n",
"}\n"
]
}
],
"source": [
"console.log(docs[0].metadata)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage, extracting a single column\n",
"\n",
"Example CSV file:\n",
"\n",
"```csv\n",
"id|html\n",
"1|\"<i>Corruption discovered at the core of the Banking Clan!</i>\"\n",
"2|\"<i>Reunited, Rush Clovis and Senator Amidala</i>\"\n",
"3|\"<i>discover the full extent of the deception.</i>\"\n",
"4|\"<i>Anakin Skywalker is sent to the rescue!</i>\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document {\n",
" pageContent: '<i>Corruption discovered at the core of the Banking Clan!</i>',\n",
" metadata: {\n",
" source: '../../../../../../langchain/src/document_loaders/tests/example_data/example_separator.csv',\n",
" line: 1\n",
" },\n",
" id: undefined\n",
"}\n"
]
}
],
"source": [
"import { CSVLoader } from \"@langchain/community/document_loaders/fs/csv\";\n",
"\n",
"const singleColumnLoader = new CSVLoader(\n",
" exampleCsvPath,\n",
" {\n",
" column: \"html\",\n",
" separator:\"|\"\n",
" }\n",
");\n",
"\n",
"const singleColumnDocs = await singleColumnLoader.load();\n",
"console.log(singleColumnDocs[0]);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all CSVLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_community_document_loaders_fs_csv.CSVLoader.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "TypeScript",
"language": "typescript",
"name": "tslab"
},
"language_info": {
"codemirror_mode": {
"mode": "typescript",
"name": "javascript",
"typescript": true
},
"file_extension": ".ts",
"mimetype": "text/typescript",
"name": "typescript",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}