feat: use llamacloud pipeline in TS #236
🦋 Changeset detected
Latest commit: 228ae88
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Walkthrough
This update enhances the LlamaCloud pipeline for data ingestion in TypeScript applications, focusing on improved document handling, private file uploads, and modularity. Key changes include a refined file upload process, extended metadata management, and new dependencies that support these improvements.
Sequence Diagram(s)
sequenceDiagram
participant User
participant UploadService
participant LlamaCloudFileService
User->>UploadService: Upload file (filename, raw data)
UploadService->>LlamaCloudFileService: Upload file to LlamaCloud
LlamaCloudFileService-->>UploadService: Confirmation of upload
UploadService-->>User: Upload successful message
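The flow in the diagram can be sketched in TypeScript roughly as follows. This is a minimal sketch, not the PR's implementation: the `FileService` interface, the `uploadFile` signature, the `InMemoryFileService` stub, and `handleUpload` are all hypothetical stand-ins, and the real `LLamaCloudFileService` API may look different.

```typescript
// Hypothetical interface standing in for the LlamaCloud file service;
// the real LLamaCloudFileService API may differ.
interface FileService {
  uploadFile(filename: string, data: Uint8Array): Promise<string>;
}

// In-memory stub so the sketch runs without network access.
class InMemoryFileService implements FileService {
  readonly files = new Map<string, { filename: string; data: Uint8Array }>();

  async uploadFile(filename: string, data: Uint8Array): Promise<string> {
    const fileId = `file-${this.files.size + 1}`;
    this.files.set(fileId, { filename, data });
    return fileId;
  }
}

// UploadService step: forward the file, then build the message for the user.
async function handleUpload(
  service: FileService,
  filename: string,
  data: Uint8Array,
): Promise<string> {
  const fileId = await service.uploadFile(filename, data);
  return `Upload successful: ${filename} (${fileId})`;
}
```

Keeping the service behind an interface lets the upload handler be exercised with a stub, which is one way to test the handler without LlamaCloud credentials.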
Actionable comments posted: 3
Outside diff range, codebase verification and nitpick comments (6)
templates/components/llamaindex/typescript/streaming/helpers.ts (1)
14-14: Consider logging a message if the file already exists.
Currently, the function silently returns if the file exists. Adding a log message can help in debugging and understanding the flow.
- if (fs.existsSync(downloadedPath)) return;
+ if (fs.existsSync(downloadedPath)) {
+   console.log(`File already exists at ${downloadedPath}`);
+   return;
+ }
templates/types/streaming/nextjs/app/api/chat/upload/route.ts (1)
13-17: Enhance input validation error message.
The error message should clearly indicate that both `base64` and `filename` are required. Consider rephrasing for clarity.
- { error: "base64 and filename is required in the request body" },
+ { error: "Both 'base64' and 'filename' are required in the request body." },
templates/components/llamaindex/typescript/documents/helper.ts (2)
14-25: Consider parameterizing the `private` flag.
The `private` flag is hardcoded as `"true"`, which might limit flexibility. Consider passing it as a parameter if there are scenarios where this value could change.
- private: "true", // to separate private uploads from public documents
+ private: isPrivate, // to separate private uploads from public documents
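One way to read this suggestion is a helper that takes the flag as an argument with a default. The sketch below is illustrative only: `UploadMetadata`, `buildUploadMetadata`, and the `file_name` field are hypothetical names, and it assumes the metadata value stays a string, as in the hardcoded `"true"`.

```typescript
// Hypothetical metadata shape; the real helper may attach other fields.
interface UploadMetadata {
  file_name: string;
  private: string; // kept as a string, matching the hardcoded "true"
}

// isPrivate defaults to true so existing callers keep the current behavior.
function buildUploadMetadata(
  filename: string,
  isPrivate: boolean = true,
): UploadMetadata {
  return {
    file_name: filename,
    private: String(isPrivate), // separates private uploads from public documents
  };
}
```

Defaulting to `true` keeps uploads private unless a caller opts out explicitly.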
Line range hint 38-55: Enhance error handling for file system operations.
Consider adding try-catch blocks around file system operations to handle potential errors, such as permission issues or disk space limitations.
try {
  if (!fs.existsSync(UPLOADED_FOLDER)) {
    fs.mkdirSync(UPLOADED_FOLDER, { recursive: true });
  }
  await fs.promises.writeFile(filepath, fileBuffer);
} catch (error) {
  console.error(`Failed to save document: ${error.message}`);
  throw new Error("Failed to save document");
}
templates/types/streaming/nextjs/package.json (1)
15-15: Unused Dependency: `@llamaindex/cloud`
The `@llamaindex/cloud` package is listed in the `package.json` files but is not used in the codebase. Consider removing it unless it is needed for future development or indirectly used by other tools.
templates/types/streaming/nextjs/package.json
templates/types/streaming/express/package.json
Analysis chain
Verify the necessity and usage of `@llamaindex/cloud`.
Ensure that the addition of `@llamaindex/cloud` is necessary and that its features are utilized in the project. Verify compatibility with existing dependencies.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the usage of `@llamaindex/cloud` in the codebase.
# Test: Search for the import and usage of `@llamaindex/cloud`. Expect: At least one occurrence.
rg --type ts -A 5 $'@llamaindex/cloud'
Length of output: 37
Script:
#!/bin/bash
# Description: Verify the usage of `@llamaindex/cloud` in the entire codebase.
# Test: Search for the import and usage of `@llamaindex/cloud` in all file types.
rg '@llamaindex/cloud'
# Test: Search for dynamic imports or mentions in configuration files.
rg 'import\(.+@llamaindex/cloud.+\)' --type js
rg '@llamaindex/cloud' --type json
Length of output: 432
templates/components/llamaindex/typescript/streaming/events.ts (1)
120-137: Add logging for better traceability.
Consider adding logging to capture the constructed URLs and any decisions made based on metadata values. This can aid in debugging and understanding the flow.
console.log(`Constructed URL: ${url}`);
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (16)
- .changeset/violet-islands-collect.md (1 hunks)
- helpers/env-variables.ts (1 hunks)
- templates/components/llamaindex/typescript/documents/helper.ts (2 hunks)
- templates/components/llamaindex/typescript/documents/pipeline.ts (1 hunks)
- templates/components/llamaindex/typescript/documents/upload.ts (1 hunks)
- templates/components/llamaindex/typescript/streaming/events.ts (3 hunks)
- templates/components/llamaindex/typescript/streaming/helpers.ts (1 hunks)
- templates/components/vectordbs/typescript/llamacloud/generate.ts (2 hunks)
- templates/components/vectordbs/typescript/llamacloud/index.ts (1 hunks)
- templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1 hunks)
- templates/types/streaming/express/package.json (1 hunks)
- templates/types/streaming/express/src/controllers/chat-config.controller.ts (1 hunks)
- templates/types/streaming/express/src/controllers/chat-upload.controller.ts (1 hunks)
- templates/types/streaming/nextjs/app/api/chat/config/llamacloud/route.ts (1 hunks)
- templates/types/streaming/nextjs/app/api/chat/upload/route.ts (2 hunks)
- templates/types/streaming/nextjs/package.json (1 hunks)
Files skipped from review due to trivial changes (1)
- helpers/env-variables.ts
Additional comments not posted (16)
.changeset/violet-islands-collect.md (1)
1-5: Changeset looks good.
The changeset is properly formatted and aligns with the PR objectives.
templates/types/streaming/express/src/controllers/chat-upload.controller.ts (2)
6-13: Enhanced validation looks good.
The changes improve input validation by ensuring both `base64` and `filename` are required. The error message is clear and informative.
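A validation step of this kind might look like the sketch below. It is not the controller's actual code: `UploadBody`, `ValidationResult`, and `validateUploadBody` are hypothetical names, and the error string is taken from the wording suggested earlier in this review.

```typescript
// Hypothetical request-body shape for the upload endpoint.
interface UploadBody {
  base64?: unknown;
  filename?: unknown;
}

type ValidationResult =
  | { ok: true; base64: string; filename: string }
  | { ok: false; error: string };

// Accepts the body only when both fields are non-empty strings.
function validateUploadBody(body: UploadBody): ValidationResult {
  const { base64, filename } = body;
  if (
    typeof base64 !== "string" ||
    base64.length === 0 ||
    typeof filename !== "string" ||
    filename.length === 0
  ) {
    return {
      ok: false,
      error: "Both 'base64' and 'filename' are required in the request body.",
    };
  }
  return { ok: true, base64, filename };
}
```

Returning a tagged result instead of throwing lets the Express controller and the Next.js route map the failure to their own response objects.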
17-17: Verify the function usage with the updated signature.
The call to `uploadDocument` now includes `filename`. Ensure that all calls to this function across the codebase are updated accordingly.
Verification successful
All calls to `uploadDocument` include the `filename` parameter. The function usage has been correctly updated across the codebase to match the new signature.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify all calls to `uploadDocument` include the `filename` parameter.
# Test: Search for the function usage. Expect: Only occurrences with the new signature.
rg --type typescript -A 5 $'uploadDocument'
Length of output: 77
Script:
#!/bin/bash
# Description: Verify all calls to `uploadDocument` include the `filename` parameter.
# Find all TypeScript files and search for the function usage.
fd --extension ts --exec rg 'uploadDocument' {}
Length of output: 371
templates/types/streaming/nextjs/app/api/chat/config/llamacloud/route.ts (2)
8-14: Validation for API key is effective.
The check for `LLAMA_CLOUD_API_KEY` ensures that the API does not proceed without necessary credentials. The error message is clear and appropriate.
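A guard of the kind described here could be sketched as below. The `ErrorResponse` shape, the 500 status code, the message text, and the function name `checkLlamaCloudApiKey` are assumptions made for illustration; only the variable name `LLAMA_CLOUD_API_KEY` comes from the review.

```typescript
// Hypothetical response shape; status code and message are assumptions.
interface ErrorResponse {
  status: number;
  body: { error: string };
}

// Returns an error response when the key is missing, otherwise null.
function checkLlamaCloudApiKey(
  env: Record<string, string | undefined>,
): ErrorResponse | null {
  if (!env["LLAMA_CLOUD_API_KEY"]) {
    return {
      status: 500,
      body: { error: "LLAMA_CLOUD_API_KEY is required to use LlamaCloud" },
    };
  }
  return null;
}
```

A route handler would call this with `process.env` before touching the file service and return early when the result is not `null`.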
16-16: Instantiation of `LLamaCloudFileService` is clear.
Using a local variable for `LLamaCloudFileService` enhances code clarity and maintainability.
templates/components/llamaindex/typescript/documents/pipeline.ts (2)
13-26: Unified processing path: Ensure compatibility.
The removal of conditional logic for `LlamaCloudIndex` enforces a single processing path using the ingestion pipeline. Verify that this change aligns with the intended functionality and does not introduce regressions.
10-10: Signature change: Verify impact on codebase.
The function `runPipeline` now only accepts `VectorStoreIndex`. Ensure that all invocations of this function in the codebase are updated to reflect this change.
templates/components/vectordbs/typescript/llamacloud/index.ts (1)
21-21: Ensure environment variable is set.
The `organizationId` is sourced from `process.env.LLAMA_CLOUD_ORGANIZATION_ID`. Verify that this environment variable is correctly set in all deployment environments.
templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1)
15-15: Key change: Ensure consistent usage.
The key in `privateDocumentsFilter` has changed from `"doc_id"` to `"file_id"`. Verify that all related logic and documentation reflect this change.
Verification successful
Ensure Consistent Usage of `file_id` Across Codebase
The `file_id` key is used consistently in both TypeScript and Python files related to LlamaCloud. Here are the locations where `file_id` is referenced:
TypeScript:
- templates/components/vectordbs/typescript/llamacloud/queryFilter.ts
Python:
- templates/components/vectordbs/python/llamacloud/service.py
- templates/components/vectordbs/python/llamacloud/query_filter.py
Ensure that any related logic or documentation is updated to reflect the change from `doc_id` to `file_id`.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify consistent usage of `file_id` across the codebase.
# Test: Search for any references to `file_id` to ensure consistent usage.
rg --type typescript $'file_id'
Length of output: 65
Script:
#!/bin/bash
# Description: Verify consistent usage of `file_id` across the codebase.
# Test: Search for any references to `file_id` in TypeScript files to ensure consistent usage.
rg 'file_id' --glob '*.ts'
Length of output: 231
Script:
#!/bin/bash
# Description: Verify consistent usage of `file_id` across the entire codebase.
# Test: Search for any references to `file_id` in all files to ensure consistent usage.
rg 'file_id'
Length of output: 886
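To make the `doc_id` to `file_id` change concrete, a query filter keyed on `file_id` might be built as in the sketch below. Only the `privateDocumentsFilter` name and the `"file_id"` key come from this review. The `MetadataFilter` and `MetadataFilters` types, the `publicDocumentsFilter`, the operators, and the `"or"` combination are assumptions for illustration and may not match `queryFilter.ts`.

```typescript
// Hypothetical filter types; the types exported by the library may differ.
interface MetadataFilter {
  key: string;
  value: string[];
  operator: "in" | "nin";
}

interface MetadataFilters {
  filters: MetadataFilter[];
  condition?: "and" | "or";
}

function generateFilters(documentIds: string[]): MetadataFilters {
  // Assumed: public documents are those not marked private: "true".
  const publicDocumentsFilter: MetadataFilter = {
    key: "private",
    value: ["true"],
    operator: "nin",
  };
  if (documentIds.length === 0) {
    return { filters: [publicDocumentsFilter] };
  }
  // Private uploads are matched by "file_id" (previously "doc_id").
  const privateDocumentsFilter: MetadataFilter = {
    key: "file_id",
    value: documentIds,
    operator: "in",
  };
  return {
    filters: [publicDocumentsFilter, privateDocumentsFilter],
    condition: "or",
  };
}
```

Because the filter key must match the metadata key written at upload time, the TypeScript and Python templates need to agree on `file_id`, which is what the scripts above check.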
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)
18-28: Consider performance implications of processing documents individually.
The shift from batch processing to individual document handling might impact performance, especially with large datasets. Consider evaluating the performance and optimizing if necessary.
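If per-document uploads turn out to be slow, one possible mitigation is to run a bounded number of them in parallel. The helper below is a generic sketch, not code from this PR: `mapWithConcurrency` and its parameters are hypothetical, and whether LlamaCloud tolerates concurrent uploads is not confirmed here.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none are left.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

A rejection from any `worker` call rejects the whole operation, so callers that want partial progress would need to catch errors inside `worker`.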
templates/types/streaming/nextjs/app/api/chat/upload/route.ts (1)
31-31: Verify `uploadDocument` function usage.
Ensure that the `uploadDocument` function correctly handles the new `filename` parameter.
Verification successful
`uploadDocument` Function Correctly Handles `filename` Parameter
The `uploadDocument` function is correctly defined to accept and utilize the `filename` parameter. Its usage in the codebase reflects this change appropriately, with no issues found in its handling.
- Function Definition: Located in `templates/components/llamaindex/typescript/documents/upload.ts`.
- Usages:
  - templates/types/streaming/express/src/controllers/chat-upload.controller.ts
  - templates/types/streaming/nextjs/app/api/chat/upload/route.ts
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the `uploadDocument` function usage with the new `filename` parameter.
# Test: Search for the `uploadDocument` function definition and its usage.
rg --type typescript -A 5 $'uploadDocument'
Length of output: 77
Script:
#!/bin/bash
# Description: Verify the `uploadDocument` function usage with the new `filename` parameter.
# Test: Search for the `uploadDocument` function definition and its usage across all files.
rg -A 5 'uploadDocument'
Length of output: 2591
templates/components/llamaindex/typescript/documents/upload.ts (2)
8-8: Verify the integration of the `filename` parameter.
Ensure that the `filename` parameter is correctly handled throughout the function.
Verification successful
Integration of `filename` parameter is correctly implemented.
The `filename` parameter is properly used within the `uploadDocument` function, particularly in creating a `File` object. This integration aligns with its intended purpose. No issues found.
- `uploadDocument` function in `templates/components/llamaindex/typescript/documents/upload.ts`
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the integration of the `filename` parameter in the `uploadDocument` function.
# Test: Search for the `uploadDocument` function definition and its usage.
ast-grep --lang typescript --pattern $'function uploadDocument($_, filename: $_, $_) { $$$ }'
Length of output: 94
Script:
#!/bin/bash
# Description: Search for the `uploadDocument` function and check for `filename` parameter usage.
# Search for the `uploadDocument` function definition.
rg 'function uploadDocument' -A 5
# Search for the usage of `filename` parameter in the codebase.
rg 'filename' --type=typescript
Length of output: 665
Script:
#!/bin/bash
# Description: Correct the search for the usage of `filename` parameter in TypeScript files.
# Search for the usage of `filename` parameter in TypeScript files.
rg 'filename' --type=ts
Length of output: 4852
15-27: Review the use of `LLamaCloudFileService`.
Ensure that the `LLamaCloudFileService` is correctly used to handle file uploads for `LlamaCloudIndex`.
templates/types/streaming/express/package.json (1)
27-27: Verify the addition of `@llamaindex/cloud` dependency.
Ensure that the addition of `@llamaindex/cloud` is necessary and aligns with the project's requirements.
templates/components/llamaindex/typescript/documents/helper.ts (1)
Line range hint 27-36: LGTM!
The function correctly handles document loading based on MIME type and includes appropriate error handling.
templates/components/llamaindex/typescript/streaming/events.ts (1)
templates/components/llamaindex/typescript/streaming/events.ts (1)
13-14
: LGTM!The use of an instance of
LLamaCloudFileService
enhances modularity and encapsulation.
Actionable comments posted: 1
Review details
Files selected for processing (1)
- templates/components/vectordbs/typescript/llamacloud/generate.ts (2 hunks)
Additional comments not posted (3)
templates/components/vectordbs/typescript/llamacloud/generate.ts (3)
2-2: Import change approved.
The import of `LLamaCloudFileService` is consistent with the new functionality introduced in the code.
Line range hint 28-30: Verify environment setup functions.
Ensure that `checkRequiredEnvVars` and `initSettings` correctly set up the environment for the script execution.
Verification successful
Environment Setup Functions Verified
Both `checkRequiredEnvVars` and `initSettings` are implemented and correctly integrated into the environment setup process. The `checkRequiredEnvVars` function ensures necessary environment variables are present, while `initSettings` initializes settings related to model providers. No issues were found with these functions.
- `checkRequiredEnvVars` is implemented in `shared.ts` files across different components.
- `initSettings` is implemented in `settings.ts` files for both Express and Next.js.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of environment setup functions.
# Test: Search for the implementation of `checkRequiredEnvVars` and `initSettings`. Expect: Proper environment setup.
ast-grep --lang typescript --pattern $'function checkRequiredEnvVars() { $$$ }'
ast-grep --lang typescript --pattern $'function initSettings() { $$$ }'
Length of output: 11488
Script:
#!/bin/bash
# Description: Search for the implementation of `initSettings` function.
# Test: Locate the `initSettings` function definition.
ast-grep --lang typescript --pattern $'function initSettings() { $$$ }'
Length of output: 74
Script:
#!/bin/bash
# Description: Search for any settings initialization functions or methods.
# Test: Locate any function or method related to settings initialization.
rg --type=typescript --ignore-case 'init.*settings' -A 3
Length of output: 93
Script:
#!/bin/bash
# Description: Correct the search for any settings initialization functions or methods.
# Test: Locate any function or method related to settings initialization with correct file type.
rg --type ts --ignore-case 'init.*settings' -A 3
Length of output: 10060
12-14: Verify the retrieval of `projectId` and `pipelineId`.
Ensure that the methods `getProjectId` and `getPipelineId` are correctly implemented and return valid identifiers.
Note: upgrade LITS to use LlamaCloudFileService.
Actionable comments posted: 0
Review details
Files selected for processing (1)
- templates/types/streaming/express/src/controllers/chat-config.controller.ts (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- templates/types/streaming/express/src/controllers/chat-config.controller.ts
Actionable comments posted: 3
Outside diff range, codebase verification and nitpick comments (1)
templates/components/llamaindex/typescript/streaming/file.ts (1)
14-14: Consider logging a message when the file already exists.
Adding a log message when the file already exists can provide better insight into the function's behavior during execution.
- if (fs.existsSync(downloadedPath)) return;
+ if (fs.existsSync(downloadedPath)) {
+   console.log(`File already exists: ${downloadedPath}`);
+   return;
+ }
Review details
Files selected for processing (7)
- helpers/env-variables.ts (1 hunks)
- templates/components/llamaindex/typescript/documents/upload.ts (1 hunks)
- templates/components/llamaindex/typescript/streaming/events.ts (5 hunks)
- templates/components/llamaindex/typescript/streaming/file.ts (1 hunks)
- templates/components/vectordbs/typescript/llamacloud/generate.ts (2 hunks)
- templates/types/streaming/express/src/controllers/chat-config.controller.ts (2 hunks)
- templates/types/streaming/nextjs/app/api/chat/config/llamacloud/route.ts (1 hunks)
Files skipped from review due to trivial changes (1)
- helpers/env-variables.ts
Files skipped from review as they are similar to previous changes (2)
- templates/types/streaming/express/src/controllers/chat-config.controller.ts
- templates/types/streaming/nextjs/app/api/chat/config/llamacloud/route.ts
Additional comments not posted (2)
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)
17-23: Consider error handling for file operations and service calls.
The current implementation lacks error handling for file reading and service interactions, which could lead to unhandled exceptions.
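A minimal sketch of what this could look like, assuming the script reads each file and hands it to some upload call: `UploadFn`, `UploadReport`, and `uploadFiles` are hypothetical names, and the callback stands in for whatever service method `generate.ts` actually uses.

```typescript
import * as fs from "fs";

// Hypothetical upload callback; stands in for the real service call.
type UploadFn = (filePath: string, content: Uint8Array) => Promise<void>;

interface UploadReport {
  uploaded: string[];
  failed: { path: string; reason: string }[];
}

// Reads and uploads each file, recording failures instead of aborting.
async function uploadFiles(
  paths: string[],
  upload: UploadFn,
): Promise<UploadReport> {
  const report: UploadReport = { uploaded: [], failed: [] };
  for (const filePath of paths) {
    try {
      const content = await fs.promises.readFile(filePath);
      await upload(filePath, content);
      report.uploaded.push(filePath);
    } catch (error) {
      // Covers both read errors and errors thrown by the upload call.
      const reason = error instanceof Error ? error.message : String(error);
      report.failed.push({ path: filePath, reason });
    }
  }
  return report;
}
```

Collecting failures in a report is a design choice; a script that should stop at the first error could rethrow inside the `catch` instead.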
templates/components/llamaindex/typescript/streaming/events.ts (1)
144-159: Add error handling for file download operations.
Consider wrapping the download operation in a try-catch block to handle potential errors, such as network issues or invalid URLs.
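The suggestion can be illustrated with the sketch below, which takes the HTTP call as a parameter so it can be exercised without a network. `Fetcher`, `FetchResult`, and `downloadSafely` are hypothetical names, and returning `null` on failure is an assumption about how the caller wants to proceed.

```typescript
// Minimal subset of a fetch-style response that the sketch relies on.
interface FetchResult {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}

type Fetcher = (url: string) => Promise<FetchResult>;

// Downloads `url`, returning null (and logging) on any failure.
async function downloadSafely(
  url: string,
  fetcher: Fetcher,
): Promise<Uint8Array | null> {
  try {
    const response = await fetcher(url);
    if (!response.ok) {
      console.error(`Download failed for ${url}: HTTP ${response.status}`);
      return null;
    }
    return new Uint8Array(await response.arrayBuffer());
  } catch (error) {
    // Network errors and invalid URLs surface here as thrown exceptions.
    const reason = error instanceof Error ? error.message : String(error);
    console.error(`Download failed for ${url}: ${reason}`);
    return null;
  }
}
```

On runtimes that provide a global `fetch`, it should be possible to pass it as the `fetcher` argument, since its response exposes `ok`, `status`, and `arrayBuffer()`.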
Actionable comments posted: 1
Outside diff range, codebase verification and nitpick comments (1)
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)
3-3: Ensure Complete Removal of `LlamaCloudIndex` References
The `LLamaCloudFileService` is correctly implemented across multiple files, including the one under review. However, references to `LlamaCloudIndex` still exist in several files. Please ensure that all instances of `LlamaCloudIndex` are reviewed and updated if necessary.
templates/components/vectordbs/typescript/llamacloud/index.ts
templates/components/vectordbs/python/llamacloud/index.py
templates/types/streaming/fastapi/app/api/services/file.py
templates/components/llamaindex/typescript/documents/upload.ts
Analysis chain
Verify the correct usage of `LLamaCloudFileService`.
Ensure that the `LLamaCloudFileService` is correctly implemented and that all previous references to `LlamaCloudIndex` have been updated accordingly.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that `LLamaCloudFileService` is correctly used and `LlamaCloudIndex` is no longer referenced.
# Test: Search for `LLamaCloudFileService` usage. Expect: Correct usage in the file.
rg --type typescript 'LLamaCloudFileService' -A 5
# Test: Ensure `LlamaCloudIndex` is not referenced anymore. Expect: No matches.
rg --type typescript 'LlamaCloudIndex'
Length of output: 158
Script:
#!/bin/bash
# Description: Verify that `LLamaCloudFileService` is correctly used and `LlamaCloudIndex` is no longer referenced.
# Test: Search for `LLamaCloudFileService` usage. Expect: Correct usage in the file.
rg 'LLamaCloudFileService' -A 5
# Test: Ensure `LlamaCloudIndex` is not referenced anymore. Expect: No matches.
rg 'LlamaCloudIndex'
Length of output: 12974
Review details
Files selected for processing (2)
- templates/components/llamaindex/typescript/streaming/events.ts (5 hunks)
- templates/components/vectordbs/typescript/llamacloud/generate.ts (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- templates/components/llamaindex/typescript/streaming/events.ts
Additional comments not posted (1)
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)
12-24: LGTM: Efficient directory traversal with `walk`.
The `walk` function efficiently traverses directories and yields file paths. The use of async generator is appropriate here.
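For readers without the diff at hand, an async-generator directory walk can be written along these lines. This is a sketch of the general technique using Node's `fs.promises.opendir`, not necessarily the exact `walk` in `generate.ts`.

```typescript
import * as fs from "fs";
import * as path from "path";

// Recursively yields the path of every regular file under `dir`.
async function* walk(dir: string): AsyncGenerator<string> {
  const directory = await fs.promises.opendir(dir);
  for await (const dirent of directory) {
    const entryPath = path.join(dir, dirent.name);
    if (dirent.isDirectory()) {
      yield* walk(entryPath); // descend into subdirectories
    } else if (dirent.isFile()) {
      yield entryPath;
    }
  }
}
```

Because entries are yielded one at a time, a caller can start uploading with `for await (const filePath of walk(root))` before the whole tree has been listed.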
Actionable comments posted: 0
Review details
Files selected for processing (2)
- templates/types/streaming/express/package.json (1 hunks)
- templates/types/streaming/nextjs/package.json (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- templates/types/streaming/express/package.json
- templates/types/streaming/nextjs/package.json