S3 Directory Document Loading Component #2818

slaplante-raft · 2024-07-17T13:39:26Z

This PR introduces a new component that lets a user load multiple documents from an S3 bucket. It takes two optional parameters, Server URL and Prefix. The component mirrors the functionality of the filesystem directory document loading component.

When connecting to a MinIO bucket or a local S3 bucket, the Server URL must be provided.
To load only the entries under a specific directory, use the Prefix option. The hierarchy in S3 is flat, so if directoryB sits inside directoryA, you would specify directoryA/directoryB to load only the contents of directoryB. Loading is recursive by default. Another option can be added to limit that if needed.
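
To illustrate the prefix behaviour described above, here is a minimal sketch that is not taken from the component's code. The helper `filterKeysByPrefix` and the sample keys are hypothetical. It only models how a flat key space behaves when matched against a prefix, which is what makes nested "directories" load recursively.

```typescript
// Hypothetical helper: S3 keys are flat strings, so a "directory" is
// nothing more than a shared key prefix.
function filterKeysByPrefix(keys: string[], prefix: string = ""): string[] {
    // An empty prefix matches every key, i.e. the whole bucket.
    return keys.filter((key) => key.startsWith(prefix));
}

// Sample keys (assumed for illustration only).
const bucketKeys: string[] = [
    "report.pdf",
    "directoryA/a.pdf",
    "directoryA/directoryB/b1.pdf",
    "directoryA/directoryB/nested/b2.pdf",
];

// No prefix: every object in the bucket.
const allKeys = filterKeysByPrefix(bucketKeys);

// Prefix with no nested directory: everything under directoryA, recursively.
const underA = filterKeysByPrefix(bucketKeys, "directoryA/");

// Prefix containing another directory: only the contents of directoryB,
// including anything nested below it.
const underB = filterKeysByPrefix(bucketKeys, "directoryA/directoryB/");
```

Because matching is a plain string comparison, a prefix without a trailing slash such as `directoryA/directoryB` would also match a sibling key like `directoryA/directoryB2/x.pdf`. Ending the prefix with `/` avoids that.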

Tested with a MinIO bucket containing PDF files in different directories. Verified with no prefix (downloads the entire bucket), a prefix containing another directory, and a prefix with no directory.

In addition, this was tested with a global S3 bucket (that is, with no Server URL provided).
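
The two tested setups (a custom Server URL for MinIO, and no Server URL for a global bucket) could map onto client options roughly as follows. This is a hedged sketch, not the component's actual code. It assumes an AWS SDK for JavaScript v3 style configuration, where `endpoint` and `forcePathStyle` are real client config fields. The `S3ClientOptions` interface, the `buildS3ClientOptions` helper, the region, and the URL are hypothetical.

```typescript
// Plain-object shape mirroring a few fields of an S3 client config
// (assumption: the loader passes something like this to its S3 client).
interface S3ClientOptions {
    region: string;
    endpoint?: string;
    forcePathStyle?: boolean;
}

// Hypothetical helper: turn the optional Server URL into client options.
function buildS3ClientOptions(region: string, serverUrl?: string): S3ClientOptions {
    const trimmed = serverUrl?.trim();
    if (!trimmed) {
        // No Server URL: leave the endpoint unset so the default
        // endpoint for the region is used.
        return { region };
    }
    // Custom Server URL (for example MinIO): S3-compatible servers
    // commonly need path-style addressing.
    return { region, endpoint: trimmed, forcePathStyle: true };
}

// Global bucket: no Server URL given.
const globalOptions = buildS3ClientOptions("us-east-1");

// MinIO or local bucket: Server URL given (example value only).
const minioOptions = buildS3ClientOptions("us-east-1", "http://localhost:9000");
```

Keeping the endpoint unset in the default case means one code path serves both setups, and only the presence of the Server URL decides which server is contacted.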

HenryHengZJ · 2024-07-21T22:45:18Z

awesome, thank you so much!

add placeholder for prefix

Scott Laplante added 2 commits July 17, 2024 09:18

Add new S3Directory component

ea92f46

Add Additional Metadata and Omit Metadata Keys parameters

8dec82a

This was referenced Jul 17, 2024

[FEATURE] Support loading multiple documents from an S3 compatible object storage bucket #2778

Closed

[FEATURE] S3 directory with multiple file support #2806

Closed

HenryHengZJ approved these changes Jul 21, 2024

Update S3Directory.ts

28c37aa

add placeholder for prefix

HenryHengZJ merged commit 34d0e43 into FlowiseAI:main Jul 21, 2024
2 checks passed
