Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnstructuredLoader: add timeout option #1869

Closed
wants to merge 0 commits into from

Conversation

thundo
Copy link
Contributor

@thundo thundo commented Jul 5, 2023

Fixes #1856

Adds a per-request timeout option to the UnstructuredLoader, allowing a timeout different from 5 minutes to be passed to the constructor.

@vercel
Copy link

vercel bot commented Jul 5, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview Jul 7, 2023 2:34pm

@thundo thundo changed the title Unstructured timeout fix UnstructuredLoader: add timeout flag Jul 5, 2023
@thundo thundo changed the title UnstructuredLoader: add timeout flag UnstructuredLoader: add timeout option Jul 5, 2023
@thundo thundo force-pushed the unstructured-timeout-fix branch from 875715c to 96ea42d Compare July 6, 2023 13:37
Copy link
Collaborator

@jacoblee93 jacoblee93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this - would really like to avoid adding an additional dependency. Can we do this in base fetch?

@@ -791,6 +791,7 @@
"openapi-types": "^12.1.3",
"p-queue": "^6.6.2",
"p-retry": "4",
"undici": "^5.22.1",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this without the additional dependency?

@thundo
Copy link
Contributor Author

thundo commented Jul 7, 2023

Thanks for submitting this - would really like to avoid adding an additional dependency. Can we do this in base fetch?

Not that I'm aware of. Since Node18 node fetch is internally implemented with undici (mantained more or less by the same node core team) and for the dispatcher you need to instantiate at least an Agent, which is an exported undici class.

See this discussion for reference.

@jacoblee93
Copy link
Collaborator

Unfortunately in this case we need to support more than Nodr (Edge functions, CF Workers, Deno).

@thundo
Copy link
Contributor Author

thundo commented Jul 7, 2023

Unfortunately in this case we need to support more than Nodr (Edge functions, CF Workers, Deno).

Totally missed the portability concerns... I'll scout about it and report back.

@thundo thundo marked this pull request as draft July 7, 2023 14:20
@thundo thundo closed this Jul 7, 2023
@thundo thundo force-pushed the unstructured-timeout-fix branch from fd62bce to 40215d3 Compare July 7, 2023 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unstructured: headers timeout
2 participants