fix: resolve cdk bootstrap failure and add response streaming feature (toggled off). Closes #122, #134
Description
This PR includes @sperka's PR #130. The PR has been tested on different use cases and fixes a few issues.

Issues were found in the streaming feature, so the PR also adds a feature flag to toggle it off (it is disabled by default). To toggle it on, add the `?features=streaming` query string to the URL.
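As an illustration of the toggle, here is a minimal sketch of how a frontend could check the `features` query string; the helper name and the comma-separated flag format are assumptions, not the PR's actual code:

```typescript
// Sketch: check whether a feature flag such as "streaming" is enabled
// via the `features` query string (e.g. https://…/?features=streaming).
// Assumes a comma-separated list of flags; illustrative only.
function isFeatureEnabled(url: string, feature: string): boolean {
  const features = new URL(url).searchParams.get("features") ?? "";
  return features.split(",").includes(feature);
}
```

With this shape, `isFeatureEnabled("https://example.com/chat?features=streaming", "streaming")` is `true`, and streaming stays off whenever the query string is absent.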
Pending to-do items:
- Update `lambda-powertools` to v2

Here is a copy of the original PR description:
This PR introduces streaming support and also includes a bit of cleanup/refresh.
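For context on what "streaming" means here: the client should receive tokens incrementally as they are generated, not one final chunk. A framework-free sketch of that shape (names are illustrative, not the PR's code):

```typescript
// Sketch: an async generator yields tokens one at a time; in the real
// handler each token would be pushed to the REST/WS client as it arrives.
async function* streamTokens(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    yield token;
  }
}

// Consumer that accumulates the incremental tokens into the final answer.
async function collect(stream: AsyncGenerator<string>): Promise<string> {
  let answer = "";
  for await (const token of stream) {
    answer += token;
  }
  return answer;
}
```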
- Updated dependencies to `latest` (or `latest-that-doesn't-break-anything`), including `type-safe-api`
- Updated the `create-message` handler to support `REST` and `WS`
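One of the TODOs below is restricting the free-form `operation` value to an enum. A minimal TypeScript sketch of such a guard; only `create-message` comes from this PR, the other value is a hypothetical placeholder:

```typescript
// Sketch: replace an arbitrary string `operation` with a closed set of
// allowed values plus a runtime type guard.
// "some-other-operation" is a made-up example value.
const OPERATIONS = ["create-message", "some-other-operation"] as const;
type Operation = (typeof OPERATIONS)[number];

function isOperation(value: string): value is Operation {
  return (OPERATIONS as readonly string[]).includes(value);
}
```

A guard like this lets handlers narrow incoming payloads safely instead of trusting any string value.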
There are a few things that should be added or changed (refer to the `// TODO`s in the code):
- The `langchain@0.0.194` version is used; there have been huge improvements since, so updating to the latest version is inevitable. However, this involves quite a bit of effort.
- `lambda-powertools` v2 is out, but it has breaking changes.
- The `qaChain` definition and how streaming is handled need review, especially depending on the models used. With the current generic approach, streaming is not happening: only one chunk (the actual result) is returned instead of an incremental answer (i.e. tokens). This may be an issue caused by the langchain version, or something else. That part needs to be tested for both Bedrock LLMs and SageMaker-hosted LLMs.
- The `operation` value can be any string; it should be restricted to an enum, and maybe other fields could be limited as well. Needs thorough review.

Does this PR introduce a breaking change?
No breaking change. The streaming feature is behind a feature toggle.
Related Issues/Discussion
How Has This Been Tested?
Screenshots (if appropriate)
PR Checklist
IMPORTANT: Please review the CONTRIBUTING.md file for detailed contributing guidelines.