generated from amazon-archives/__template_MIT-0
New serverless pattern - eventbridge-scheduled-stepfunction-bedrock-kb-sync #2825
Merged: julianwood merged 12 commits into aws-samples:main from shansv:shansv-feature-eventbridge-scheduled-stepfunction-bedrock-kb-sync on Nov 10, 2025
Commits:
- dfc393c eventbridge-scheduled-stepfunction-bedrock-kb-sync pattern (shansv)
- 8a3b59c eventbridge-scheduled-stepfunction-bedrock-kb-sync pattern (shansv)
- ecd1e2c removing test files (shansv)
- 4bf852b removing test files (shansv)
- 995ed9c Update app.py (shansv)
- d59789c cdk build fixes (shansv)
- 5e04f36 cdk build fixes (shansv)
- 6900a9a diabled s3 access log (shansv)
- ea1075b addressed all review comments (shansv)
- f16b557 addressed all review comments (shansv)
- 1065ea5 publishing file (ellisms)
- ac401e1 readme file update (shansv)
148 changes: 148 additions & 0 deletions
eventbridge-scheduled-stepfunction-bedrock-kb-sync/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,148 @@ | ||
| # Bedrock Knowledge Base Synchronization Flow with EventBridge Scheduler | ||
|
|
||
| This pattern demonstrates an automated synchronization process for Amazon Bedrock Knowledge Bases using EventBridge Scheduler and Step Functions. The solution enables periodic synchronization of data sources, ensuring your Knowledge Base stays up-to-date with the latest content. | ||
|
|
||
| Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/eventbridge-scheduled-stepfunction-bedrock-kb-sync | ||
|
|
||
|
|
||
| Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example. | ||
|
|
||
| ## Architecture | ||
|  | ||
|
|
||
| ## Requirements | ||
|
|
||
| * [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources. | ||
| * [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured | ||
| * [Git Installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) | ||
| * [AWS Cloud Development Kit](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html) (AWS CDK) installed | ||
|
|
||
| ## Deployment Instructions | ||
|
|
||
| 1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository: | ||
| ``` | ||
| git clone https://github.com/aws-samples/serverless-patterns | ||
| ``` | ||
| 2. Change directory to the pattern directory: | ||
| ``` | ||
| cd serverless-patterns/eventbridge-scheduled-stepfunction-bedrock-kb-sync/cdk | ||
| ``` | ||
| 3. Set up the local developer environment and dependencies: | ||
| ``` | ||
| make bootstrap-venv | ||
| source .venv/bin/activate | ||
| ``` | ||
| 4. From the command line, configure AWS CDK: | ||
| ```bash | ||
| cdk bootstrap | ||
| ``` | ||
| 5. From the command line, use AWS CDK to deploy the AWS resources for the pattern: | ||
| ```bash | ||
| cdk deploy --all | ||
| ``` | ||
| 6. This command takes some time to run. After it completes successfully, the following stacks are deployed: | ||
| ``` | ||
| KbRoleStack | ||
| CommonLambdaLayerStack | ||
| OSSStack | ||
| KbSyncPipelineStack | ||
| KbInfraStack | ||
| ``` | ||
|
|
||
| ## How it works | ||
|
|
||
| The following is a detailed summary of how this pattern automates Knowledge Base synchronization: | ||
|
|
||
| Pattern Overview: This is a scheduled, serverless workflow that automates the synchronization of Bedrock Knowledge Bases using Amazon EventBridge Scheduler, AWS Step Functions, and Amazon Bedrock. | ||
|
|
||
| Key Components: | ||
| a) EventBridge Scheduler | ||
| - Runs every 15 minutes | ||
| - Triggers the Step Functions workflow | ||
| - Passes the Bedrock Knowledge Base ID as an input parameter | ||
| - Enables consistent and automated synchronization | ||
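The scheduler settings described in this list can be sketched in Python as the keyword arguments for the EventBridge Scheduler `CreateSchedule` API. This is a minimal sketch, not the pattern's CDK code: the schedule name, the ARNs, and the `knowledgeBaseId` input key are assumptions chosen for illustration.

```python
import json


def build_schedule_request(knowledge_base_id, state_machine_arn, role_arn,
                           name="kb-sync-every-15-minutes"):
    """Build keyword arguments for the EventBridge Scheduler CreateSchedule API.

    The schedule name and the shape of the input payload are hypothetical;
    the deployed pattern may use different names.
    """
    return {
        "Name": name,
        # Rate expression matching the 15-minute interval described above.
        "ScheduleExpression": "rate(15 minutes)",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            # For a Step Functions target, the ARN is the state machine ARN.
            "Arn": state_machine_arn,
            # Role that EventBridge Scheduler assumes to start the execution.
            "RoleArn": role_arn,
            # Execution input carrying the Knowledge Base ID (assumed key name).
            "Input": json.dumps({"knowledgeBaseId": knowledge_base_id}),
        },
    }


request = build_schedule_request(
    "KB1234567890",
    "arn:aws:states:us-east-1:111122223333:stateMachine:kb-sync",
    "arn:aws:iam::111122223333:role/kb-sync-scheduler",
)
print(json.dumps(request, indent=2))
```

With AWS credentials configured, the same dictionary could be passed as `boto3.client("scheduler").create_schedule(**request)`.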
|
|
||
| b) Step Functions Workflow | ||
| - Main Flow: | ||
| - Receives Knowledge Base ID from EventBridge | ||
| - Orchestrates the entire synchronization process | ||
| - Handles error scenarios and retries | ||
| - Manages parallel processing of multiple data sources | ||
|
|
||
| Step 1: Data Source Retrieval | ||
| - Queries all associated data sources for the given Knowledge Base ID | ||
| - Prepares the list for processing | ||
| - Validates data source configurations | ||
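Step 1 can be sketched with the `list_data_sources` operation of the boto3 `bedrock-agent` client. This is a minimal sketch that assumes the workflow only needs the data source IDs; the function name is hypothetical and the pattern's own implementation may differ.

```python
def list_data_source_ids(bedrock_agent, knowledge_base_id):
    """Return every data source ID attached to a Knowledge Base.

    `bedrock_agent` is expected to behave like boto3.client("bedrock-agent").
    Results are paginated, so the loop follows `nextToken` until it is absent.
    """
    data_source_ids = []
    kwargs = {"knowledgeBaseId": knowledge_base_id}
    while True:
        page = bedrock_agent.list_data_sources(**kwargs)
        data_source_ids.extend(
            summary["dataSourceId"]
            for summary in page.get("dataSourceSummaries", [])
        )
        token = page.get("nextToken")
        if not token:
            return data_source_ids
        kwargs["nextToken"] = token
```

A call such as `list_data_source_ids(boto3.client("bedrock-agent"), "KB1234567890")` would return the list that the Map state in Step 2 iterates over.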
|
|
||
| Step 2: Map State for Parallel Processing | ||
| - Iterates through each data source | ||
| - Processes multiple data sources concurrently | ||
| - Manages state for each sync operation | ||
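The Map state in Step 2 could look roughly like the following Amazon States Language fragment, written here as a Python dictionary. This is a sketch under assumptions: the `dataSourceIds` and `syncResults` paths, the state name `SyncDataSource`, and the concurrency limit are illustrative, and the inner `Pass` state is only a placeholder for the real sync task.

```python
import json


def build_map_state(max_concurrency=5):
    """Return an ASL Map state that fans out over a list of data source IDs."""
    return {
        "Type": "Map",
        # Assumed location of the list produced by the retrieval step.
        "ItemsPath": "$.dataSourceIds",
        # Upper bound on how many data sources are processed at once.
        "MaxConcurrency": max_concurrency,
        # Each iteration receives the Knowledge Base ID plus one data source ID.
        "ItemSelector": {
            "knowledgeBaseId.$": "$.knowledgeBaseId",
            "dataSourceId.$": "$$.Map.Item.Value",
        },
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "INLINE"},
            "StartAt": "SyncDataSource",
            "States": {
                # Placeholder state; the real workflow starts and monitors a sync here.
                "SyncDataSource": {"Type": "Pass", "End": True},
            },
        },
        "ResultPath": "$.syncResults",
        "End": True,
    }


print(json.dumps(build_map_state(), indent=2))
```

Capping `MaxConcurrency` is a design choice that keeps a Knowledge Base with many data sources from starting all of its sync operations at the same moment.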
|
|
||
| Step 3: Synchronization Process (For each data source) | ||
| - Initiates the sync operation | ||
| - Monitors sync status | ||
| - Handles completion and failures | ||
| - Reports sync results | ||
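The per-data-source sync in Step 3 can be sketched as a start-then-poll loop over the `start_ingestion_job` and `get_ingestion_job` operations of the boto3 `bedrock-agent` client. This is a sketch, not the pattern's code: in the deployed workflow the waiting is presumably handled by Step Functions states, and the polling interval and limit below are assumptions.

```python
import time

# Ingestion job statuses after which no further progress is expected.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def sync_data_source(bedrock_agent, knowledge_base_id, data_source_id,
                     poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Start an ingestion job and poll until it reaches a terminal status.

    `bedrock_agent` is expected to behave like boto3.client("bedrock-agent").
    `poll_seconds` and `max_polls` are illustrative defaults.
    """
    job = bedrock_agent.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]

    for _ in range(max_polls):
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
        job = bedrock_agent.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]

    # The last poll may have returned a terminal status.
    if job["status"] in TERMINAL_STATUSES:
        return job
    raise TimeoutError(
        f"Ingestion job {job['ingestionJobId']} did not finish after {max_polls} polls"
    )
```

The `sleep` parameter exists so that the wait can be replaced in tests; callers normally leave it at its default.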
|
|
||
| Step 4: Status Reporting | ||
| - Aggregates sync results | ||
| - Records success/failure metrics | ||
| - Generates summary reports | ||
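The aggregation in Step 4 can be illustrated with a small helper that summarizes the ingestion jobs returned by the Map state. The output field names (`total`, `byStatus`, `failedDataSourceIds`, `succeeded`) are hypothetical; the pattern's actual report format is not shown in this README.

```python
from collections import Counter


def summarize_sync_results(jobs):
    """Summarize a list of ingestion job dicts (one per data source).

    Each job is expected to carry at least `dataSourceId` and `status`.
    """
    counts = Counter(job["status"] for job in jobs)
    failed = [job["dataSourceId"] for job in jobs if job["status"] != "COMPLETE"]
    return {
        "total": len(jobs),
        "byStatus": dict(counts),
        "failedDataSourceIds": failed,
        # The run counts as successful only if every data source completed.
        "succeeded": not failed,
    }


summary = summarize_sync_results([
    {"dataSourceId": "DS1", "status": "COMPLETE"},
    {"dataSourceId": "DS2", "status": "FAILED"},
])
print(summary)
```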
|
|
||
| ## Testing | ||
|
|
||
| Step 1: Upload Sample Documents to S3 | ||
| - Navigate to Amazon S3 in the AWS Management Console | ||
| - Locate the bucket named kb-data-source-{account-id} | ||
| - Upload your sample documents to this bucket | ||
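As an alternative to the console, the upload can be scripted. The sketch below derives the bucket name from the `kb-data-source-{account-id}` convention stated above and uploads each file with the S3 client's `upload_file` method; using the file's base name as the object key is an assumption.

```python
import os


def data_source_bucket_name(account_id):
    """Bucket name following the kb-data-source-{account-id} convention."""
    return f"kb-data-source-{account_id}"


def upload_documents(s3, account_id, paths):
    """Upload local files to the data source bucket and return the object keys.

    `s3` is expected to behave like boto3.client("s3").
    """
    bucket = data_source_bucket_name(account_id)
    keys = []
    for path in paths:
        key = os.path.basename(path)  # assumed key layout: flat, file name only
        s3.upload_file(path, bucket, key)
        keys.append(key)
    return keys
```

With credentials configured, the account ID can be read from `boto3.client("sts").get_caller_identity()["Account"]` and passed together with `boto3.client("s3")`.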
|
|
||
| Step 2: Wait for Scheduler Execution | ||
| - The EventBridge Scheduler schedule is configured to run every 15 minutes | ||
| - You can monitor the schedule in the EventBridge console | ||
| Note: The next execution will occur at the next 15-minute interval | ||
|
|
||
| Step 3: Monitor Step Functions Execution | ||
| - Navigate to the AWS Step Functions console | ||
| - Find your state machine execution | ||
| - Monitor the workflow progress through different states | ||
| - Verify successful completion of all steps | ||
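The same check can be made programmatically with the `list_executions` operation of the boto3 `stepfunctions` client, which returns the most recent execution first. A minimal sketch; the function name is hypothetical and the state machine ARN must be taken from your own deployment.

```python
def latest_execution_status(sfn, state_machine_arn):
    """Return the status of the most recent execution, or None if there is none.

    `sfn` is expected to behave like boto3.client("stepfunctions").
    Typical statuses include RUNNING, SUCCEEDED, FAILED, TIMED_OUT and ABORTED.
    """
    response = sfn.list_executions(
        stateMachineArn=state_machine_arn,
        maxResults=1,
    )
    executions = response.get("executions", [])
    if not executions:
        return None
    return executions[0]["status"]
```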
|
|
||
| Step 4: Verify Sync Status in Bedrock | ||
| - Go to the Amazon Bedrock console | ||
| - Navigate to Knowledge Bases | ||
| - Select your Knowledge Base | ||
| - Click on Data Sources | ||
| - Check the Sync History tab | ||
| - Verify the sync status shows as "Completed" | ||
| - Review sync details including: | ||
|   - Timestamp of sync | ||
|   - Number of documents processed | ||
|   - Any errors or warnings | ||
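The sync history can also be read through the `list_ingestion_jobs` operation of the boto3 `bedrock-agent` client. The sketch below fetches the most recent job for one data source and extracts the details listed above; the returned field names are hypothetical. Note that the API reports a finished job as `COMPLETE`, while the console displays "Completed".

```python
def latest_sync_summary(bedrock_agent, knowledge_base_id, data_source_id):
    """Return status, start time and document counts of the latest ingestion job.

    `bedrock_agent` is expected to behave like boto3.client("bedrock-agent").
    Returns None when the data source has no sync history yet.
    """
    response = bedrock_agent.list_ingestion_jobs(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        # Newest job first, so a single result is the latest sync.
        sortBy={"attribute": "STARTED_AT", "order": "DESCENDING"},
        maxResults=1,
    )
    jobs = response.get("ingestionJobSummaries", [])
    if not jobs:
        return None
    job = jobs[0]
    stats = job.get("statistics", {})
    return {
        "status": job["status"],
        "startedAt": job.get("startedAt"),
        "documentsScanned": stats.get("numberOfDocumentsScanned", 0),
        "documentsFailed": stats.get("numberOfDocumentsFailed", 0),
    }
```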
|
|
||
|
|
||
| Step 5: Validation Points | ||
| - Confirm documents are indexed | ||
| - Check sync completion status | ||
| - Verify no errors in sync history | ||
| - Ensure all uploaded documents are processed | ||
|
|
||
| Troubleshooting | ||
| If sync fails or documents aren't appearing: | ||
|
|
||
| - Check S3 bucket permissions | ||
| - Review Step Functions execution logs | ||
| - Verify document format compatibility | ||
| - Check Knowledge Base configuration | ||
|
|
||
|  | ||
|
|
||
| ## Delete stack | ||
|
|
||
| ```bash | ||
| cdk destroy --all | ||
| ``` | ||
| ---- | ||
| Copyright 2024 Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
|
|
||
| SPDX-License-Identifier: MIT-0 | ||
184 changes: 184 additions & 0 deletions
eventbridge-scheduled-stepfunction-bedrock-kb-sync/cdk/.gitignore
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,184 @@ | ||
| # Byte-compiled / optimized / DLL files | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *$py.class | ||
|
|
||
| # C extensions | ||
| *.so | ||
|
|
||
| # Distribution / packaging | ||
| .Python | ||
| build/ | ||
| develop-eggs/ | ||
| dist/ | ||
| downloads/ | ||
| eggs/ | ||
| .eggs/ | ||
| lib/ | ||
| lib64/ | ||
| parts/ | ||
| sdist/ | ||
| var/ | ||
| wheels/ | ||
| share/python-wheels/ | ||
| *.egg-info/ | ||
| .installed.cfg | ||
| *.egg | ||
| MANIFEST | ||
|
|
||
| # PyInstaller | ||
| # Usually these files are written by a python script from a template | ||
| # before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
| *.manifest | ||
| *.spec | ||
|
|
||
| # Installer logs | ||
| pip-log.txt | ||
| pip-delete-this-directory.txt | ||
|
|
||
| # Unit test / coverage reports | ||
| htmlcov/ | ||
| .tox/ | ||
| .nox/ | ||
| .coverage | ||
| .coverage.* | ||
| .cache | ||
| nosetests.xml | ||
| coverage.xml | ||
| *.cover | ||
| *.py,cover | ||
| .hypothesis/ | ||
| .pytest_cache/ | ||
| cover/ | ||
|
|
||
| # Translations | ||
| *.mo | ||
| *.pot | ||
|
|
||
| # Django stuff: | ||
| *.log | ||
| local_settings.py | ||
| db.sqlite3 | ||
| db.sqlite3-journal | ||
|
|
||
| # Flask stuff: | ||
| instance/ | ||
| .webassets-cache | ||
|
|
||
| # Scrapy stuff: | ||
| .scrapy | ||
|
|
||
| # Sphinx documentation | ||
| docs/_build/ | ||
|
|
||
| # PyBuilder | ||
| .pybuilder/ | ||
| target/ | ||
|
|
||
| # Jupyter Notebook | ||
| .ipynb_checkpoints | ||
|
|
||
| # IPython | ||
| profile_default/ | ||
| ipython_config.py | ||
|
|
||
| # pyenv | ||
| # For a library or package, you might want to ignore these files since the code is | ||
| # intended to run in multiple environments; otherwise, check them in: | ||
| # .python-version | ||
|
|
||
| # pipenv | ||
| # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
| # However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
| # having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
| # install all needed dependencies. | ||
| #Pipfile.lock | ||
|
|
||
| # UV | ||
| # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. | ||
| # This is especially recommended for binary packages to ensure reproducibility, and is more | ||
| # commonly ignored for libraries. | ||
| #uv.lock | ||
|
|
||
| # poetry | ||
| # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
| # This is especially recommended for binary packages to ensure reproducibility, and is more | ||
| # commonly ignored for libraries. | ||
| # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
| #poetry.lock | ||
|
|
||
| # pdm | ||
| # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. | ||
| #pdm.lock | ||
| # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it | ||
| # in version control. | ||
| # https://pdm.fming.dev/latest/usage/project/#working-with-version-control | ||
| .pdm.toml | ||
| .pdm-python | ||
| .pdm-build/ | ||
|
|
||
| # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm | ||
| __pypackages__/ | ||
|
|
||
| # Celery stuff | ||
| celerybeat-schedule | ||
| celerybeat.pid | ||
|
|
||
| # SageMath parsed files | ||
| *.sage.py | ||
|
|
||
| # Environments | ||
| .env | ||
| .venv | ||
| env/ | ||
| venv/ | ||
| ENV/ | ||
| env.bak/ | ||
| venv.bak/ | ||
|
|
||
| # Spyder project settings | ||
| .spyderproject | ||
| .spyproject | ||
|
|
||
| # Rope project settings | ||
| .ropeproject | ||
|
|
||
| # mkdocs documentation | ||
| /site | ||
|
|
||
| # mypy | ||
| .mypy_cache/ | ||
| .dmypy.json | ||
| dmypy.json | ||
|
|
||
| # Pyre type checker | ||
| .pyre/ | ||
|
|
||
| # pytype static type analyzer | ||
| .pytype/ | ||
|
|
||
| # Cython debug symbols | ||
| cython_debug/ | ||
|
|
||
| # PyCharm | ||
| # JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
| # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
| # and can be added to the global gitignore or merged into this file. For a more nuclear | ||
| # option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
| #.idea/ | ||
|
|
||
| # Ruff stuff: | ||
| .ruff_cache/ | ||
|
|
||
| # PyPI configuration file | ||
| .pypirc | ||
|
|
||
| # CDK asset staging directory | ||
| .cdk.staging | ||
| cdk.out | ||
|
|
||
| # Misc | ||
| unittests.xml | ||
|
|
||
| .coverage | ||
| cov.xml |