
Commit 7606436

rework sample for dev hub (#1)
* rework sample for dev hub
* remove readme section, fix action
* change master to main in action
* fix terraform version in action
* change default commands
* add dummy credentials for terraform
* more dummy credentials
* add debug output
* remove duplicate terraform init/plan/apply
* fix broken pipe
* disable pipefail and debug output
* add debug output
* add more troubleshooting output
* more troubleshooting output
* access only stdout of command
* fix terraform step
* remove debug output
* add LICENSE, cleanup
* fix typos/copy paste errors
1 parent 7c49165 commit 7606436

File tree

8 files changed

+280
-50
lines changed


.github/workflows/main.yml

+78
@@ -0,0 +1,78 @@
1+
name: Deploy on LocalStack
2+
3+
on:
4+
push:
5+
paths-ignore:
6+
- 'README.md'
7+
branches:
8+
- main
9+
pull_request:
10+
branches:
11+
- main
12+
schedule:
13+
# “At 00:00 on Sunday.”
14+
- cron: "0 0 * * 0"
15+
workflow_dispatch:
16+
17+
jobs:
18+
fuzzy-movies:
19+
name: Setup fuzzy movie application
20+
runs-on: ubuntu-latest
21+
steps:
22+
23+
- name: Checkout
24+
uses: actions/checkout@v3
25+
26+
- name: Setup Python
27+
uses: actions/setup-python@v4
28+
with:
29+
python-version: '3.9'
30+
31+
- uses: hashicorp/setup-terraform@v2
32+
with:
33+
terraform_version: 1.4.5
34+
terraform_wrapper: false
35+
- name: Setup tflocal
36+
run: |
37+
pip install terraform-local
38+
39+
- name: Start LocalStack
40+
env:
41+
LOCALSTACK_API_KEY: ${{ secrets.LOCALSTACK_API_KEY }}
42+
run: |
43+
pip install localstack awscli-local[ver1]
44+
docker pull localstack/localstack-pro:latest
45+
# Start LocalStack in the background
46+
DEBUG=1 localstack start -d
47+
# Wait 15 seconds for the LocalStack container to become ready before timing out
48+
echo "Waiting for LocalStack startup..."
49+
localstack wait -t 15
50+
echo "Startup complete"
51+
52+
- name: Run the application
53+
run: ./run.sh
54+
55+
- name: Send a Slack notification
56+
if: failure() || github.event_name != 'pull_request'
57+
uses: ravsamhq/notify-slack-action@v2
58+
with:
59+
status: ${{ job.status }}
60+
token: ${{ secrets.GITHUB_TOKEN }}
61+
notification_title: "{workflow} has {status_message}"
62+
message_format: "{emoji} *{workflow}* {status_message} in <{repo_url}|{repo}>"
63+
footer: "Linked Repo <{repo_url}|{repo}> | <{run_url}|View Workflow run>"
64+
notify_when: "failure"
65+
env:
66+
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
67+
68+
- name: Generate a Diagnostic Report
69+
if: failure()
70+
run: |
71+
curl -s localhost:4566/_localstack/diagnose | gzip -cf > diagnose.json.gz
72+
73+
- name: Upload the Diagnostic Report
74+
if: failure()
75+
uses: actions/upload-artifact@v3
76+
with:
77+
name: diagnose.json.gz
78+
path: ./diagnose.json.gz
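
The workflow's startup step relies on `localstack wait -t 15` to block until the container is ready or the timeout expires. Where the CLI is not available, the same wait-then-time-out pattern can be sketched in plain shell. This is a minimal sketch and not part of the commit: `wait_for` is a hypothetical helper name, and the commented probe assumes LocalStack's health endpoint is reachable on the default port 4566.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): retry a command once per
# second until it succeeds or the timeout in seconds has been used up.
wait_for() {
  local timeout=$1
  shift
  local elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1   # give up: the command never succeeded within the timeout
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example probe, left commented out because it needs a running LocalStack
# (assumption: health endpoint on the default edge port):
# wait_for 15 curl -sf localhost:4566/_localstack/health
```

A non-zero return value lets the calling step fail fast instead of continuing against a container that never came up.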

CONTRIBUTING.md

+54
@@ -0,0 +1,54 @@
1+
# Contributing Guidelines
2+
3+
Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
4+
documentation, we greatly value feedback and contributions from our community.
5+
6+
Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
7+
information to effectively respond to your bug report or contribution.
8+
9+
10+
## Reporting Bugs/Feature Requests
11+
12+
We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13+
14+
When filing an issue, please check existing open or recently closed issues to make sure somebody else hasn't already
15+
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16+
17+
* A reproducible test case or series of steps
18+
* The version of our code being used
19+
* Any modifications you've made relevant to the bug
20+
* Anything unusual about your environment or deployment
21+
22+
23+
## Contributing via Pull Requests
24+
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure the following:
25+
26+
1. You are working against the latest source on the *main* branch.
27+
2. You check existing open and recently merged pull requests to make sure someone else hasn't addressed the problem already.
28+
3. You open an issue to discuss any significant work - we would hate to waste your time.
29+
30+
To send us a pull request, please:
31+
32+
1. Fork the repository.
33+
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34+
3. Ensure local tests pass.
35+
4. Commit to your fork using clear commit messages.
36+
5. Send us a pull request, answering any default questions in the pull request interface.
37+
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38+
39+
GitHub provides additional documents on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
40+
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41+
42+
43+
## Finding contributions to work on
44+
Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
45+
46+
47+
## Code of Conduct
48+
Please review the adopted [code of conduct](CODE_OF_CONDUCT.md) and make sure you follow the guidelines.
49+
50+
51+
## Licensing
52+
53+
See the [LICENSE](LICENSE.txt) file for our project's licensing. By contributing, you agree that your contributions will be licensed under the existing license.
54+

LICENSE.txt

+13
@@ -0,0 +1,13 @@
1+
Copyright (c) 2023 LocalStack
2+
3+
Licensed under the Apache License, Version 2.0 (the "License");
4+
you may not use this file except in compliance with the License.
5+
You may obtain a copy of the License at
6+
7+
http://www.apache.org/licenses/LICENSE-2.0
8+
9+
Unless required by applicable law or agreed to in writing, software
10+
distributed under the License is distributed on an "AS IS" BASIS,
11+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
See the License for the specific language governing permissions and
13+
limitations under the License.

README.md

+116-39
@@ -1,45 +1,122 @@
1-
# Fuzzy Movies
2-
## Lambda, Kinesis, Firehose, ElasticSearch, S3
3-
![Screenshot](./docs/screenshot.png)
1+
# Fuzzy Movie Search - Search application with Lambda, Kinesis, Firehose, ElasticSearch, S3
42

5-
This Hackathon project is an AWS app consisting of:
3+
4+
| Key | Value |
5+
| ------------ | ------------------------------------------------------------------------------------- |
6+
| Environment | <img src="https://img.shields.io/badge/LocalStack-deploys-4D29B4.svg?logo="> |
7+
| Services | Lambda, Kinesis, Firehose, ElasticSearch, S3 |
8+
| Integrations | Terraform, AWS CLI |
9+
| Categories | Serverless; Event-Driven architecture |
10+
| Level | Intermediate |
11+
| GitHub | [Repository link](https://github.com/localstack/fuzzy-movie-search) |
12+
13+
## Introduction
14+
This Fuzzy Search application demonstrates how to set up an S3-hosted website that enables you to fuzzy-search a movie database. The sample application implements the following integration among the various AWS services:
615
- A data ingestion pipeline which allows adding movie data to an ElasticSearch index via:
7-
1. An AWS Lambda function, explosed via a fuction URL.
8-
2. The Lambda function sends the JSON payload to a Kinesis Data Stream.
9-
3. A Kinesis Firehose Delivery Stream forwards the data to an ElasticSearch domain.
16+
- An AWS Lambda function, exposed via a function URL.
17+
- The Lambda function sends the JSON payload to a Kinesis Data Stream.
18+
- A Kinesis Firehose Delivery Stream forwards the data to an ElasticSearch domain.
1019
- A frontend / website which:
1120
- Has a simple search interface to search for movies in the database.
12-
- The HTML page uses a vanilla JS script to query data using a second Lambda function.
21+
- The HTML page uses a plain JS script to query data using a second Lambda function.
1322
- This Lambda function performs a fuzzy query on the movie index in the ElasticSearch cluster.
1423

15-
## System Overview
16-
![System Overview](./docs/overview.drawio.png)
17-
18-
## Setup
19-
1. Clone this repo and `cd` into its working directory
20-
2. Install the following tools:
21-
- [Terraform](https://www.terraform.io/downloads) (v1.4.5)
22-
- [tflocal](https://github.com/localstack/terraform-local)
23-
- [awslocal](https://github.com/localstack/awscli-local)
24-
3. Start LocalStack in the foreground so you can watch the logs:
25-
```
26-
docker compose up
27-
```
28-
4. Open another terminal window and `cd` into the same working directory
29-
5. Create the resource and trigger the invocation of the lambda:
30-
```
31-
./run.sh
32-
```
33-
34-
# TODO:
35-
- This sample does not yet run on AWS
36-
- Firehose -> ElasticSearch
37-
- Records are not properly delivered to ElasticSearch yet
38-
- Search Lambda -> ElasticSearch
39-
- Lambda needs to sign the HTTP requests to ElasticSearch
40-
- Simplify the S3 website URL in LocalStack
41-
- We need to use http://movie-search.s3.amazonaws.com:4566/index.html instead of the generated output: http://movie-search.s3-website-eu-west-1.amazonaws.com/
42-
- It works with http://movie-search.s3-website.localhost.localstack.cloud/
43-
- HTTPS?
44-
- Due to the function URLs having no proper certificate, we can only use the http version!
45-
- http://movie-search.s3-website.localhost.localstack.cloud:4566/
24+
## Architecture Diagram
25+
26+
The following diagram shows the architecture that this sample application builds and deploys:
27+
28+
![System Overview](./images/system_overview.png)
29+
30+
- [S3 Website](https://docs.localstack.cloud/tutorials/s3-static-website-terraform/) that holds the website.
31+
- [Lambda](https://docs.localstack.cloud/user-guide/aws/lambda/) for feeding the Kinesis stream and performing the fuzzy search.
32+
- [Kinesis](https://docs.localstack.cloud/user-guide/aws/kinesis/) for receiving the JSON payloads from the ingest Lambda function.
33+
- [Firehose](https://docs.localstack.cloud/user-guide/aws/kinesis-firehose/) for forwarding the data into Elasticsearch.
34+
- [Elasticsearch](https://docs.localstack.cloud/user-guide/aws/elasticsearch/) which holds the data.
35+
36+
## Prerequisites
37+
- LocalStack Pro with the [`localstack` CLI](https://docs.localstack.cloud/getting-started/installation/#localstack-cli).
38+
- [Terraform](https://docs.localstack.cloud/user-guide/integrations/terraform/) with the [`tflocal`](https://github.com/localstack/terraform-local) wrapper installed.
39+
- [AWS CLI](https://docs.localstack.cloud/user-guide/integrations/aws-cli/) with the [`awslocal` wrapper](https://docs.localstack.cloud/user-guide/integrations/aws-cli/#localstack-aws-cli-awslocal).
40+
41+
Start LocalStack Pro with the `LOCALSTACK_API_KEY` pre-configured:
42+
43+
```shell
44+
export LOCALSTACK_API_KEY=<your-api-key>
45+
docker compose up -d
46+
```
47+
48+
## Instructions
49+
You can build and deploy the sample application on LocalStack by running `./run.sh`.
50+
Here are instructions to deploy and test it manually step-by-step.
51+
52+
### Build the application
53+
54+
To build the Terraform application, run the following commands:
55+
56+
```bash
57+
terraform init; terraform plan; terraform apply --auto-approve
58+
```
59+
This will create all resources specified in `main.tf`.
60+
This can take a couple of minutes.
61+
Once it is done, you can save the following values into variables by executing these commands:
62+
63+
```bash
64+
ingest_function_url=$(terraform output --raw ingest_lambda_url)
65+
elasticsearch_endpoint=$(terraform output --raw elasticsearch_endpoint)
66+
```
67+
68+
### Download the dataset
69+
70+
The dataset we will use for this application is a selection of movies and their typical data such as name, author, genre, etc.
71+
Execute the following commands to make it available.
72+
73+
```bash
74+
temp_dir=$(mktemp --directory)
75+
movie_dataset_url="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/samples/sample-movies.zip"
76+
curl -L $movie_dataset_url > $temp_dir/sample-movies.zip
77+
unzip $temp_dir/sample-movies.zip -d $temp_dir/
78+
```
79+
80+
### Pre-processing the data
81+
82+
For the data to work with our streaming use case, we need to remove the bulk insert instructions (the lines starting with `{ "index"`).
83+
84+
```bash
85+
grep -v '^{ "index"' $temp_dir/sample-movies.bulk > $temp_dir/sample-movies-processed.bulk
86+
mv $temp_dir/sample-movies-processed.bulk $temp_dir/sample-movies.bulk
87+
```
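
To illustrate what this filter does, the following minimal sketch applies the same `grep -v` pattern to a tiny stand-in file. The two records are hypothetical and only mimic the shape of a bulk file (an action line followed by a document line); `strip_bulk_actions` is an illustrative helper name, not part of the repository.

```bash
# Build a tiny stand-in for sample-movies.bulk (hypothetical records, not the real dataset).
demo_dir=$(mktemp -d)
cat > "$demo_dir/demo.bulk" <<'EOF'
{ "index" : { "_index": "movies", "_id" : "1" } }
{"title": "Movie A", "year": 2001}
{ "index" : { "_index": "movies", "_id" : "2" } }
{"title": "Movie B", "year": 2002}
EOF

# Print only the document lines; -v inverts the match, so every line that
# starts with the bulk action prefix is dropped.
strip_bulk_actions() {
  grep -v '^{ "index"' "$1"
}

strip_bulk_actions "$demo_dir/demo.bulk"
```

Only the two document lines remain, which is the one-JSON-object-per-line form that the ingest loop sends to the Lambda function.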
88+
89+
### Populating the database
90+
91+
We now populate the database with the actual entries via our Lambda function.
92+
Execute the following code to insert the entries line by line.
93+
It will take quite some time to finish.
94+
95+
```bash
96+
cat $temp_dir/sample-movies.bulk | while read line
97+
do
98+
echo -n "."
99+
echo $line | curl -s -X POST $ingest_function_url \
100+
-H 'Content-Type: application/json' \
101+
-d @- > /dev/null
102+
done
103+
```
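
One detail worth knowing about line-by-line loops of this kind: a plain `read` interprets backslashes, so a record that contains escaped quotes would reach the function as altered JSON. The sketch below compares plain `read` with `IFS= read -r`. It is an illustration under assumptions, not the repository's code: the record is hypothetical, and the loop prints each line instead of calling `curl` so that it stays self-contained.

```bash
# A record containing escaped quotes (hypothetical, for illustration only).
record='{"title": "Demo", "plot": "She said \"hello\""}'

# Plain `read` treats backslashes as escape characters and removes them.
plain=$(printf '%s\n' "$record" | while read line; do printf '%s\n' "$line"; done)

# `IFS= read -r` keeps the line exactly as it was written.
raw=$(printf '%s\n' "$record" | while IFS= read -r line; do printf '%s\n' "$line"; done)

printf 'plain: %s\n' "$plain"
printf 'raw:   %s\n' "$raw"
```

If the dataset turns out to contain such escapes, switching the loop to `IFS= read -r line` and quoting `"$line"` when passing it on would keep each payload intact.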
104+
105+
### Querying the database
106+
107+
Now you can access the website with its entries at http://movie-search.s3-website.localhost.localstack.cloud:4566/.
108+
For example, if you search for "Quentis", a misspelling of "Quentin", you should see entries related to the director "Quentin Tarantino", similar to the following screenshot.
109+
110+
![Screenshot](./images/screenshot.png)
111+
112+
113+
## Known limitations
114+
115+
The LocalStack logs sometimes show error messages regarding the Firehose propagation.
116+
While this might reduce the size of the database to some degree, the remaining data is still sufficient for demonstration purposes.
117+
118+
119+
## Contributing
120+
121+
We appreciate your interest in contributing to our project and are always looking for new ways to improve the developer experience. We welcome feedback, bug reports, and even feature ideas from the community.
122+
Please refer to the [contributing file](CONTRIBUTING.md) for more details on how to get started.

docs/overview.drawio.png

-178 KB
Binary file not shown.
File renamed without changes.

images/system_overview.png

338 KB

run.sh

+19-11
@@ -6,16 +6,16 @@ shopt -s expand_aliases
66

77
if [ $# -eq 1 ] && [ $1 = "aws" ]; then
88
echo "Deploying on AWS."
9+
alias awslocal='aws'
10+
alias tflocal='terraform'
911
else
1012
echo "Deploying on LocalStack."
11-
alias aws='awslocal'
12-
alias terraform='tflocal'
1313
fi
1414

1515
# Start deployment
16-
terraform init; terraform plan; terraform apply --auto-approve
17-
ingest_function_url=$(terraform output --raw ingest_lambda_url)
18-
elasticsearch_endpoint=$(terraform output --raw elasticsearch_endpoint)
16+
tflocal init; tflocal plan; tflocal apply --auto-approve
17+
ingest_function_url=$(tflocal output --raw ingest_lambda_url)
18+
elasticsearch_endpoint=$(tflocal output --raw elasticsearch_endpoint)
1919

2020
# download the dataset
2121
temp_dir=$(mktemp --directory)
@@ -27,22 +27,22 @@ unzip $temp_dir/sample-movies.zip -d $temp_dir/
2727
# remove the bulk insert instructions (lines starting with index info) from the bulk import file
2828
# (we want to stream the data in there, instead of using the bulk import)
2929
echo "Pre-processing Movie Dataset..."
30-
sed -i '/^{ "index"/d' $temp_dir/sample-movies.bulk
30+
grep -v '^{ "index"' $temp_dir/sample-movies.bulk > $temp_dir/sample-movies-processed.bulk
31+
mv $temp_dir/sample-movies-processed.bulk $temp_dir/sample-movies.bulk
3132

3233
echo "Invoking function for each movie..."
33-
cat $temp_dir/sample-movies.bulk | while read line
34+
while read line
3435
do
3536
echo -n "."
3637
echo $line | curl -s -X POST $ingest_function_url \
3738
-H 'Content-Type: application/json' \
3839
-d @- > /dev/null
39-
done
40+
done < $temp_dir/sample-movies.bulk
4041

4142
echo ""
4243
echo "Testing a search query:"
43-
4444
# Send a sample fuzzy query
45-
curl -X POST $elasticsearch_endpoint/movies/_search -H "Content-Type: application/json" -d \
45+
result=$(curl -X POST $elasticsearch_endpoint/movies/_search -H "Content-Type: application/json" -d \
4646
'{
4747
"query": {
4848
"multi_match": {
@@ -52,4 +52,12 @@ curl -X POST $elasticsearch_endpoint/movies/_search -H "Content-Type: applicatio
5252
"type": "best_fields"
5353
}
5454
}
55-
}' | jq
55+
}')
56+
echo $result | jq
57+
58+
# Rudimentary smoke test
59+
hits=$(echo $result | jq .hits.total.value)
60+
if [[ $hits -lt 1 ]]; then
61+
echo "We have no hits on our query."
62+
exit 1
63+
fi
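
The ingest loop in `run.sh` now reads its input through a redirect (`done < file`) instead of a `cat ... |` pipe. The commit does not state the reason for this. One general difference between the two forms in bash is that a loop on the receiving end of a pipe runs in a subshell, so variables set inside it are lost once the loop ends. The following minimal sketch shows that behaviour, using hypothetical function names and inline sample input rather than the movie dataset:

```bash
# Count lines with the loop on the receiving end of a pipe.
# In bash (default settings) the loop body runs in a subshell,
# so the increments never reach the outer `count`.
count_piped() {
  count=0
  printf 'a\nb\nc\n' | while read -r line; do count=$((count + 1)); done
  echo "$count"
}

# Count lines with the loop reading from a redirect.
# The loop runs in the current shell, so `count` keeps its final value.
count_redirected() {
  tmp=$(mktemp)
  printf 'a\nb\nc\n' > "$tmp"
  count=0
  while read -r line; do count=$((count + 1)); done < "$tmp"
  rm -f "$tmp"
  echo "$count"
}

echo "piped:      $(count_piped)"
echo "redirected: $(count_redirected)"
```

The script's current loop does not set any variables, so this difference does not change its behaviour today, but it matters as soon as the loop is extended to track, for example, the number of failed requests.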
