Commit 26948bc

committed
Initial commit
1 parent 97717bf commit 26948bc

28 files changed

+2526
-9
lines changed

.DS_Store

8 KB
Binary file not shown.

README.md

Lines changed: 155 additions & 9 deletions
@@ -1,17 +1,163 @@
1-
## My Project
21

3-
TODO: Fill this README out!
2+
# Welcome to the Amazon Neptune Scooters demo!
43

5-
Be sure to:
4+
Welcome to our tutorial, "Implementing a Graph database for a Scooters Business on AWS". Throughout this session, we'll explore graph databases and generative AI applied to graphs, and compare these technologies with traditional relational database management systems (RDBMS) where it helps. Because RDBMS are so widely used and understood, comparing the two should give a clearer perspective to anyone trying to grasp graph database concepts.
65

7-
* Change the title in this README
8-
* Edit your repository description on GitHub
6+
## 📋 Table of contents
97

10-
## Security
8+
- [Description](#-description)
9+
- [Use cases](#-use-cases)
10+
- [Pre-requisites](#-pre-requisites)
11+
- [Installing](#-installing)
12+
- [Architecture](#architecture)
13+
- [Cleanup](#cleanup)
1114

12-
See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
15+
## 🔰 Description
1316

14-
## License
17+
By the end of this tutorial, you will:
1518

16-
This library is licensed under the MIT-0 License. See the LICENSE file.
19+
* Understand the fundamentals of graph databases, including the main differences between graph and relational databases.
20+
* Gain insight into the unique advantages and challenges of graph databases.
21+
* Learn about Amazon Neptune, a service tailored for graph database deployments.
22+
* Learn how to use generative AI to help you write Gremlin queries and to abstract the Gremlin query language behind natural language.
23+
* Have your own customizable Graph Data Generator.
24+
* Appreciate scenarios where graph databases outshine their relational counterparts.
25+
* Get hands-on experience setting up, loading, and querying a graph database on AWS.
26+
* Build most of the tutorial with Infrastructure-as-Code (IaC), using the AWS CDK.
27+
* For those with relational database experience, this exploration will illuminate new possibilities and data solutions. For newcomers, you're about to dive into a dynamic way of visualizing and interpreting data.
1728

29+
## 🛠 Use cases
30+
31+
- Comparison of technology applicability: <i>"use the right tool for the right job"</i>.
32+
- Analysis of Performance and TCO; i.e. Relational Database vs. Graph Database.
33+
- Deploy a Graph Data Generator that you can customize for your own use case.
34+
- Understand how to use a [Large Language Model](https://aws.amazon.com/what-is/large-language-model/) to query a graph database.
35+
36+
## 🎒 Pre-requisites
37+
38+
- [Docker](https://www.docker.com/): Install and run Docker locally. This project uses Docker to build an image and run containers.
39+
- At least 2 GB of free disk space for building and deploying the Docker image
40+
- Install [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
41+
- Install Python 3.9+
42+
- Install [Node.js](https://nodejs.org/en/)
43+
- After installing Node.js (with ```npm``` on your path), install the [AWS CDK](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html); e.g. ```npm install -g aws-cdk```
44+
- Install Visual Studio Code with the [Amazon CodeWhisperer plugin](https://youtu.be/rHNMfOK8pWI)
45+
46+
## 🚀 Installing
47+
48+
This project is set up like a standard Python project. The initialization
49+
process also creates a virtualenv within this project, stored under the `.venv`
50+
directory. To create the virtualenv it assumes that there is a `python3`
51+
(or `python` for Windows) executable in your path with access to the `venv`
52+
package. If for any reason the automatic creation of the virtualenv fails,
53+
you can create the virtualenv manually.
54+
55+
To manually create a virtualenv on macOS and Linux:
56+
57+
```
58+
$ python3 -m venv .venv
59+
```
60+
61+
After the init process completes and the virtualenv is created, you can use the following
62+
step to activate your virtualenv.
63+
64+
```
65+
$ source .venv/bin/activate
66+
```
67+
68+
If you are on a Windows platform, activate the virtualenv like this:
69+
70+
```
71+
% .venv\Scripts\activate.bat
72+
```
73+
74+
Once the virtualenv is activated, you can install the required dependencies. Optionally, add ```--upgrade``` to upgrade packages that are already installed.
75+
76+
```
77+
$ pip install -r requirements.txt
78+
```
79+
80+
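Add or change your own environment in the `cdk.json` file, under the `context` key. Note that `app.py` in this commit reads a single top-level context key, `env-production`, so the key name and nesting you use in `cdk.json` must match the `try_get_context` call there. For example, if you want to add your 'Production' environment: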
81+
```json
82+
...
83+
"context": {
84+
"environments": {
85+
"production": {
86+
"vpc_neptune": "",
87+
"s3_prefix_scooters_data_loc":"scooters-graph-demo/neptune/data",
88+
"lambda_datagen_num_vehicles":"1000",
89+
"lambda_datagen_num_parts":"10",
90+
"api_gtw_ip_addr_whitelist_list":""
91+
}
92+
},
93+
...
94+
}
95+
```
96+
97+
⚠️ Important: for a safer deployment of this demo, keep at least one environment defined, even if its optional keys have empty values like the ones above. The keys are:
98+
99+
- <b>vpc_neptune</b> [optional]: set this if you want to deploy all the assets into your existing VPC instead of creating a new one.
100+
- <b>s3_prefix_scooters_data_loc</b>: the path (S3 key prefix) used inside the new S3 bucket.
101+
- <b>lambda_datagen_num_vehicles</b>: number of scooters (graph nodes) to create in the dataset.
102+
- <b>lambda_datagen_num_parts</b>: number of parts (graph nodes) to add per scooter.
103+
- <b>api_gtw_ip_addr_whitelist_list</b> [optional]: list of IP addresses or CIDR ranges to allow through the API Gateway.
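
A quick way to catch typos in an environment block before running `cdk synth` is to validate it in plain Python. The sketch below is only an illustration and is not part of this commit: the function name `validate_env_params` is hypothetical, and it assumes the allow-list is a comma-separated string of IPs or CIDRs, which the README does not specify.

```python
import ipaddress

# Keys that must be present and non-empty (the other two are optional).
REQUIRED_KEYS = (
    "s3_prefix_scooters_data_loc",
    "lambda_datagen_num_vehicles",
    "lambda_datagen_num_parts",
)


def validate_env_params(params: dict) -> dict:
    """Return a normalized copy of one environment block, or raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if not params.get(key)]
    if missing:
        raise ValueError(f"missing or empty keys: {missing}")

    normalized = dict(params)
    for key in ("lambda_datagen_num_vehicles", "lambda_datagen_num_parts"):
        value = int(params[key])  # cdk.json stores these numbers as strings
        if value <= 0:
            raise ValueError(f"{key} must be a positive integer, got {value}")
        normalized[key] = value

    # Assumption: the allow-list is a comma-separated string of IPs or CIDRs.
    raw = params.get("api_gtw_ip_addr_whitelist_list", "")
    normalized["api_gtw_ip_addr_whitelist_list"] = [
        str(ipaddress.ip_network(item.strip(), strict=False))
        for item in raw.split(",")
        if item.strip()
    ]
    return normalized
```

Single addresses are normalized to `/32` networks, and an empty allow-list becomes an empty Python list.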
104+
105+
106+
You can now synthesize your CDK stacks:
107+
108+
💡 Tips:
109+
- Remember to have Docker running at this point!
110+
- If you receive an error like `--app is required...`, it's probably because you are running the command from a subdirectory. Navigate to the main app directory and try again.
111+
112+
If this is your first time using the AWS CDK in this account and region, you need to bootstrap it first:
113+
114+
```
115+
$ cdk bootstrap --profile profile-aws-dev-sandbox
116+
```
117+
118+
Synthesize all your stacks:
119+
120+
```
121+
$ cdk synth --all --profile profile-aws-dev-sandbox
122+
```
123+
124+
If the previous steps succeeded, then we can deploy our entire project:
125+
126+
💡 Tip: this deployment can take more than 15 minutes, especially if it's the first time.
127+
128+
```
129+
$ cdk deploy --all --profile profile-aws-dev-sandbox
130+
```
131+
132+
If you don't want CDK to prompt you for approval during the deployment:
133+
```
134+
$ cdk deploy --profile profile-aws-dev-sandbox --require-approval never --all
135+
```
136+
137+
Once you have deployed the AWS CDK project successfully, you can carry on with the steps provided in the blog post and YouTube video series.
138+
139+
#### Useful commands
140+
141+
* `cdk ls` list all stacks in the app
142+
* `cdk synth` emits the synthesized CloudFormation template
143+
* `cdk deploy` deploy this stack to your default AWS account/region
144+
* `cdk diff` compare deployed stack with current state
145+
* `cdk docs` open CDK documentation
146+
147+
## Architecture
148+
149+
![](assets/architecture.drawio.png)
150+
151+
### Adapt the graph to your own use case
152+
The graph data model uses the Any Python Tree Data (`anytree`) library to build the Vehicle hierarchy. You can modify this hierarchy via the Lambda function in the Data Generation stack.
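
To get a feel for how a vehicle hierarchy turns into graph data, the following sketch builds a small tree with the standard library only and flattens it into vertex and edge rows. It is not the Lambda's actual code: the `Node` class, the `Scooter`, `Part`, and `HAS_PART` labels, and the ID scheme are all hypothetical. The row shapes mirror the `~id, ~label` (vertices) and `~id, ~from, ~to, ~label` (edges) columns of Neptune's Gremlin CSV load format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    children: list["Node"] = field(default_factory=list)


def build_fleet(num_vehicles: int, num_parts: int) -> list[Node]:
    """Build one root node per scooter, each with num_parts child parts."""
    fleet = []
    for v in range(num_vehicles):
        scooter = Node(f"scooter-{v}", "Scooter")  # hypothetical label
        scooter.children = [
            Node(f"scooter-{v}-part-{p}", "Part") for p in range(num_parts)
        ]
        fleet.append(scooter)
    return fleet


def to_rows(roots: list[Node]):
    """Flatten the trees into (vertices, edges) row tuples."""
    vertices, edges = [], []
    stack = list(roots)
    while stack:
        node = stack.pop()
        vertices.append((node.node_id, node.label))  # ~id, ~label
        for child in node.children:
            # ~id, ~from, ~to, ~label
            edges.append(
                (f"{node.node_id}->{child.node_id}", node.node_id,
                 child.node_id, "HAS_PART")
            )
            stack.append(child)
    return vertices, edges
```

With the default settings from `cdk.json` (1000 vehicles, 10 parts each), a two-level hierarchy like this one would yield 11,000 vertices and 10,000 edges; deeper hierarchies only require nesting more children under each node.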
153+
154+
### Data model
155+
156+
![](assets/scooters_graph_model.drawio.png)
157+
158+
## Cleanup
159+
1. Via AWS CLI or the AWS console, empty the S3 bucket created by our CDK stack; e.g. s3://scooterss3stack-scootersdemoXXXX/. Otherwise, our CDK Removal Policy will not be able to delete the bucket.
160+
2. Run the command below to delete all resources deployed by our CDK project (see the architecture image above). CDK will ask you to confirm the deletion of the stacks; enter `y`.
161+
```
162+
$ cdk destroy --all --profile profile-aws-dev-sandbox
163+
```
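
If you prefer to script step 1 rather than use the console, a sketch along these lines may help. It assumes `boto3` is installed and that you pass in a boto3 S3 `Bucket` resource; the helper name `empty_bucket` is hypothetical, and the bucket name in the commented usage is a placeholder you must replace with the one created by your stack.

```python
def empty_bucket(bucket) -> None:
    """Delete all objects, and all object versions, from a boto3 S3 Bucket resource."""
    # Current objects (sufficient for unversioned buckets).
    bucket.objects.all().delete()
    # Old versions and delete markers (only present if versioning was ever enabled).
    bucket.object_versions.all().delete()


# Usage (not executed here; requires boto3 and valid AWS credentials):
#   import boto3
#   session = boto3.Session(profile_name="profile-aws-dev-sandbox")
#   empty_bucket(session.resource("s3").Bucket("<your-bucket-name>"))
```

This permanently deletes data, so double-check the bucket name before running it.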

app.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
import os
2+
import aws_cdk as cdk
3+
from aws_cdk import Environment, Tags
4+
from stack_params_config.stack_ssm_config import SsmParametersStack
5+
from stack_lambda_datagen.lambda_datagen_stack import ScootersDataStack
6+
from stack_s3.s3_stack import S3Stack
7+
from stack_vpc_neptune.vpc_neptune_stack import VpcNeptuneStack
8+
9+
10+
# AWS Settings
11+
app = cdk.App()
12+
env_aws_settings = Environment(account=os.environ['CDK_DEFAULT_ACCOUNT'], region=os.environ['CDK_DEFAULT_REGION'])
13+
14+
# Environment to deploy: parameters are read from the "env-production" key under "context" in cdk.json.
15+
env_context_params = app.node.try_get_context("env-production")
16+
17+
# SSM parameters stack. Here you can save the dict above instead of hard-coding values.
18+
ssm_stack = SsmParametersStack(app, "ScootersSsmParametersStack",
19+
input_metadata=env_context_params,
20+
env=env_aws_settings
21+
)
22+
23+
# S3 stack to create bucket
24+
stack_s3 = S3Stack(app, "ScootersS3Stack",
25+
input_metadata=env_context_params,
26+
env=env_aws_settings
27+
)
28+
29+
# Lambda stack to create Graph data generator
30+
stack_lambda_datagen = ScootersDataStack(app, "ScootersDataStack",
31+
input_metadata=env_context_params,
32+
env=env_aws_settings
33+
)
34+
35+
# Neptune cluster stack;
36+
stack_vpc_neptune = VpcNeptuneStack(app, "ScootersNeptuneStack",
37+
input_metadata=env_context_params,
38+
env=env_aws_settings
39+
)
40+
41+
# Stack dependencies: both the data generator and the Neptune cluster need the S3 bucket to exist so they can be granted read/write access.
42+
stack_lambda_datagen.add_dependency(stack_s3)
43+
stack_vpc_neptune.add_dependency(stack_s3)
44+
45+
# Tag the S3, data generator, and Neptune stacks (the SSM parameters stack is not tagged here):
46+
Tags.of(stack_s3).add("project", "scooters-demo/stack-s3")
47+
Tags.of(stack_lambda_datagen).add("project", "scooters-demo/stack-lambda-datagen")
48+
Tags.of(stack_vpc_neptune).add("project", "scooters-demo/stack-vpc-neptune")
49+
50+
51+
app.synth()
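
As committed, `app.py` always reads the `env-production` context key, and `try_get_context` returns `None` when a key is missing. One possible way to make the environment selectable and fail early is sketched below. The helper `load_env_params` and the `env` context key are assumptions introduced for illustration; they do not exist in this commit.

```python
from typing import Any, Callable, Optional

DEFAULT_ENV_KEY = "env-production"  # the key app.py reads in this commit


def load_env_params(
    get_context: Callable[[str], Optional[Any]],
    env_key: Optional[str] = None,
) -> dict:
    """Look up one environment block through a context getter.

    get_context is any callable that maps a key to a value or None,
    for example app.node.try_get_context in a CDK app.
    """
    # Hypothetical "env" context key lets the caller choose the environment.
    key = env_key or get_context("env") or DEFAULT_ENV_KEY
    params = get_context(key)
    if not isinstance(params, dict):
        raise KeyError(f"context key '{key}' is missing or is not an object")
    return params
```

In `app.py`, this could be called as `env_context_params = load_env_params(app.node.try_get_context)`, and a different environment block could then be chosen on the command line with CDK's `-c` (`--context`) option, e.g. `cdk deploy --all -c env=env-staging`, provided an `env-staging` block exists in `cdk.json`.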

assets/architecture.drawio.png

359 KB
197 KB

cdk.context.json

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
{
2+
"availability-zones:account=411292098857:region=us-west-2": [
3+
"us-west-2a",
4+
"us-west-2b",
5+
"us-west-2c",
6+
"us-west-2d"
7+
]
8+
}

cdk.json

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
{
2+
"app": "python3 app.py",
3+
"watch": {
4+
"include": [
5+
"**"
6+
],
7+
"exclude": [
8+
"README.md",
9+
"cdk*.json",
10+
"requirements*.txt",
11+
"source.bat",
12+
"**/__init__.py",
13+
"python/__pycache__",
14+
"tests"
15+
]
16+
},
17+
"context": {
18+
"env-production": {
19+
"vpc_neptune": "",
20+
"s3_prefix_scooters_data_loc":"scooters-graph-demo/neptune/data",
21+
"lambda_datagen_num_vehicles":"1000",
22+
"lambda_datagen_num_parts":"10",
23+
"api_gtw_ip_addr_whitelist_list":""
24+
},
25+
"@aws-cdk/aws-lambda:recognizeLayerVersion": true,
26+
"@aws-cdk/core:checkSecretUsage": true,
27+
"@aws-cdk/core:target-partitions": [
28+
"aws",
29+
"aws-cn"
30+
],
31+
"@aws-cdk-containers/ecs-service-extensions:enableDefaultLogDriver": true,
32+
"@aws-cdk/aws-ec2:uniqueImdsv2TemplateName": true,
33+
"@aws-cdk/aws-ecs:arnFormatIncludesClusterName": true,
34+
"@aws-cdk/aws-iam:minimizePolicies": true,
35+
"@aws-cdk/core:validateSnapshotRemovalPolicy": true,
36+
"@aws-cdk/aws-codepipeline:crossAccountKeyAliasStackSafeResourceName": true,
37+
"@aws-cdk/aws-s3:createDefaultLoggingPolicy": true,
38+
"@aws-cdk/aws-sns-subscriptions:restrictSqsDescryption": true,
39+
"@aws-cdk/aws-apigateway:disableCloudWatchRole": true,
40+
"@aws-cdk/core:enablePartitionLiterals": true,
41+
"@aws-cdk/aws-events:eventsTargetQueueSameAccount": true,
42+
"@aws-cdk/aws-iam:standardizedServicePrincipals": true,
43+
"@aws-cdk/aws-ecs:disableExplicitDeploymentControllerForCircuitBreaker": true,
44+
"@aws-cdk/aws-iam:importedRoleStackSafeDefaultPolicyName": true,
45+
"@aws-cdk/aws-s3:serverAccessLogsUseBucketPolicy": true,
46+
"@aws-cdk/aws-route53-patters:useCertificate": true,
47+
"@aws-cdk/customresources:installLatestAwsSdkDefault": false
48+
}
49+
}
