|
1 |
| -## My Project |
2 | 1 |
|
3 |
| -TODO: Fill this README out! |
| 2 | +# Welcome to the Amazon Neptune Scooters demo! |
4 | 3 |
|
5 |
| -Be sure to: |
| 4 | +Welcome to our tutorial "Implementing a Graph database for a Scooters Business on AWS". Throughout this session, we'll explore Graph Databases and Generative AI applied to graphs, occasionally comparing these technologies with traditional relational database management systems (RDBMS). Because RDBMS are so widely used and understood, the comparison should give a clearer perspective to anyone trying to grasp the concepts of graph databases. |
6 | 5 |
|
7 |
| -* Change the title in this README |
8 |
| -* Edit your repository description on GitHub |
| 6 | +## 📋 Table of contents |
9 | 7 |
|
10 |
| -## Security |
| 8 | +- [Description](#-description) |
| 9 | +- [Use cases](#-use-cases) |
| 10 | +- [Pre-requisites](#-pre-requisites) |
| 11 | +- [Installing](#-installing) |
| 12 | +- [Architecture](#architecture) |
| 13 | +- [Cleanup](#cleanup) |
11 | 14 |
|
12 |
| -See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information. |
| 15 | +## 🔰 Description |
13 | 16 |
|
14 |
| -## License |
| 17 | +By the end of this tutorial, you will: |
15 | 18 |
|
16 |
| -This library is licensed under the MIT-0 License. See the LICENSE file. |
| 19 | +* Understand the fundamentals of Graph Databases, including the main differences between graph and relational databases. |
| 20 | +* Gain insights into the unique advantages and challenges of graph databases. |
| 21 | +* Learn about the Amazon Neptune service, tailored for graph database deployments. |
| 22 | +* Learn how to use Generative AI to help you write Gremlin queries and to abstract the Gremlin query language behind natural language. |
| 23 | +* Have your own customisable Graph Data Generator. |
| 24 | +* Appreciate scenarios where graph databases outshine their relational counterparts. |
| 25 | +* Get hands-on experience setting up, loading and querying a graph database on AWS. |
| 26 | +* Build most of the tutorial using Infrastructure-as-Code (IaC) with the AWS CDK. |
| 27 | +* For those with relational database experience, this exploration will illuminate new possibilities and data solutions. For newcomers, you're about to dive into a dynamic way of visualizing and interpreting data. |
17 | 28 |
|
| 29 | +## 🛠 Use cases |
| 30 | + |
| 31 | +- Comparison of technology applicability: <i>"use the right tool for the right job"</i>. |
| 32 | +- Analysis of performance and TCO (total cost of ownership): relational database vs. graph database. |
| 33 | +- Deploy a Graph Data Generator that is fully customizable for your own use case. |
| 34 | +- Understand how to use a [Large Language Model](https://aws.amazon.com/what-is/large-language-model/) to query a graph database. |
| 35 | + |
| 36 | +## 🎒 Pre-requisites |
| 37 | + |
| 38 | +- [Docker](https://www.docker.com/): Install and run Docker locally. This tool uses Docker to build images and run containers. |
| 39 | +- Minimum disk space of 2 GB for building and deploying the Docker image |
| 40 | +- Install [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) |
| 41 | +- Install Python 3.9+ |
| 42 | +- Install [Node.js](https://nodejs.org/en/) |
| 43 | +- After installing Node.js (with ```npm``` on your path), install the [AWS CDK](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html); e.g. ```npm install -g aws-cdk``` |
| 44 | +- Install Visual Studio Code with the [Amazon CodeWhisperer plugin](https://youtu.be/rHNMfOK8pWI) |
| 45 | + |
| 46 | +## 🚀 Installing |
| 47 | + |
| 48 | +This project is set up like a standard Python project. The initialization |
| 49 | +process also creates a virtualenv within this project, stored under the `.venv` |
| 50 | +directory. To create the virtualenv it assumes that there is a `python3` |
| 51 | +(or `python` for Windows) executable in your path with access to the `venv` |
| 52 | +package. If for any reason the automatic creation of the virtualenv fails, |
| 53 | +you can create the virtualenv manually. |
| 54 | + |
| 55 | +To manually create a virtualenv on macOS and Linux: |
| 56 | + |
| 57 | +``` |
| 58 | +$ python3 -m venv .venv |
| 59 | +``` |
| 60 | + |
| 61 | +After the init process completes and the virtualenv is created, you can use the following |
| 62 | +step to activate your virtualenv. |
| 63 | + |
| 64 | +``` |
| 65 | +$ source .venv/bin/activate |
| 66 | +``` |
| 67 | + |
| 68 | +If you are on a Windows platform, you would activate the virtualenv like this: |
| 69 | + |
| 70 | +``` |
| 71 | +% .venv\Scripts\activate.bat |
| 72 | +``` |
| 73 | + |
| 74 | +Once the virtualenv is activated, you can install the required dependencies. Optionally add ```--upgrade``` to update packages that are already installed. |
| 75 | + |
| 76 | +``` |
| 77 | +$ pip install -r requirements.txt |
| 78 | +``` |
| 79 | + |
| 80 | +Add or change your own environment in the `cdk.json` file, under the `context` key. For example, if you want to add your 'Production' environment: |
| 81 | +```json |
| 82 | +... |
| 83 | +"context": { |
| 84 | +"environments": { |
| 85 | + "production": { |
| 86 | + "vpc_neptune": "", |
| 87 | + "s3_prefix_scooters_data_loc":"scooters-graph-demo/neptune/data", |
| 88 | + "lambda_datagen_num_vehicles":"1000", |
| 89 | + "lambda_datagen_num_parts":"10", |
| 90 | + "api_gtw_ip_addr_whitelist_list":"" |
| 91 | + } |
| 92 | +}, |
| 93 | +... |
| 94 | +} |
| 95 | +``` |
| 96 | + |
| 97 | +⚠️ Important: to create a safer deployment for this demo, you need to add or leave one environment in place, even if its optional keys have empty values like the ones above. The keys are: |
| 98 | + |
| 99 | +- <b>vpc_neptune</b> [optional]: if you want to deploy all the assets in your VPC, instead of creating a new one, you can change it here. |
| 100 | +- <b>s3_prefix_scooters_data_loc</b>: the path (S3 key) to use after the new S3 bucket name. |
| 101 | +- <b>lambda_datagen_num_vehicles</b>: number of scooters (graph nodes) to create in the dataset |
| 102 | +- <b>lambda_datagen_num_parts</b>: number of parts (graph nodes) to add per scooter. |
| 103 | +- <b>api_gtw_ip_addr_whitelist_list</b> [optional]: list of IPs or CIDR to be whitelisted in the API Gateway. |
| 104 | + |
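Before running `cdk synth`, it can help to sanity-check the environment you just edited. The sketch below is not part of the project: `check_environment` is a hypothetical helper, and treating the three keys not marked `[optional]` as required is an assumption drawn from the list above. The inline JSON mirrors the README example with the `...` placeholders removed.

```python
import json

# Assumption: keys not marked [optional] in the README are treated as required.
REQUIRED_KEYS = {
    "s3_prefix_scooters_data_loc",
    "lambda_datagen_num_vehicles",
    "lambda_datagen_num_parts",
}
NUMERIC_KEYS = ("lambda_datagen_num_vehicles", "lambda_datagen_num_parts")


def _is_positive_int(value) -> bool:
    """True if value parses as an integer greater than zero."""
    try:
        return int(value) > 0
    except (TypeError, ValueError):
        return False


def check_environment(context: dict, name: str) -> dict:
    """Return the named environment after basic checks (hypothetical helper)."""
    env = context.get("environments", {}).get(name)
    if env is None:
        raise KeyError(f"environment {name!r} not found under context.environments")
    missing = REQUIRED_KEYS - env.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for key in NUMERIC_KEYS:
        if not _is_positive_int(env[key]):
            raise ValueError(f"{key} must be a positive integer, got {env[key]!r}")
    return env


# Example input mirroring the README's 'production' environment.
cdk_json = json.loads("""
{
  "context": {
    "environments": {
      "production": {
        "vpc_neptune": "",
        "s3_prefix_scooters_data_loc": "scooters-graph-demo/neptune/data",
        "lambda_datagen_num_vehicles": "1000",
        "lambda_datagen_num_parts": "10",
        "api_gtw_ip_addr_whitelist_list": ""
      }
    }
  }
}
""")

env = check_environment(cdk_json["context"], "production")
print(env["s3_prefix_scooters_data_loc"])
```

To check a real file, replace the inline string with `json.load(open("cdk.json"))`, run from the main app directory.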
| 105 | + |
| 106 | +You can now synthesize your CDK stacks: |
| 107 | + |
| 108 | +💡 Tips: |
| 109 | +- Remember to have Docker running at this point! |
| 110 | +- If you receive an error like `--app is required...`, it's probably because you are running the command from a subdirectory. Navigate to the main app directory and try again. |
| 111 | + |
| 112 | +If this is your first time using the AWS CDK in this account, you need to bootstrap it first: |
| 113 | + |
| 114 | +``` |
| 115 | +$ cdk bootstrap --profile profile-aws-dev-sandbox |
| 116 | +``` |
| 117 | + |
| 118 | +Synthesize all your stacks: |
| 119 | + |
| 120 | +``` |
| 121 | +$ cdk synth --all --profile profile-aws-dev-sandbox |
| 122 | +``` |
| 123 | + |
| 124 | +If the previous steps succeeded, then we can deploy our entire project: |
| 125 | + |
| 126 | +💡 Tip: this deployment can take more than 15 minutes, especially if it's the first time. |
| 127 | + |
| 128 | +``` |
| 129 | +$ cdk deploy --all --profile profile-aws-dev-sandbox |
| 130 | +``` |
| 131 | + |
| 132 | +If you don't want CDK to prompt you for approval during the deployment: |
| 133 | +``` |
| 134 | +$ cdk deploy --profile profile-aws-dev-sandbox --require-approval never --all |
| 135 | +``` |
| 136 | + |
| 137 | +Once you have deployed the AWS CDK project successfully, you can carry on with the steps provided in the blog post and YouTube video series. |
| 138 | + |
| 139 | +#### Useful commands |
| 140 | + |
| 141 | + * `cdk ls` list all stacks in the app |
| 142 | + * `cdk synth` emits the synthesized CloudFormation template |
| 143 | + * `cdk deploy` deploy this stack to your default AWS account/region |
| 144 | + * `cdk diff` compare deployed stack with current state |
| 145 | + * `cdk docs` open CDK documentation |
| 146 | + |
| 147 | +## Architecture |
| 148 | + |
| 149 | + |
| 150 | + |
| 151 | +### Adapt the graph to your own use case |
| 152 | +The graph data model uses Any Python Tree Data (the `anytree` library) to build the vehicle hierarchy. You can modify this hierarchy via the Lambda function in the Data Generation stack. |
| 153 | + |
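To make the idea of a vehicle hierarchy concrete, here is a minimal sketch of how a tree of parts can be flattened into graph vertices and edges. It is not the project's Lambda code and does not use the tree library mentioned above. `Part`, `walk`, and the part names are hypothetical, chosen only for illustration with the standard library.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Part:
    """One node in the hierarchy (hypothetical stand-in for a tree node)."""
    name: str
    children: List["Part"] = field(default_factory=list)

    def add(self, child: "Part") -> "Part":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def walk(node: Part, parent: Optional[Part] = None) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (parent_name, node_name) pairs in pre-order; the root has parent None."""
    yield (parent.name if parent else None, node.name)
    for child in node.children:
        yield from walk(child, node)


# Hypothetical hierarchy; the real one is defined in the data-generation Lambda.
scooter = Part("scooter")
frame = scooter.add(Part("frame"))
frame.add(Part("handlebar"))
wheel = scooter.add(Part("wheel"))
wheel.add(Part("tyre"))

pairs = list(walk(scooter))
vertices = [name for _, name in pairs]              # one graph node per tree node
edges = [(p, c) for p, c in pairs if p is not None]  # one graph edge per parent/child link

print(vertices)
print(edges)
```

Changing the hierarchy then amounts to adding or removing `Part` nodes; the vertex and edge lists follow from the tree shape.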
| 154 | +### Data model |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | +## Cleanup |
| 159 | +1. Using the AWS CLI or the AWS console, empty the S3 bucket created by our CDK stack; e.g. s3://scooterss3stack-scootersdemoXXXX/. Otherwise, our CDK Removal Policy will not be able to delete the bucket. |
| 160 | +2. Run the command below to delete all resources deployed by our CDK project (see the architecture image above). It will ask whether you want to delete those stacks; enter Y. |
| 161 | +``` |
| 162 | +$ cdk destroy --all --profile profile-aws-dev-sandbox |
| 163 | +``` |