|
| 1 | +## Deploying Galileo |
| 2 | + |
| 3 | +Before you start the deployment, make sure: |
| 4 | + |
| 5 | +- Docker is running, with sufficient virtual disk space. |
| 6 | +- Your AWS credentials are set up and available in the shell. |
| 7 | +- You have reviewed the EULA before requesting access to Bedrock models. |
| 8 | + |
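The checklist above can be partly automated. The following is a minimal preflight sketch, not part of the Galileo tooling: the function names `have_cmd` and `preflight` are hypothetical, and it assumes the `docker`, `aws`, and `pnpm` executables are the ones you deploy with. It does not check disk space or the Bedrock EULA.

```shell
#!/bin/sh
# Preflight sketch (hypothetical helper, not shipped with Galileo).

# Return 0 if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print one line per check; return non-zero if any check fails.
preflight() {
  rc=0
  for cmd in docker aws pnpm; do
    if have_cmd "$cmd"; then
      echo "ok: $cmd found"
    else
      echo "missing: $cmd"
      rc=1
    fi
  done
  # `docker info` fails when the Docker daemon is not reachable.
  if have_cmd docker && ! docker info >/dev/null 2>&1; then
    echo "docker is installed, but the daemon is not reachable"
    rc=1
  fi
  # `aws sts get-caller-identity` fails without valid credentials.
  if have_cmd aws && ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "AWS credentials are not available in this shell"
    rc=1
  fi
  return "$rc"
}
```

Call `preflight` before deploying and stop if it returns a non-zero status. To review Docker's disk usage by hand, `docker system df` prints a summary.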
| 9 | +### Using the CLI |
| 10 | + |
| 11 | +!!! tip |
| 12 | + We recommend using the CLI for individuals, developer accounts, trials, and demos. |
| 13 | + |
| 14 | +To deploy Galileo: |
| 15 | + |
| 16 | +1. Open a CLI terminal and navigate to the Galileo directory. |
| 17 | +2. Run these commands: |
| 18 | + |
| 19 | + ``` |
| 20 | + export AWS_REGION="us-west-2"  # replace with the Region you are deploying to |
| 21 | + export AWS_PROFILE=default |
| 22 | + pnpm bootstrap-account |
| 23 | + pnpm galileo-cli deploy |
| 24 | + ``` |
| 25 | +
|
| 26 | + This starts a guided CLI workflow that prompts you for input. |
| 27 | +
|
| 28 | + !!! note |
| 29 | +
|
| 30 | + If you get a `(node:12100) [EACCES] Error: spawn galileo-cli EACCES` message, ignore it. |
| 31 | +
|
| 32 | + The following options are displayed for selecting a foundation model. |
| 33 | +
|
| 34 | +  |
| 35 | +
|
| 36 | +3. To navigate these prompts: |
| 37 | +
|
| 38 | + - The option with the green filled circle is the currently *selected* option. All other options are *unselected*. |
| 39 | + - The option in underlined blue text is the currently *highlighted* option. |
| 40 | + - Use the **up** and **down** arrow keys to move the highlight. |
| 41 | + - Use the **spacebar** to select/deselect the currently highlighted option. |
| 42 | + - To submit your final answer, press **Enter**. |
| 43 | +
|
| 44 | +4. Select the following options using the CLI: |
| 45 | +
|
| 46 | + - AWS Profile: **default** |
| 47 | + - AWS Region: **(press Enter if the pre-filled Region is correct; otherwise, enter the Region code)** |
| 48 | + - Administrator email address: **(enter your email address)** |
| 49 | + - Administrator username: **admin** |
| 50 | + - Deploy main application stack?: **Y** |
| 51 | + - Choose the foundation models to support: **(deselect all, then press Enter)** |
| 52 | + - Foundation model region?: **us-west-2** |
| 53 | + - Enable Bedrock?: **Y** |
| 54 | + - Bedrock Region: **us-west-2** |
| 55 | + - Bedrock model ids: **Anthropic Claude (anthropic.claude-v2)** |
| 56 | + - Bedrock endpoint url (optional): **(press Enter, should be blank)** |
| 57 | + - Choose the default foundation model: **bedrock::anthropic.claude-v2** |
| 58 | + - Press Enter for the rest of the prompts. |
| 59 | +
|
| 60 | +Your terminal displays this information: |
| 61 | +
|
| 62 | +``` |
| 63 | + ____ __ _ _ _ _ _ |
| 64 | + / __ \ __ _ __ __ ___ / /__ _ __ _ | |(_)| | ___ ___ ___ | |(_) |
| 65 | + / / _` | / _` |\ \ /\ / // __| / // _` | / _` || || || | / _ \ / _ \ _____ / __|| || | |
| 66 | + | | (_| || (_| | \ V V / \__ \ / /| (_| || (_| || || || || __/| (_) ||_____|| (__ | || | |
| 67 | + \ \__,_| \__,_| \_/\_/ |___//_/ \__, | \__,_||_||_||_| \___| \___/ \___||_||_| |
| 68 | + \____/ |___/ |
| 69 | +✔ Config file name? … config.json |
| 70 | +✔ Application Name (stack/resource naming) … Galileo |
| 71 | +✔ AWS Profile … default |
| 72 | +✔ AWS Region (app) … us-west-2 |
| 73 | +✔ Administrator email address Enter email address to automatically create Cognito admin user, otherwise leave blank |
| 74 | + … someone@somewhere.com |
| 75 | +✔ Administrator username … yourusername |
| 76 | +✔ Choose the foundation models to support › |
| 77 | +✔ Foundation model region? … us-west-2 |
| 78 | +✔ Enable Bedrock? … yes |
| 79 | +✔ Bedrock region … us-west-2 |
| 80 | +✔ Loading available Bedrock models |
| 81 | +✔ Bedrock model ids › Anthropic Claude (anthropic.claude-v2) |
| 82 | +✔ Bedrock endpoint url (optional) … |
| 83 | +✔ Choose the default foundation model › bedrock::anthropic.claude-v2 |
| 84 | +✔ Embedding model |
| 85 | +Enter the model id to use for embeddings, supports any AutoML model |
| 86 | + |
| 87 | +Example: sentence-transformers/all-mpnet-base-v2, intfloat/multilingual-e5-large, sentence-transformers/all-MiniLM-L6-v2 |
| 88 | + … sentence-transformers/all-mpnet-base-v2 |
| 89 | +✔ Embedding Vector Size |
| 90 | +Enter the vector size for the chosen embedding model |
| 91 | + … 768 |
| 92 | +✔ Embedding model instance type |
| 93 | +Enable autoscaling the embedding instance capacity based |
| 94 | + |
| 95 | +Recommend "ml.g4dn.xlarge" for smaller datasets, and "ml.g4dn.2xlarge" for larger datasets |
| 96 | + … ml.g4dn.xlarge |
| 97 | +✔ Embedding model max capacity (autoscaling) |
| 98 | +Enable autoscaling the embedding instance capacity based |
| 99 | + |
| 100 | +Ensure adequate Service Quota limit for SageMaker > "ml.g4dn.xlarge for endpoint usage" |
| 101 | + … 1 |
| 102 | +✔ Indexing Pipeline instance type |
| 103 | +Instance type used for processing dataset files and indexing to vector store |
| 104 | + … ml.t3.large |
| 105 | +✔ Indexing Pipeline max containers |
| 106 | +Number of containers used for indexing files to vector store |
| 107 | + |
| 108 | +Ensure adequate Service Quota limit for SageMaker > "ml.t3.large for processing job" |
| 109 | + … 5 |
| 110 | +✔ Create vector store "index"? |
| 111 | +If enabled, will create a database index for the data to improve search over large datasets |
| 112 | + |
| 113 | +Recommended for very large datasets |
| 114 | + … no |
| 115 | +✔ Deploy sample dataset? › |
| 116 | +✔ Enable tooling in dev stage (SageMaker Studio, PgAdmin)? › |
| 117 | +Synthesizing project repository... |
| 118 | +? [CDK DEPLOY] Execute the following command in 615092085770? |
| 119 | +cdk deploy --require-approval never --region us-west-2 --profile default -c "configPath=config.json" Dev/Galileo |
| 120 | + … yes |
| 121 | +``` |
| 122 | +
|
| 123 | +!!! info |
| 124 | +
|
| 125 | + It takes about 40 minutes to build and deploy everything. While you wait, continue to the next page to see how this project was built and how to extend it. |
| 126 | +
|
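If you want to follow the deployment from a second terminal, one option is to poll the CloudFormation stack status. This is a sketch under assumptions: the helper names are hypothetical, and the stack name is not stated on this page, so pass the name that appears in your CloudFormation console (a CDK stack at `Dev/Galileo` is often named `Dev-Galileo`, but verify this).

```shell
#!/bin/sh
# Polling sketch (hypothetical helpers, not part of the Galileo CLI).

# Return 0 if a CloudFormation status string denotes an operation in progress.
is_in_progress() {
  case "$1" in
    *_IN_PROGRESS) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the current status of the named stack, for example CREATE_COMPLETE.
stack_status() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query 'Stacks[0].StackStatus' \
    --output text
}

# Poll once a minute until the stack leaves an *_IN_PROGRESS state
# or the status can no longer be read.
wait_for_stack() {
  while s="$(stack_status "$1")" && is_in_progress "$s"; do
    echo "$1: $s"
    sleep 60
  done
  echo "$1: ${s:-unknown}"
}
```

For example, `wait_for_stack Dev-Galileo` prints the status every minute and stops at the first terminal status.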
| 127 | +#### Updating configuration settings |
| 128 | +
|
| 129 | +The CLI generates an application configuration file at `demo/infra/config.json`, which persists your configuration. To change the configuration, either edit this file and redeploy, or run the CLI again. |
| 130 | +
|
| 131 | +Run `pnpm run galileo-cli --help` to display the CLI help. |
| 132 | +
|
| 133 | +For more details on CLI operations, refer to the [CLI page](../../developer-guide/cli). |
| 134 | +
|
| 135 | +!!! info "Cross-Region deployments" |
| 136 | + Galileo CLI allows you to deploy your LLM stack and application stack into different Regions. |
| 137 | +
|
| 138 | +### Using a CI/CD pipeline |
| 139 | +
|
| 140 | +!!! tip |
| 141 | + We recommend using the CI/CD pipeline deployment method for live services and for shared team accounts. |
| 142 | +
|
| 143 | +**Note**: Make sure the AWS credentials in your shell are correct. |
| 144 | +
|
| 145 | +1. Create a CodeCommit repository named `galileo` in your target account and Region. |
| 146 | +2. Push this git repository to the `mainline` branch. |
| 147 | +3. Run `pnpm run deploy:pipeline`. |
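The three steps above can be sketched as a shell script. Treat it as an illustration rather than the project's own tooling: the `codecommit` remote name and the helper function names are assumptions, and the push only succeeds if git is already configured to authenticate against CodeCommit (for example, with the AWS CLI credential helper).

```shell
#!/bin/sh
# Pipeline bootstrap sketch (hypothetical helpers, not shipped with Galileo).

REPO_NAME="galileo"
REGION="${AWS_REGION:-us-west-2}"

# Build the HTTPS clone URL of a CodeCommit repository in a Region.
codecommit_url() {
  printf 'https://git-codecommit.%s.amazonaws.com/v1/repos/%s\n' "$1" "$2"
}

# Run the three documented steps in order; stop at the first failure.
deploy_pipeline() {
  # 1. Create the repository in the target account and Region.
  aws codecommit create-repository \
    --repository-name "$REPO_NAME" --region "$REGION" &&
  # 2. Push the current commit to the mainline branch.
  git remote add codecommit "$(codecommit_url "$REGION" "$REPO_NAME")" &&
  git push codecommit HEAD:refs/heads/mainline &&
  # 3. Deploy the pipeline.
  pnpm run deploy:pipeline
}
```

Run `deploy_pipeline` from the root of the Galileo repository. The script defines functions only, so sourcing it has no side effects until you call one.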
| 148 | +
|
| 149 | +### Deploying manually |
| 150 | +
|
| 151 | +!!! tip |
| 152 | + We recommend using a manual deployment method only if you need to have full control and want to modify the application. |
| 153 | +
|
| 154 | +```sh |
| 155 | +pnpm install |
| 156 | +pnpm build |
| 157 | +
|
| 158 | +cd demo/infra |
| 159 | +pnpm exec cdk deploy --app cdk.out --require-approval never Dev/Galileo |
| 160 | +pnpm exec cdk deploy --app cdk.out --require-approval never Dev/Galileo-SampleDataset # (optional) |
| 161 | +``` |
| 162 | + |
| 163 | +## What is deployed? |
| 164 | + |
| 165 | +As part of the deployment, the following components are deployed in your AWS account: |
| 166 | + |
| 167 | +- A pre-built conversational UI that enables contextual conversation with memory, |
| 168 | +- An optimized embeddings vector store based on RDS Postgres and `pgvector`, |
| 169 | +- A scalable and elastic data ingestion pipeline, |
| 170 | +- A low latency text embeddings inference engine, |
| 171 | +- Retrieval augmented generation (RAG) features, and |
| 172 | +- A choice of open source large language models. |
| 173 | + |
| 174 | + |
| 175 | + |
| 176 | +## Next steps |
| 177 | + |
| 178 | +- [Validate the deployment](validate-deployment.md) |
| 179 | +- [Set up Cloud9 as your development environment](cloud9-ide.md) |