Draft: Elasticsearch integration #421

Open
wants to merge 16 commits into
base: main
1 change: 1 addition & 0 deletions docker-compose.dss.yml
@@ -19,6 +19,7 @@ services:
POSTGRES_DB: ttasmarthub
DATABASE_URL: postgres://postgres:secretpass@postgres_docker:5432/ttasmarthub
SESSION_SECRET: notasecret
ELASTICSEARCH_NODE: http://elasticsearch:9200
volumes:
- ".:/app:rw"
db:
6 changes: 6 additions & 0 deletions docker-compose.override.yml
@@ -10,10 +10,12 @@ services:
depends_on:
- db
- redis
- elasticsearch
environment:
- POSTGRES_HOST=postgres_docker
- REDIS_HOST=redis
- SMTP_HOST=mailcatcher
- ELASTICSEARCH_NODE=http://elasticsearch:9200
volumes:
- ".:/app:rw"
frontend:
@@ -37,9 +39,13 @@ services:
depends_on:
- db
- redis
- elasticsearch
environment:
- POSTGRES_HOST=postgres_docker
- REDIS_HOST=redis
- SMTP_HOST=mailcatcher
- ELASTICSEARCH_NODE=http://elasticsearch:9200
- ELASTICSEARCH_RECREATE_INDICES_THIS_MEANS_I_WANT_TO_REINDEX_EVERYTHING=true
- LOG_LEVEL=debug
volumes:
- ".:/app:rw"
9 changes: 9 additions & 0 deletions docker-compose.yml
@@ -36,6 +36,15 @@ services:
- "9443:9443"
environment:
- MAX_FILE_SIZE=30M
elasticsearch:
build: elasticsearch
environment:
discovery.type: "single-node"
# These http.cors.* envs are related to running ES browsing UIs locally. They should not be used in production
http.cors.enabled: "true"
http.cors.allow-origin: "http://localhost:8081"
ports:
- "9200:9200"
redis:
image: redis:5.0.6-alpine
command: ['redis-server', '--requirepass', '$REDIS_PASS']
7 changes: 5 additions & 2 deletions docs/boundary_diagram.md
@@ -1,7 +1,7 @@
System Boundary Diagram
=======================

![rendered boundary diagram](http://www.plantuml.com/plantuml/png/dLPHRnit37xFh-3oKAH04sDfNpOCGvtOJJCagvjpsvS2WQoZRIqwIHKfkyRG_pvHELzySJCKVKYiayIF7qNoaruJgyYfiGU_6ATjgX6Mp85-7moYMfyi32_JB156xHsZunFjcu6ger5kPLD8W_DNnnEh77-ulXfOnKYSfsxFlMDb7CQJ8DXis29CfqEe6XKPtgOPp7nojOeRM1bS7qmmadT7eVmuj2_Wy67h1y9uc6U63j3Lnq87_1nu-GZpyFaMfyOLMf_HUZYZxUnXRtgLGNs4peP0kHekfjSPxwnbUQJM90m-LN3XL-VMf_hmEVnx0D3jq3AvyAkyhGnFyhZT0r1jYB6v7NzbGRfLThQx3QnNzV5CMULqOLSS3Q_ECeE-TUVbLWNJWnXEBlzdT_I9osdyySog3KRw4nvCxnT9_9R8u8t453V0KStdHWDXN1cDBVx3cR2V2LUj8zQ65HllkjKkT82k6exMwtKc7dey-cNn2MSm3C4QNU24qz--nh-g5p3-6Y9IJiCDAisoCZ8KCinzmhIt5ZLJ0QmLgnEuJfPDi0Z64SlP4iJad76B7CeUDn_lCBRq7f4gIq_nEalMolS4uzp71cPAtosY70C8PY4dV214hv7e--wrLOsIudbZB09fLNYMj9Qyg6RuKKQNZZ4wPK4zi87izy6az4SAKbEqxsWqArcePyBO4oyYLk-lOAHOpHJXjlXkHfKCMpE2jl_RatPpiqYnyztFuPC-wMKBNhs4MT9to8VTH6b9zX49I2f9mrpEGH4XFns5r9tqvOtyxiKmjeeoajBBClK-OKBmyzq4_UDd1Epl4-LTw-ZRWVnzExZlfAEtrRZAzl9JQ1gti0YLMJXuvpwxwpT3M1a5KzFWyPaYhALIm4UON7u4hMBzEU-gV8gwkB-8oZ5YeRihCGjIjwGKIGrWVJEydXoh4A9VPS1AgQfqLA-z7QuZMgEb674DDpBUPisuv1F1jNU6VImrboFAMtR5SGVWMZcrG0ZtHIpD0i_0-iKYcmfckHRWOZmTJDvL-T-7wM0qlXfFnt3qwjNBvnbTiDi6xo5733uo0QF51Zbk18vrw1dkAF-3G3Q5KzVGFqR3UKQ8Ph5ay0wiSNS7Gsj12xgMYYMeCHZDM8EA2EXqt0JvMXjUmlIgWbzFJY4vYqJWUObnRGX1SEkJ0tHr6GhhhgftxwtQhvtqNbl0p4el6tHCFVgzuBk8z_KwHF_LY6xXbuJ4zEEfsCFgSlFpe3sHx1HBb8TAk3j8WW97lprXsxPKH8u6LzZrn5wQzwjhnrmbNgAJ5cUaedjY-swOfq_twwerMc4qymsOMYnvinzxv_BCz9nsiXbHrJHOHoDUXJ4XGqJjOqq25-YBxSoxg7yMeKz0gctVV2lFOhiA6j3LmHl7-on-5cQFo5PwqNqHNIMHKWWggWQtZucovIaA7K2DtmM0Pv129jLU2Ac7s6bcgsNINrljFBySd8zWOW9Qut1VDCkRLf9mOLXkAr7sjzMnX1wzjIZtMuqWgaM6lRna-W3QH6rH6_5LXkUNZqHNxkX5UprMtTjfJtqx9fCqiuQMnR50vGMW_7rqUjgMViTwrxbl6iEmzFvE5CdCfQMtZ80znEJW_ZH2Tne5x07aqNDoXC7wJUzfuHrGqXg9vicsS1au8rsfpUteaN3VpzwqLsQPU8skJetz3m00)
![rendered boundary diagram](http://www.plantuml.com/plantuml/png/dLPVRnit37_Ff-3oKAH04sDfNpOCGvtOJJCagvjpsvS2WIpHjXQTf8gI7MFeTvyedo--NBCM1HAVJaV--AF4FtnDKOIOQdFmYxRI98MmZD7JxuD14EkJXOxBD4k4GJeRqSOJwUg1K44BkpB9fy7vg-E94JP-kBuQ616HOjwwFS_IGbiC9vwneuu6OJeTGbLdS_0spM3cabKYR62bSNqmmK3E7ldoi6XUmE63w0V2S9XtXXRGgcEc0suESVa8y_3v3aVX2etBQ1sSqMxtiBEzog0kGAj38Ao2Y-brnZjhM1w9DJ63JnLSkDMvSKaTFeT_3m1Q2ScR2xVAOcLvajUD342reOZrxVW-CjAjCBDzCJ1EgEEPCCBApAunDBmwomowsgsLM-5r1s8ukVmVwU1IBAVnnpCZDXZeJtXGtNuajbiyWpViKzX0368-rn88mqfBRF0VpYIw6NELCg3IKKJjRFkrTHsWZOQZzNgTYViEZsvOVC8P3BqtMXOuvlItK_XNzGB6zpuWvMkuGI4ZVnbfyXbcUs7I6OCoKW5K8eeJk4ucQcyGjAN8PqiGahNQBh8hFMukt65ew3qYLJuznUii6SFl4mnp5sgO2Nct27mCm9XChl017vn4ec-RbqRKWeFdZJO894NXMQwrt8keXnzvTMaZ1YiCY0SsaEq-g9JyI9vZYlg6HhDo2sh6Z1t5HS3g_HMCCi6P8Dm5tyseucMZHj3aVtFITguNARIsUsdOjub6KDIIK0IvpC3Z_VUTqTvcQrvaetu3RjuIIOIgsDvAi-FVPy6Uqyl6MlKIPadUOJniDtXLHHUGW3YU9SpvDWGHngVFagEVgyCDFtidsJWbvB9ilK-Oa1tP2vD_t8mWkcsuEDiO-dh0_ryFjlbyzBOQLrcV7uLczjWWL7fXuPtzRA_V366b1KnrZSTdWb1nl88BCBdy2584zJbBLIu5MSnV164ZOg6xN-WM96v92E98H7Kpl9uS2--3MsN0IeYgv7Mdo5XS1pHQSahW6cvulRiyRPDB1DVEwkX2YVpiKHRPBz5x0DV2YWM2twOIfomy0-CMYwpAcAKvmSLuEfYyg_BFX-bWDBuQJyTmzEhLo-SPdR7RChE77D2x40aY6gmnXWt3HUaxYdlzvw5CruLJMrP-Z8Pr4I7enJA2is2qlJiO6WEIqINd3O52hsL41I8We9LXu_bjhUCIfbSL_7fo2iaQ908NbnZMcX0OLvqSe5LboQpRgDxldigjBLU09e1Z579PeutKeTznllMutpLySFUQgoT-IKYHVlmgUbZjtVmyQATKjtMJwnyLT_UG60MEVdj2jqcfW35eN66R4xrhxaVNdhf8SOcMKpmYGMg6bNV3l7tQZwj6QOL1ppUW5Ml6JT_trkIPxXXjYtT0eMYm3fgjCcF2206b2ueOpjCNqwgtNlyCGfy0HrjVVLURMNVb9Q1LtcaRUon-6gP7P5TwqUOOIaXSIi6B86hSZJH5v165Fe32RmB0CqW-aseQGleEukCkHsbMwzXspzAlj5LutvIpZdp_g9-2orI3HcAu9bRxK6r1Rmej9_-ki9zV3iv7CD4UZRROh-ZRiJyBNkZ8hPJRpjlgcBq8F6Nn2A0GkAIIZew7lIw0r44Qb9fjbUNvvQEEm6FCefaWgdQyxBM6biaapOocPJ7CKFa3K7u_EZrjyh9blM_SrRMEi7JkZikbdlEfxHX1T12IXUrf-FOq2jW3oAE9UuJUk4rd-if_bAH4edao3Lm6paXNOdCxUjHLjhitfPyt6w_HgbIR_m00)

UML Source
----------
@@ -22,6 +22,7 @@ Boundary(aws, "AWS GovCloud") {
Container(worker_app, "TTA Smart Hub Worker Application", "NodeJS, Bull", "Perform background work and data processing")
Container(clamav, "File scanning API", "ClamAV", "Internal application for scanning user uploads")
ContainerDb(www_db, "PostgreSQL Database", "AWS RDS", "Contains content and configuration for TTA Smart Hub")
ContainerDb(elasticsearch, "Elasticsearch", "AWS Elasticsearch", "Contains a copy of content used for searching TTA Smart Hub")
ContainerDb(www_s3, "AWS S3 bucket", "AWS S3", "Stores static file assets")
ContainerDb(www_redis, "Redis Database", "AWS Elasticache", "Queue of background jobs to work on")
}
@@ -50,6 +51,8 @@ BiRel(www_app, www_s3, "reads/writes data content", "vpc endpoint")
BiRel(worker_app, www_s3, "reads/writes data content", "vpc endpoint")
Rel(www_app, www_redis, "enqueues job parameters", "redis")
BiRel(worker_app, www_redis, "dequeues job parameters & updates status", "redis")
BiRel(worker_app, elasticsearch, "submits content for indexing", "elasticsearch")
BiRel(www_app, elasticsearch, "submits queries for data", "elasticsearch")
Boundary(development_saas, "CI/CD Pipeline") {
System_Ext(github, "GitHub", "HHS-controlled code repository")
System_Ext(circleci, "CircleCI", "Continuous Integration Service")
@@ -65,7 +68,7 @@ Lay_R(HSES, aws)
Instructions
------------

1. [Edit this diagram with plantuml.com](http://www.plantuml.com/plantuml/uml/dLPHRnit37xFh-3oKAH04sDfNpOCGvtOJJCagvjpsvS2WQoZRIqwIHKfkyRG_pvHELzySJCKVKYiayIF7qNoaruJgyYfiGU_6ATjgX6Mp85-7moYMfyi32_JB156xHsZunFjcu6ger5kPLD8W_DNnnEh77-ulXfOnKYSfsxFlMDb7CQJ8DXis29CfqEe6XKPtgOPp7nojOeRM1bS7qmmadT7eVmuj2_Wy67h1y9uc6U63j3Lnq87_1nu-GZpyFaMfyOLMf_HUZYZxUnXRtgLGNs4peP0kHekfjSPxwnbUQJM90m-LN3XL-VMf_hmEVnx0D3jq3AvyAkyhGnFyhZT0r1jYB6v7NzbGRfLThQx3QnNzV5CMULqOLSS3Q_ECeE-TUVbLWNJWnXEBlzdT_I9osdyySog3KRw4nvCxnT9_9R8u8t453V0KStdHWDXN1cDBVx3cR2V2LUj8zQ65HllkjKkT82k6exMwtKc7dey-cNn2MSm3C4QNU24qz--nh-g5p3-6Y9IJiCDAisoCZ8KCinzmhIt5ZLJ0QmLgnEuJfPDi0Z64SlP4iJad76B7CeUDn_lCBRq7f4gIq_nEalMolS4uzp71cPAtosY70C8PY4dV214hv7e--wrLOsIudbZB09fLNYMj9Qyg6RuKKQNZZ4wPK4zi87izy6az4SAKbEqxsWqArcePyBO4oyYLk-lOAHOpHJXjlXkHfKCMpE2jl_RatPpiqYnyztFuPC-wMKBNhs4MT9to8VTH6b9zX49I2f9mrpEGH4XFns5r9tqvOtyxiKmjeeoajBBClK-OKBmyzq4_UDd1Epl4-LTw-ZRWVnzExZlfAEtrRZAzl9JQ1gti0YLMJXuvpwxwpT3M1a5KzFWyPaYhALIm4UON7u4hMBzEU-gV8gwkB-8oZ5YeRihCGjIjwGKIGrWVJEydXoh4A9VPS1AgQfqLA-z7QuZMgEb674DDpBUPisuv1F1jNU6VImrboFAMtR5SGVWMZcrG0ZtHIpD0i_0-iKYcmfckHRWOZmTJDvL-T-7wM0qlXfFnt3qwjNBvnbTiDi6xo5733uo0QF51Zbk18vrw1dkAF-3G3Q5KzVGFqR3UKQ8Ph5ay0wiSNS7Gsj12xgMYYMeCHZDM8EA2EXqt0JvMXjUmlIgWbzFJY4vYqJWUObnRGX1SEkJ0tHr6GhhhgftxwtQhvtqNbl0p4el6tHCFVgzuBk8z_KwHF_LY6xXbuJ4zEEfsCFgSlFpe3sHx1HBb8TAk3j8WW97lprXsxPKH8u6LzZrn5wQzwjhnrmbNgAJ5cUaedjY-swOfq_twwerMc4qymsOMYnvinzxv_BCz9nsiXbHrJHOHoDUXJ4XGqJjOqq25-YBxSoxg7yMeKz0gctVV2lFOhiA6j3LmHl7-on-5cQFo5PwqNqHNIMHKWWggWQtZucovIaA7K2DtmM0Pv129jLU2Ac7s6bcgsNINrljFBySd8zWOW9Qut1VDCkRLf9mOLXkAr7sjzMnX1wzjIZtMuqWgaM6lRna-W3QH6rH6_5LXkUNZqHNxkX5UprMtTjfJtqx9fCqiuQMnR50vGMW_7rqUjgMViTwrxbl6iEmzFvE5CdCfQMtZ80znEJW_ZH2Tne5x07aqNDoXC7wJUzfuHrGqXg9vicsS1au8rsfpUteaN3VpzwqLsQPU8skJetz3m00)
1. [Edit this diagram with plantuml.com](http://www.plantuml.com/plantuml/uml/dLPVRnit37_Ff-3oKAH04sDfNpOCGvtOJJCagvjpsvS2WIpHjXQTf8gI7MFeTvyedo--NBCM1HAVJaV--AF4FtnDKOIOQdFmYxRI98MmZD7JxuD14EkJXOxBD4k4GJeRqSOJwUg1K44BkpB9fy7vg-E94JP-kBuQ616HOjwwFS_IGbiC9vwneuu6OJeTGbLdS_0spM3cabKYR62bSNqmmK3E7ldoi6XUmE63w0V2S9XtXXRGgcEc0suESVa8y_3v3aVX2etBQ1sSqMxtiBEzog0kGAj38Ao2Y-brnZjhM1w9DJ63JnLSkDMvSKaTFeT_3m1Q2ScR2xVAOcLvajUD342reOZrxVW-CjAjCBDzCJ1EgEEPCCBApAunDBmwomowsgsLM-5r1s8ukVmVwU1IBAVnnpCZDXZeJtXGtNuajbiyWpViKzX0368-rn88mqfBRF0VpYIw6NELCg3IKKJjRFkrTHsWZOQZzNgTYViEZsvOVC8P3BqtMXOuvlItK_XNzGB6zpuWvMkuGI4ZVnbfyXbcUs7I6OCoKW5K8eeJk4ucQcyGjAN8PqiGahNQBh8hFMukt65ew3qYLJuznUii6SFl4mnp5sgO2Nct27mCm9XChl017vn4ec-RbqRKWeFdZJO894NXMQwrt8keXnzvTMaZ1YiCY0SsaEq-g9JyI9vZYlg6HhDo2sh6Z1t5HS3g_HMCCi6P8Dm5tyseucMZHj3aVtFITguNARIsUsdOjub6KDIIK0IvpC3Z_VUTqTvcQrvaetu3RjuIIOIgsDvAi-FVPy6Uqyl6MlKIPadUOJniDtXLHHUGW3YU9SpvDWGHngVFagEVgyCDFtidsJWbvB9ilK-Oa1tP2vD_t8mWkcsuEDiO-dh0_ryFjlbyzBOQLrcV7uLczjWWL7fXuPtzRA_V366b1KnrZSTdWb1nl88BCBdy2584zJbBLIu5MSnV164ZOg6xN-WM96v92E98H7Kpl9uS2--3MsN0IeYgv7Mdo5XS1pHQSahW6cvulRiyRPDB1DVEwkX2YVpiKHRPBz5x0DV2YWM2twOIfomy0-CMYwpAcAKvmSLuEfYyg_BFX-bWDBuQJyTmzEhLo-SPdR7RChE77D2x40aY6gmnXWt3HUaxYdlzvw5CruLJMrP-Z8Pr4I7enJA2is2qlJiO6WEIqINd3O52hsL41I8We9LXu_bjhUCIfbSL_7fo2iaQ908NbnZMcX0OLvqSe5LboQpRgDxldigjBLU09e1Z579PeutKeTznllMutpLySFUQgoT-IKYHVlmgUbZjtVmyQATKjtMJwnyLT_UG60MEVdj2jqcfW35eN66R4xrhxaVNdhf8SOcMKpmYGMg6bNV3l7tQZwj6QOL1ppUW5Ml6JT_trkIPxXXjYtT0eMYm3fgjCcF2206b2ueOpjCNqwgtNlyCGfy0HrjVVLURMNVb9Q1LtcaRUon-6gP7P5TwqUOOIaXSIi6B86hSZJH5v165Fe32RmB0CqW-aseQGleEukCkHsbMwzXspzAlj5LutvIpZdp_g9-2orI3HcAu9bRxK6r1Rmej9_-ki9zV3iv7CD4UZRROh-ZRiJyBNkZ8hPJRpjlgcBq8F6Nn2A0GkAIIZew7lIw0r44Qb9fjbUNvvQEEm6FCefaWgdQyxBM6biaapOocPJ7CKFa3K7u_EZrjyh9blM_SrRMEi7JkZikbdlEfxHX1T12IXUrf-FOq2jW3oAE9UuJUk4rd-if_bAH4edao3Lm6paXNOdCxUjHLjhitfPyt6w_HgbIR_m00)
1. Copy and paste the final UML into the UML Source section
1. Update the img src and edit link target to the current values

53 changes: 53 additions & 0 deletions docs/elasticsearch.md
@@ -0,0 +1,53 @@
# Elasticsearch

Elasticsearch is a document-oriented data store with an advanced [search API][es-search-api]. In short: you send JSON documents to Elasticsearch, then later you can search that data. By default, Elasticsearch writes are eventually consistent--if you write some data and immediately search for it, the search may not return it yet, or may return an older version of what you wrote.

The TTA Hub uses [Elasticsearch][elasticsearch] to index Activity Reports and provide full-text search capabilities. In staging and production, this functionality is provided by [Cloud.gov][cg-elasticsearch]. For local development, a single Elasticsearch node is included in the `docker-compose` environment. Elasticsearch is a **secondary** datastore that should be regarded as ephemeral--over time, all Activity Report data that is written to Postgres should _also_ be written to Elasticsearch and available for searching, but it may also occasionally go out-of-sync with Postgres or need to be completely re-indexed.

## Integration Details

The application communicates with the Elasticsearch cluster in a manner similar to how it communicates with Postgres. Client configuration details (endpoints, access keys, etc.) are provided via the `VCAP_SERVICES` environment variable in the cloud.gov environment. Application code creates an [Elasticsearch client][es-client], configures it appropriately, then uses it to submit and query for data.
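
As a minimal sketch of the configuration step described above, the helper below derives client options from the environment. The function name, the `aws-elasticsearch` service label, and the credential field names (`uri`, `access_key`, `secret_key`) are assumptions for illustration and are not taken from this change; only `VCAP_SERVICES` and `ELASTICSEARCH_NODE` appear in the source.

```javascript
// Hypothetical helper: derive Elasticsearch client options from the environment.
// ASSUMPTIONS: the service label ('aws-elasticsearch') and the credential field
// names (uri, access_key, secret_key) are illustrative, not confirmed by this PR.
function getElasticsearchConfig(env) {
  if (env.VCAP_SERVICES) {
    const services = JSON.parse(env.VCAP_SERVICES);
    const [service] = services['aws-elasticsearch'] || [];
    if (service && service.credentials) {
      const { uri, access_key: accessKey, secret_key: secretKey } = service.credentials;
      return { node: uri, accessKey, secretKey };
    }
  }
  // Local development: fall back to the node configured in docker-compose.
  return { node: env.ELASTICSEARCH_NODE || 'http://localhost:9200' };
}

// The `node` value would then be passed to the client constructor, e.g.
//   const { Client } = require('@elastic/elasticsearch');
//   const client = new Client({ node: getElasticsearchConfig(process.env).node });
```

Keeping the lookup in one pure function means the same code path works locally (where only `ELASTICSEARCH_NODE` is set) and in cloud.gov (where `VCAP_SERVICES` is present).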

### Use of Sequelize Hooks

The Elasticsearch code uses [Sequelize hooks][sequelize-hooks] to know _when_ to write data to Elasticsearch. As Activity Reports are saved (or destroyed), these custom hooks schedule Worker jobs to propagate the changes from Postgres to Elasticsearch.
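
The hook-to-job flow described above could look roughly like the following. This is a sketch, not the code in this change: `registerIndexingHooks`, the hook names, and the `index` / `remove` job names are hypothetical. It relies only on Sequelize's `addHook(hookType, name, fn)` and on a queue object exposing Bull's `add(name, data)`.

```javascript
// Hypothetical sketch: register Sequelize hooks that enqueue Worker jobs.
// `model` is any Sequelize model; `queue` is assumed to behave like a Bull queue.
// The job names ('index', 'remove') are illustrative assumptions.
function registerIndexingHooks(model, queue) {
  // Fires after an instance is created or updated via save().
  model.addHook('afterSave', 'scheduleIndex', (instance) => (
    queue.add('index', { id: instance.id })
  ));
  // Fires after an instance is destroyed.
  model.addHook('afterDestroy', 'scheduleRemove', (instance) => (
    queue.add('remove', { id: instance.id })
  ));
}
```

Only the record's `id` is enqueued here, so the Worker reads the current state from Postgres when the job runs rather than indexing a stale snapshot. Note that these instance-level hooks do not fire for bulk operations unless `individualHooks: true` is passed.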

### Worker

Only the Worker (background task queue) writes to Elasticsearch. The reasons for this are:

1. **A failed Elasticsearch write should not interrupt the user's day.** If we fail to write to the application's _primary_ data store (Postgres), the user should know their data has not actually been saved. But Elasticsearch is a secondary data store, and absolutely not the user's problem.
2. **Elasticsearch writes will be eventually consistent anyway.** It is not guaranteed that, immediately after a write, a request for the same data will return what was written. So introducing an additional delay to the write for worker processing is not a big deal.
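
A rough sketch of what the Worker side might do with those jobs follows. The names `makeJobHandlers` and `loadReport`, and the shape of `job.data`, are assumptions for illustration; `client.index` and `client.delete` are called with the parameter shapes used by the 7.x `@elastic/elasticsearch` client.

```javascript
// Hypothetical sketch: Worker-side job handlers that write to Elasticsearch.
// `client` is an @elastic/elasticsearch 7.x client (or a compatible stub),
// `loadReport(id)` is an assumed function that reads the report from Postgres.
function makeJobHandlers({ client, index, loadReport }) {
  return {
    async index(job) {
      const report = await loadReport(job.data.id);
      if (!report) return; // the report was removed before the job ran
      await client.index({ index, id: String(report.id), body: report });
    },
    async remove(job) {
      await client.delete({ index, id: String(job.data.id) });
    },
  };
}
```

If a handler throws, the failure stays inside the Worker, where the queue can record or retry it, and the user's request that saved the report to Postgres is unaffected.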

### Mappings

It is possible to skip telling Elasticsearch about the shape of your data and let it infer a schema from what you send it. In practice, though, you will want to configure [mappings][es-mappings] that instruct Elasticsearch how certain data fields should be stored. Mappings are used to answer questions like:

1. Does the text in this field need to be full text searchable (like the "Comments" field on a feedback form) or can it be restricted to exact matches only (like the "Department" field on a feedback form)?
2. What format is used by the application to represent dates and times?

Mappings are configured in application code in [`lib/elasticsearch/mappings.js`](../src/lib/elasticsearch/mappings.js).
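
To make the two questions above concrete, here is an illustrative mapping for the feedback-form example. It is not the contents of `mappings.js`; the field names, the index name, and the `buildCreateIndexRequest` helper are hypothetical.

```javascript
// Hypothetical mapping for the feedback-form example above.
const feedbackMappings = {
  properties: {
    comments: { type: 'text' }, // analyzed, full-text searchable
    department: { type: 'keyword' }, // stored as-is, exact matches only
    submittedAt: { type: 'date', format: 'strict_date_optional_time' }, // ISO 8601
  },
};

// Build the parameters for `client.indices.create(...)` in the 7.x client.
function buildCreateIndexRequest(index, mappings) {
  return { index, body: { mappings } };
}

// Usage (assuming `client` is an @elastic/elasticsearch 7.x client):
//   await client.indices.create(buildCreateIndexRequest('feedback', feedbackMappings));
```

Mappings for existing fields generally cannot be changed in place, which is one reason an index may need to be recreated and fully re-indexed.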

### Ingest Pipelines

If your data needs to be transformed or normalized before storage, Elasticsearch provides a feature called [Ingest Pipelines][es-pipelines] that can be used to do this processing. Example uses of pipelines are:

- Stripping HTML tags from fields containing rich-formatted text (you likely don't want user input matching against raw HTML tags)
- Indexing text content inside common document formats (.pdf, .docx, etc.) using the [Ingest Attachment Processor plugin][ingest-attachment]

Pipelines are configured in application code in [`lib/elasticsearch/pipelines.js`](../src/lib/elasticsearch/pipelines.js).
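
The two example uses above map onto Elasticsearch's `html_strip` and `attachment` processors. The sketch below is illustrative rather than the contents of `pipelines.js`; the pipeline id, the field names, and the `buildPipelineRequest` helper are hypothetical.

```javascript
// Hypothetical pipeline definition covering the two example uses above.
function buildPipelineRequest(id) {
  return {
    id,
    body: {
      description: 'Strip HTML and extract text from attachments',
      processors: [
        // Remove HTML tags from a rich-text field before it is indexed.
        { html_strip: { field: 'comments' } },
        // Extract text from a base64-encoded file held in the `data` field
        // (requires the ingest-attachment plugin).
        { attachment: { field: 'data', target_field: 'attachment' } },
      ],
    },
  };
}

// Usage (assuming `client` is an @elastic/elasticsearch 7.x client):
//   await client.ingest.putPipeline(buildPipelineRequest('feedback-pipeline'));
//   await client.index({ index: 'feedback', pipeline: 'feedback-pipeline', body: doc });
```

Processors run in the order listed, and a document only passes through the pipeline when the `pipeline` parameter is supplied on the write (or the index has a default pipeline configured).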

## Hazards and Pitfalls

### Amazon / Elastic conflict

Elasticsearch in cloud.gov is [AWS OpenSearch][aws-opensearch] (previously AWS Elasticsearch) under the hood. "OpenSearch" is AWS's fork of the Elasticsearch product. Newer versions of official Elastic clients have added code to detect when they are communicating with forked Elasticsearch servers and refuse to run. For now, pinning `@elastic/elasticsearch` to version 7.13.0 (the last version without this check) works. In the future, we may want to evaluate any official clients published by AWS / OpenSearch.

[elasticsearch]: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html
[es-search-api]: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html
[cg-elasticsearch]: https://cloud.gov/docs/services/aws-elasticsearch/
[es-client]: https://www.npmjs.com/package/@elastic/elasticsearch/v/7.13.0
[sequelize-hooks]: https://sequelize.org/master/manual/hooks.html
[es-pipelines]: https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html
[ingest-attachment]: https://www.elastic.co/guide/en/elasticsearch/plugins/current/ingest-attachment.html
[aws-opensearch]: https://aws.amazon.com/opensearch-service/
[es-mappings]: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html
4 changes: 4 additions & 0 deletions elasticsearch/Dockerfile
@@ -0,0 +1,4 @@
FROM docker.elastic.co/elasticsearch/elasticsearch:7.4.0

# NOTE: ingest-attachment is supported by default on AWS Elasticsearch.
RUN elasticsearch-plugin install --batch ingest-attachment
1 change: 1 addition & 0 deletions manifest.yml
@@ -24,6 +24,7 @@ applications:
- ttahub-((env))
- ttahub-redis-((env))
- ttahub-document-upload-((env))
- ttahub-elasticsearch-((env))
processes:
- type: web
instances: ((web_instances))
2 changes: 2 additions & 0 deletions package.json
@@ -173,7 +173,9 @@
"supertest": "^6.1.3"
},
"dependencies": {
"@acuris/aws-es-connection": "^2.3.0",
"@babel/runtime": "^7.12.1",
"@elastic/elasticsearch": "7.13.0",
"adm-zip": "^0.5.1",
"aws-sdk": "^2.826.0",
"axios": "^0.21.1",
5 changes: 4 additions & 1 deletion src/index.js
@@ -1,10 +1,13 @@
require('newrelic');
// require('newrelic');

/* eslint-disable import/first */
import app from './app';
import { auditLogger } from './logger';
import { initElasticsearchIntegration } from './lib/elasticsearch';
/* eslint-enable import/first */

initElasticsearchIntegration();

const port = process.env.PORT || 8080;
const server = app.listen(port, () => {
auditLogger.info(`Listening on port ${port}`);