-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(db): Setup neo4j data migrations #2828
Conversation
Implement a migration to merge duplicate user accounts with reactive programming. Those duplicate user accounts existed, because around 40 users have decided to register again while we experienced a bug related to normalized emails in our database.
* Add unique index for `Migration`s * Fix proper use of `next` callback. First argument is potential error. * Update migration template
- having duplicate Location nodes in the production database blocks us from adding a unique constraint, so that Locations are not created which have the same id.
- it's more human readable and consistent with all other nodes in production
… into neo4j_data_migrations
… into neo4j_data_migrations
- create command should be run with --date-format to be more human readable, and --template-file to use our template instead of migrate's default - rename migrations - rename createdAt to migratedAt to remove ambiguity - do not merge relationships for Location nodes as we don't want to create duplicate relationships - use singular locationId as it's iterating one at a time
We have to figure out if `mergeRels: true` is actually avoiding duplicate relationships 🤔. Before: (l1)-[:IS_IN]->(l2) (l1)-[:IS_IN]->(l3) After: (l1)-[:IS_IN]->(new) (l1)-[:IS_IN]->(new)
… into neo4j_data_migrations
…uman-Connection into neo4j_data_migrations
Test summaryRun details
View run in Cypress Dashboard ➡️ This comment has been generated by cypress-bot as a result of this project's GitHub integration settings. You can manage this integration in this project's settings in the Cypress Dashboard |
{% tabs %} | ||
{% tab title="Docker" %} | ||
```bash | ||
docker-compose exec backend yarn run db:migrate init |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by mattwr18
Jan 20, 2020
Outdated (history rewrite) - original diff
@@ -53,6 +53,27 @@ can issue GraphQL requests or access GraphQL Playground in the browser.
![GraphQL Playground](../.gitbook/assets/graphql-playground.png)
+### Database Indices and Constraints
+
+Database indices and constraints need to be created when the database and the
+backend is running:
+
+{% tabs %}
+{% tab title="Docker" %}
+```bash
+docker-compose exec backend yarn run db:migrate init
I'm getting this error when running this command:
Error: ERROR_TRANSACTION_FAILED
at new TransactionError (/nitro-backend/node_modules/neode/build/TransactionError.js:21:126)
at /nitro-backend/node_modules/neode/build/index.js:456:33
at processTicksAndRejections (internal/process/task_queues.js:94:5)
at Store.init (/nitro-backend/src/db/migrate/store.js:18:7) {
errors: [
{
query: 'CREATE CONSTRAINT ON (model:Location) ASSERT model.id IS UNIQUE',
params: {},
error: [Neo4jError]
},
{
query: 'CREATE CONSTRAINT ON (model:Donations) ASSERT model.id IS UNIQUE',
params: {},
error: [Neo4jError]
},
{
query: 'CREATE CONSTRAINT ON (model:Report) ASSERT model.id IS UNIQUE',
params: {},
error: [Neo4jError]
}
]
}
do I need to delete all indexes on the database before running this command? or does this error message mean something more to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by mattwr18
Jan 21, 2020
looking into this further @roschaefer ... this has nothing, it seems, to do with Docker
vs. local setup, but production vs seed data.
It appears there is an error thrown when setting up the unique constraints on Location
and then the rest of the 4 👆 don't get run. I ran them individually in the neo4j
browser and got this error for Location
, the other 4 were successful.
Unable to create CONSTRAINT ON ( location:Location ) ASSERT location.id IS UNIQUE:
Both Node(48981) and Node(186812) have the label `Location` and property `id` = 'place.4463223470618060'
Looking this Location
up by it's id results in 4 identical Locations
in production data, that look like
{
"lng": 48.06028,
"nameES": "Villingen-Schwenningen",
"nameFR": "Villingen-Schwenningen",
"nameIT": "Villingen-Schwenningen",
"nameEN": "Villingen-Schwenningen",
"type": "place",
"namePT": "Villingen-Schwenningen",
"nameDE": "Villingen-Schwenningen",
"nameNL": "Villingen-Schwenningen",
"name": "Villingen-Schwenningen",
"namePL": "Villingen-Schwenningen",
"id": "place.4463223470618060",
"lat": 8.45861
}
I guess we don't really need to run init
in production, right? should we delete these duplicate records anyways?
Looking at our db schema in production confirms that we do no have any CONSTRAINT
on
Location`, so I guess we want to set that up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by roschaefer
Jan 21, 2020
@mattwr18 yes, we want to set that up. I used neode
to setup unique constraints and if that fails on production data, then we had one constraint missing on production
. Maybe we can run a migration before we setup the constraints. Sounds odd but is the best solution that comes to my mind right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by mattwr18
Jan 22, 2020
I've added a migration that runs first... but we would need to run yarn db:migrate up
before running yarn db:migrate init
... otherwise the latter will fail.
Also, I changed the timestamp of the merge_duplicate_location_nodes
so that it run before the merge_duplicate_user_nodes
, but actually that's not important.
We have talked about adding the migrations we have already run on production data to this folder so that we have a comprehensive list of every migration we have run... should those timestamps be altered as well to appear at the top of the list and be run first?
Also, when I added the unique constraints on Migration
node, I realized it failed as it was creating multiple nodes for the same migration.
I've updated it to MERGE
by title, then ON CREATE SET
the properties and changed the timestamp to createdAt
so that we have a record of when the migration was run in production, and it's human readable.
```bash | ||
# in folder backend/ | ||
# make sure your database is running on http://localhost:7474/browser/ | ||
yarn run db:migrate init |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by mattwr18
Jan 20, 2020
Outdated (history rewrite) - original diff
@@ -53,6 +53,27 @@ can issue GraphQL requests or access GraphQL Playground in the browser.
![GraphQL Playground](../.gitbook/assets/graphql-playground.png)
+### Database Indices and Constraints
+
+Database indices and constraints need to be created when the database and the
+backend is running:
+
+{% tabs %}
+{% tab title="Docker" %}
+```bash
+docker-compose exec backend yarn run db:migrate init
+```
+{% endtab %}
+
+{% tab title="Without Docker" %}
+```bash
+# in folder backend/
+# make sure your database is running on http://localhost:7474/browser/
+yarn run db:migrate init
when I ran this locally, without Docker
... it seems to have run successfully, based on this output
yarn db:migrate init
yarn run v1.19.0
warning ../../package.json: No license field
$ yarn run __migrate --store ./src/db/migrate/store.js init
warning ../../package.json: No license field
$ migrate --compiler 'js:@babel/register' --migrations-dir ./src/db/migrations --store ./src/db/migrate/store.js init
migrations dir : /home/mattwr18/code/Human-Connection/backend/src/db/migrations
Successfully created database indices and constraints!
init : undefined
Done in 3.64s.
|
||
class Store { | ||
async init(next) { | ||
const neode = getNeode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by roschaefer
Jan 23, 2020
Outdated (history rewrite) - original diff
@@ -35,7 +35,7 @@ class Store {
const session = driver.session()
const readTxResultPromise = session.readTransaction(async txc => {
const result = await txc.run(
- 'MATCH (migration:Migration) RETURN migration {.*} ORDER BY migration.timestamp DESC',
+ 'MATCH (migration:Migration) RETURN migration {.*} ORDER BY migration.createdAt DESC',
One could argue we should save two attributes: createdAt
and migratedAt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by mattwr18
Jan 23, 2020
maybe, but I guess those will be the same cause the Migration
is only created in the database when it is run for the first time... after that, since we use merge, it is not updated.
I thought about calling it migratedAt
, and wouldn't be opposed to it, but thought that createdAt
could also be ok...
title: { type: 'string', primary: true, token: true }, | ||
description: { type: 'string' }, | ||
migratedAt: { type: 'string', isoDate: true, default: () => new Date().toISOString() }, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authored by roschaefer
Jan 23, 2020
Outdated (history rewrite) - original diff
@@ -1,5 +1,5 @@
export default {
title: { type: 'string', primary: true, token: true },
description: { type: 'string' },
- timestamp: { type: 'number', unique: true },
+ createdAt: { type: 'string', isoDate: true, default: () => new Date().toISOString() },
Yeah, so this is certainly migratedAt
.
🍰 Pullrequest
I use #2381 as a use case to add consistent and reliable data migrations to our backend.
Issues
Todo
migrate
in docker image (production
build stage)migrate
db/
yarn run prod:migrate init
onpost-install
andyarn run prod:migrate up
onpost-upgrade
I guess we have to do the latter manually (during a downtime)