Set up postgresql on new server #955

Closed
mlissner opened this issue Apr 24, 2019 · 9 comments
@mlissner
Member

This should be fairly routine. The first idea is to set it up as a secondary read replica. With that done, we'll see how performance improves. If it's all good, we'll swap over to using this DB as our primary, handling both reads and writes.

One question is whether this should be dockerized. Unsure for the moment. I suspect not.

mlissner self-assigned this Apr 24, 2019
@mlissner
Member Author

OK, I've read enough scary things about running postgresql in docker on prod to not do that. Hence, this is just about getting basic replication going for the moment.

@mlissner
Member Author

The new server has the DB folder mounted at /database. Changing the data directory is pretty easy, especially on a new install; it's just a tweak in the config file, as explained here:

https://www.digitalocean.com/community/tutorials/how-to-move-a-postgresql-data-directory-to-a-new-location-on-ubuntu-18-04

Did it, and confirmed with:

show data_directory;
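
For the record, the move boils down to something like the following. The version/cluster path below is a guess at a typical Ubuntu layout, not necessarily our exact one, and the rsync step is per the tutorial:

    # Stop postgres before touching anything
    sudo systemctl stop postgresql

    # Copy the cluster to the new mount, preserving permissions
    sudo rsync -av /var/lib/postgresql/ /database/

    # Then in /etc/postgresql/10/main/postgresql.conf:
    #     data_directory = '/database/10/main'

    sudo systemctl start postgresql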

@mlissner
Member Author

Just updated pg_hba.conf to match the old CL server, but ensuring that all connections use SSL, as in #903.
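
In practice that means the remote entries use hostssl instead of host, roughly like this (the database, user, and network values here are placeholders, not our real ones):

    # TYPE   DATABASE       USER    ADDRESS      METHOD
    hostssl  courtlistener  django  10.0.0.0/8   md5
    # No plain "host" lines for remote clients, so non-SSL connections get rejected.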

@mlissner
Member Author

Used vimdiff to diff the old settings file with the new. Should be nice and beefy now.
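
For context, "beefy" here means the usual tuning knobs in postgresql.conf. A sketch with made-up values, since the right numbers depend on the server's RAM:

    shared_buffers = 16GB           # common guidance is ~25% of RAM
    effective_cache_size = 48GB     # ~75% of RAM
    work_mem = 32MB                 # per sort/hash operation, so keep it modest
    maintenance_work_mem = 2GB      # speeds up index builds and vacuums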

@mlissner
Member Author

Connection is confirmed from the remote server. Now onwards to replication.
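
A quick way to check this from the app server (hostname and user here are placeholders):

    psql "host=new-db.courtlistener.com dbname=courtlistener user=django sslmode=require" -c 'SELECT version();'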

@mlissner
Member Author

OK, as usual this got a little tricky around connections, but the sync is now in progress. The issues this time:

  1. Really long passwords fail, ugh.
  2. You need to be connected to the right DB to create the subscription.

Fun stuff as always.
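
For posterity, the gist of the setup. Names and connection info are illustrative, not our actual values; note the \c first, since the subscription has to be created in the DB you're replicating into:

    -- On the publisher:
    CREATE PUBLICATION cl_pub FOR ALL TABLES;

    -- On the subscriber, connected to the right DB first:
    \c courtlistener
    CREATE SUBSCRIPTION replica
        CONNECTION 'host=old-db.courtlistener.com dbname=courtlistener user=replicator password=a-shorter-password sslmode=require'
        PUBLICATION cl_pub;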

@mlissner
Member Author

mlissner commented Apr 25, 2019

OK, so this made some progress, with the new DB powering the API today, but we still had some downtime due to the old DB being overburdened. We need to switch to the new server as our main DB.

Process:

  1. Stop everything that writes to the DB:

    • Celery
    • Scrapers (and disable in cron)
    • Front end (apache)
    • Supervisor stuff
  2. Ensure that the subscriber is up to date (it should catch up quickly once writes have stopped). This query, run on the current publisher, will show you if subscribers are synced:

    SELECT 
        slot_name,
        confirmed_flush_lsn, 
        pg_current_wal_lsn(), 
        (pg_current_wal_lsn() - confirmed_flush_lsn) AS lsn_distance
    FROM pg_replication_slots;
    
  3. Tear down the current subscription and publication (sketch after this list).

  4. Reset the sequences on the old subscriber using this tutorial (sketch after this list): https://wiki.postgresql.org/wiki/Fixing_Sequences

  5. Swap CourtListener to use the new DB as primary. Test. Things should be working now.

  6. Create a new publication going in the opposite direction (this is needed to keep existing subscriptions live). Be sure to use copy_data=false when setting it up (sketch after this list).

  7. Restart services from above.
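
Sketches for steps 3, 4, and 6, with the same illustrative names as above. The sequence query shows a single table (search_docket is just an example here); the wiki page in step 4 generates the equivalent for every table:

    -- Step 3, tear down the old link:
    DROP SUBSCRIPTION replica;       -- on the subscriber
    DROP PUBLICATION cl_pub;         -- on the publisher

    -- Step 4, fix one sequence on the old subscriber (repeat per table):
    SELECT setval(pg_get_serial_sequence('search_docket', 'id'),
                  COALESCE(MAX(id), 1))
    FROM search_docket;

    -- Step 6, replicate back the other way without re-copying rows:
    CREATE PUBLICATION cl_pub_reverse FOR ALL TABLES;  -- on the new primary
    -- On the old server, now the subscriber:
    CREATE SUBSCRIPTION replica_reverse
        CONNECTION 'host=new-db.courtlistener.com dbname=courtlistener user=replicator password=a-shorter-password sslmode=require'
        PUBLICATION cl_pub_reverse
        WITH (copy_data = false);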


Remaining to do:

  1. Create a publication on the new publisher to the AWS DB and switch our firewall to direct 5432 to the new publisher.

  2. Delete the old publisher if no longer useful.

@mlissner
Member Author

Note that replicating to the original DB and then on to the AWS DB is turning out to be a good arrangement, because it surfaces any replication issues before values reach the AWS DB.

The table that's currently giving the most trouble is django_cache, which is throwing errors like:

2019-04-26 16:39:47.558 PDT [7811] LOG:  logical replication apply worker for subscription "replica" has started
2019-04-26 16:39:47.575 PDT [7811] ERROR:  duplicate key value violates unique constraint "django_cache_pkey"
2019-04-26 16:39:47.575 PDT [7811] DETAIL:  Key (cache_key)=(:1:views.decorators.cache.cache_header..1871b1dfde84b136591e76bac5ff5598.America/Los_Angeles) already exists.
2019-04-26 16:39:47.578 PDT [21667] LOG:  worker process: logical replication worker for subscription 68725186 (PID 7811) exited with exit code 1

I think this error is localized to this table; it probably happened because the table somehow got cache values on the original DB after the subscription was torn down, but I'm not entirely sure. In any case, since it's just a cache table of views with about 300 entries, I nuked it:

courtlistener=> delete from django_cache where cache_key like ':1:views.decorators.cache%';
DELETE 357

@mlissner
Member Author

mlissner commented Jul 1, 2020

Since we have issue #1115, I'm closing this one. That one can take the lead.

mlissner closed this as completed Jul 1, 2020