Issue database needs a migration script. #968

Closed
3 tasks
karlcow opened this issue Mar 16, 2016 · 7 comments

Comments

@karlcow
Member

karlcow commented Mar 16, 2016

When we restart the project, change the schema, etc., we need to be able to keep the data and re-instantiate the database.

  • DB backup
  • Stopping and Restarting the project
  • Changing the DB schema

This issue blocks #865

@hallvors
Contributor

@karlcow Do we need a DB migration script? I'd just add the domain stuff to the existing dump-issues-to-db script and run it for a new sync from GitHub, or something like that.

@dshgna
Contributor

dshgna commented Mar 18, 2016

@karlcow @hallvors So how shall we proceed? :)

@hallvors
Contributor

I suggest this (to get us started): Write a script that

  1. Backs up the existing db file
  2. Puts the webcompat.com site in some sort of "maintenance mode" (stopping it?) so no new reports can be filed through the site temporarily
  3. Deletes the current DB
  4. Runs the code in the dump to db script to grab a fresh GitHub data dump
  5. Starts the project again

Timing-wise I suppose this needs to run during deployment of a new version that relies on an updated DB schema.
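
Steps 1 and 3 of the list above (back up, then delete) could be sketched roughly as follows. This is only an illustration: the function name `backup_and_delete`, the backup file naming, and both path arguments are hypothetical and not taken from the project, and it assumes the DB is a single file that nothing is writing to while the copy runs.

```python
import os
import shutil
import time


def backup_and_delete(db_path, backup_dir):
    """Copy the DB file to a timestamped backup, then remove the original.

    Hypothetical helper: returns the path of the backup file.
    """
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup_name = "{}.{}.bak".format(os.path.basename(db_path), stamp)
    backup_path = os.path.join(backup_dir, backup_name)
    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(db_path, backup_path)
    # Delete the original only after a basic sanity check on the copy.
    if os.path.getsize(backup_path) != os.path.getsize(db_path):
        raise IOError("backup size mismatch, keeping the original DB")
    os.remove(db_path)
    return backup_path
```

Refusing to delete when the sizes differ is a cheap guard; a checksum comparison would be stricter.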

@karlcow
Member Author

karlcow commented Mar 24, 2016

@hallvors this is exactly a migration script. So yes we need it.

Plus, it avoids maintenance pain for @miketaylr, who currently manages the deployment.

  • Do we have a dump to DB script?
  • If yes, how long does the current dump to db take? (Roughly 2500 issues currently and growing)

The script proposed above by @hallvors needs to swap steps 1 and 2. Or, more exactly:

Once webcompat.com is stopped:

  1. Backup the existing DB file.
  2. Delete the DB.
  3. "Dump to DB": Parse GitHub (pagination involved, probably many HTTP requests)
    1. Check the HTTP status code for each request
    2. Grab the pagination link
    3. Check that the body contains the required information
    4. Report any issues during the dump to DB.
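
The four sub-steps of the dump could look roughly like the sketch below. It is not the existing `dump_webcompat_to_db.py`; the names `next_page_url`, `dump_issues`, and the `required` field list are made up for illustration. It assumes the API returns a JSON array per page and advertises the next page in a `Link` header with `rel="next"`, and it takes the HTTP call as a `fetch` argument so the loop can be tested without touching the network.

```python
import json
import re

# Matches one entry of a Link header, e.g.
# <https://example.org/issues?page=2>; rel="next"
LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None."""
    for url, rel in LINK_RE.findall(link_header or ""):
        if rel == "next":
            return url
    return None


def dump_issues(fetch, start_url, required=("number", "title", "body")):
    """Walk every page starting at start_url.

    fetch(url) is supplied by the caller and must return
    (status_code, headers_dict, body_text).
    Returns (issues, problems); problems is a list of strings.
    """
    issues, problems = [], []
    url = start_url
    while url:
        status, headers, body = fetch(url)
        # 1. Check the HTTP status code for each request.
        if status != 200:
            problems.append("{}: HTTP {}".format(url, status))
            break
        # 2. Grab the pagination link (assumes the key is spelled "Link").
        next_url = next_page_url(headers.get("Link"))
        # 3. Check that the body contains the required information.
        try:
            page = json.loads(body)
        except ValueError:
            problems.append("{}: body is not valid JSON".format(url))
            break
        if not isinstance(page, list):
            problems.append("{}: expected a JSON array".format(url))
            break
        for item in page:
            missing = [key for key in required if key not in item]
            if missing:
                problems.append("{}: issue missing {}".format(url, missing))
            else:
                issues.append(item)
        url = next_url
    # 4. The caller reports whatever was collected in problems.
    return issues, problems
```

Stopping at the first failed request keeps a partial dump from being mistaken for a complete one; whether to retry instead is a separate decision.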

Notes:

  • The dump to DB probably needs to be a separate script from the backup/delete one.
  • It requires tests.
  • It requires documentation on how to run the process.
  • It also requires a maintenance page for webcompat.com that captures all HTTP requests.
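
For the maintenance page, one possible shape is a tiny WSGI app that answers every path and method with the same response. This is a sketch under assumptions, not how webcompat.com is actually deployed: the name `maintenance_app`, the message text, and the `Retry-After` value are placeholders.

```python
def maintenance_app(environ, start_response):
    """WSGI app answering every request with 503 Service Unavailable."""
    body = b"webcompat.com is down for maintenance. Please try again soon.\n"
    start_response("503 Service Unavailable", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Seconds until clients should retry; an arbitrary placeholder.
        ("Retry-After", "600"),
    ])
    return [body]
```

It can be tried locally with the standard library, e.g. `wsgiref.simple_server.make_server("127.0.0.1", 8000, maintenance_app).serve_forever()`. A 503 status tells clients and crawlers the outage is temporary.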

This is not necessarily simple. The good news is that, for now, the DB is effectively read-only for webcompat.com features, so this will not break the system.

@hallvors
Contributor

This is the dump-to-db script: https://github.com/webcompat/issue_parser/blob/master/dump_webcompat_to_db.py
It currently depends on stuff in https://github.com/webcompat/issue_parser/blob/master/extract_id_title_url.py but the stuff it actually uses should probably be extracted, since that script does some additional stuff useful to arewecompatibleyet.com but not required for dumping to db.

@hallvors
Contributor

BTW, what's the point of "pausing" the site? Issues might also get filed on GitHub (through the GitHub UI, not ours) while the script is running; you can't pause GitHub. The chance of that happening is low, I guess, but you're not guaranteed a perfect duplication even if you leave the site running.

@miketaylr
Member

Setting "outreachy-project" given its relationship with #865.

4 participants