Builds: drop the jobset and project columns #1093

Merged: grahamc merged 5 commits into NixOS:master from builds-jobset-project on Jan 17, 2022

grahamc commented Jan 9, 2022

This is a work-in-progress PR that has not been carefully validated. The tests do pass, which is nice to see, but I've barely started the webserver or clicked around.

The project and jobset columns used to be the only foreign key from a build to its jobset and project. #969 added a jobset_id column as a replacement, but stopped short of dropping the now-duplicate columns. This PR finishes that work.
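To illustrate why the two columns are redundant once jobset_id exists, here is a minimal sketch using an in-memory SQLite database. The table layout is a simplified assumption for illustration, not Hydra's real schema, and the helper name build_location and the sample rows are hypothetical.

```python
import sqlite3

# Simplified, assumed schema: a build points at its jobset only through jobset_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table jobsets (
        id      integer primary key,
        project text not null,
        name    text not null,
        unique (project, name)
    );
    create table builds (
        id        integer primary key,
        jobset_id integer not null references jobsets(id),
        job       text not null
    );
    insert into jobsets (id, project, name) values (1, 'nixpkgs', 'trunk');
    insert into builds (id, jobset_id, job) values (100, 1, 'hello.x86_64-linux');
    """
)


def build_location(conn, build_id):
    """Recover (project, jobset, job) for a build via the jobset_id join."""
    return conn.execute(
        """
        select j.project, j.name, b.job
        from builds b
        join jobsets j on j.id = b.jobset_id
        where b.id = ?
        """,
        (build_id,),
    ).fetchone()


location = build_location(conn, 100)
print(location)
```

Because the project and jobset names can always be recovered through the join, storing them again on every build row only costs space.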

I think a good way to apply this PR is to pull out individual commits, updating individual systems and merging / deploying them one at a time, validating in production that each change didn't break anything. That way, we can easily revert a single change if reverting turns out to be the easiest fix.

Several commits in this PR implement the easiest possible solution to make a test pass. Those patches should be undone and reimplemented. One good example is the json_hints change, which critically breaks the API. Another is the dropping of several indexes without considering how that affects Hydra's performance.

--

before this PR:

hydra=> select pg_table_size('builds');
 pg_table_size 
---------------
   72219074560
(1 row)

after this PR, followed by a vacuum full of builds (which took 3734085.204 ms, roughly 62 minutes):

hydra=> select pg_table_size('builds');
 pg_table_size 
---------------
   56651767808
(1 row)

a reduction of roughly 14.5 GiB (15,567,306,752 bytes).
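The arithmetic behind that figure can be checked with a short sketch. The byte counts and the vacuum duration are copied from the psql output above; the variable names are only illustrative.

```python
# pg_table_size('builds') before and after, in bytes, as reported above.
BYTES_BEFORE = 72_219_074_560
BYTES_AFTER = 56_651_767_808
# Duration of the vacuum full, in milliseconds, as reported above.
VACUUM_MS = 3_734_085.204

saved_bytes = BYTES_BEFORE - BYTES_AFTER
saved_gib = saved_bytes / 2**30          # binary gigabytes
saved_gb = saved_bytes / 10**9           # decimal gigabytes
saved_fraction = saved_bytes / BYTES_BEFORE
vacuum_minutes = VACUUM_MS / 1000 / 60

print(f"saved {saved_bytes} bytes = {saved_gib:.1f} GiB = {saved_gb:.1f} GB")
print(f"that is {saved_fraction:.1%} of the original table size")
print(f"vacuum full took about {vacuum_minutes:.0f} minutes")
```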

grahamc force-pushed the builds-jobset-project branch 3 times, most recently from 8ad13b8 to 7d5be1b on January 11, 2022 15:59
grahamc force-pushed the builds-jobset-project branch 2 times, most recently from 0e160cb to 989e5e0 on January 14, 2022 16:06
grahamc force-pushed the builds-jobset-project branch 2 times, most recently from e80e789 to ebed42f on January 14, 2022 17:14
grahamc force-pushed the builds-jobset-project branch from ebed42f to 7521791 on January 14, 2022 18:04
grahamc force-pushed the builds-jobset-project branch 2 times, most recently from 70dcf9b to c1e08e2 on January 14, 2022 21:21
grahamc force-pushed the builds-jobset-project branch from c1e08e2 to 1e1cafb on January 14, 2022 21:38
grahamc force-pushed the builds-jobset-project branch 4 times, most recently from 04d54e4 to 1d5e971 on January 15, 2022 02:19
grahamc force-pushed the builds-jobset-project branch from 1d5e971 to ff8089d on January 15, 2022 17:22
grahamc force-pushed the builds-jobset-project branch from ff8089d to 8bbd17d on January 15, 2022 17:51
grahamc force-pushed the builds-jobset-project branch from 8bbd17d to 79fcf93 on January 15, 2022 19:19
grahamc marked this pull request as ready for review on January 17, 2022 14:59

grahamc commented Jan 17, 2022

welp, here goes

grahamc merged commit ed1b532 into NixOS:master on Jan 17, 2022
grahamc deleted the builds-jobset-project branch on January 17, 2022 15:12

grahamc commented Jan 17, 2022

Timestamps of the deploy:

Jan 17 16:23:04 ceres systemd[1]: Finished httpd-config-reload.service.
Jan 17 16:23:04 ceres hydra-init[103868]: upgrading Hydra schema from version 79 to 80
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: drop index IndexBuildsOnJobsetIsCurrent
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: drop index IndexBuildsOnJobIsCurrent
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: drop index IndexBuildsOnJobset
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: drop index IndexBuildsOnProject
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: drop index IndexBuildsOnJobFinishedId
Jan 17 16:23:04 ceres hydra-init[103868]: executing SQL statement: alter table Builds
Jan 17 16:23:04 ceres hydra-init[103868]:     drop column project,
Jan 17 16:23:04 ceres hydra-init[103868]:     drop column jobset

grahamc commented Jan 17, 2022

Dropping these 2 columns and 5 indexes cut h.n.o's on-disk database size by 187G (uncompressed) / 51G (compressed):

[root@haumea:~]# zfs get logicalreferenced,referenced rpool/safe/postgres@2022-01-17T15:20:00Z
NAME                                      PROPERTY           VALUE     SOURCE
rpool/safe/postgres@2022-01-17T15:20:00Z  logicalreferenced  831G      -
rpool/safe/postgres@2022-01-17T15:20:00Z  referenced         374G      -

[root@haumea:~]# zfs get logicalreferenced,referenced rpool/safe/postgres@2022-01-17T15:25:00Z
NAME                                      PROPERTY           VALUE     SOURCE
rpool/safe/postgres@2022-01-17T15:25:00Z  logicalreferenced  644G      -
rpool/safe/postgres@2022-01-17T15:25:00Z  referenced         323G      -
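As a quick check of those two numbers, the sketch below subtracts the snapshot values. The figures are copied from the two zfs get listings above, in the G units that zfs printed; the dictionary layout and the effective_ratio helper are illustrative assumptions.

```python
# Values copied from the two snapshot listings above, in the G units zfs printed.
SNAPSHOTS = {
    "2022-01-17T15:20:00Z": {"logicalreferenced": 831, "referenced": 374},
    "2022-01-17T15:25:00Z": {"logicalreferenced": 644, "referenced": 323},
}

before = SNAPSHOTS["2022-01-17T15:20:00Z"]
after = SNAPSHOTS["2022-01-17T15:25:00Z"]

# logicalreferenced is the uncompressed size, referenced is the on-disk size.
uncompressed_saved = before["logicalreferenced"] - after["logicalreferenced"]
compressed_saved = before["referenced"] - after["referenced"]


def effective_ratio(snapshot):
    """Ratio of logical (uncompressed) to physical (compressed) bytes."""
    return snapshot["logicalreferenced"] / snapshot["referenced"]


print(f"uncompressed: {uncompressed_saved}G, compressed: {compressed_saved}G")
print(f"ratio before: {effective_ratio(before):.2f}, after: {effective_ratio(after):.2f}")
```

The ratio drops slightly after the migration, which would be consistent with the removed data having compressed better than average, though the listings alone do not prove that.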

grahamc commented Jan 17, 2022

Getting logs from postgres starting with the migration:

journalctl -u postgresql --since "2022-01-17 16:23:04"

useful for: https://github.com/NixOS/nixos-org-configurations/wiki/pgbadger

re:

socat -u SYSTEM:'journalctl --output=short-iso -u postgresql --since "2022-01-17\\ 16:23:04" | pv --cursor --name "Pre-compression" | zstd -9 | pv --cursor --name "Post-compression"' TCP4-LISTEN:10050,so-bindtodevice=wg0,reuseaddr

grahamc commented Jan 17, 2022

rpool/safe/postgres  used                         719G
rpool/safe/postgres  referenced                   326G
rpool/safe/postgres  compressratio                2.23x
rpool/safe/postgres  logicalreferenced            645G
rpool/safe/postgres  refcompressratio             2.00x
rpool/safe/postgres  written                      4.23G
rpool/safe/postgres  logicalused                  1.52T

hydra=# vacuum (FULL, VERBOSE) builds;
INFO:  vacuuming "public.builds"
INFO:  "builds": found 18597 removable, 159744802 nonremovable row versions in 8813366 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 119.30 s, system: 217.31 s, elapsed: 480.03 s.

NAME                 PROPERTY           VALUE     SOURCE
rpool/safe/postgres  used               545G      -
rpool/safe/postgres  referenced         263G      -
rpool/safe/postgres  compressratio      2.24x     -
rpool/safe/postgres  logicalreferenced  507G      -
rpool/safe/postgres  refcompressratio   1.93x     -
rpool/safe/postgres  written            1.31M     -
rpool/safe/postgres  logicalused        1.16T     -
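A rough sketch of how the two property listings compare. It reads the first listing as the state before the vacuum full and the second as the state after, which the ordering suggests but the comment does not state. The parser and helper names are hypothetical, and only the properties reported in G are compared.

```python
BEFORE = """\
rpool/safe/postgres  used                         719G
rpool/safe/postgres  referenced                   326G
rpool/safe/postgres  logicalreferenced            645G
"""

AFTER = """\
NAME                 PROPERTY           VALUE     SOURCE
rpool/safe/postgres  used               545G      -
rpool/safe/postgres  referenced         263G      -
rpool/safe/postgres  logicalreferenced  507G      -
"""


def parse_zfs_get(text):
    """Map property name to its printed value, skipping the header row."""
    props = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        props[fields[1]] = fields[2]
    return props


def gigabytes(value):
    """Convert a printed size such as '719G' to a float; other units are rejected."""
    if not value.endswith("G"):
        raise ValueError(f"expected a size in G, got {value!r}")
    return float(value[:-1])


before = parse_zfs_get(BEFORE)
after = parse_zfs_get(AFTER)
savings = {name: gigabytes(before[name]) - gigabytes(after[name]) for name in before}

for name, saved in savings.items():
    print(f"{name}: {saved:.0f}G smaller")
```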
