Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MBS-12400: fix non-musicbrainz-schema dumps #2541

Merged
merged 1 commit into from
May 25, 2022

Conversation

mwiencek
Copy link
Member

@mwiencek mwiencek commented May 25, 2022

https://tickets.metabrainz.org/browse/MBS-12400

Due to the $table_noschema addition in 0d38660, we were failing to include the schema prefix in any dumped tables' file names. Because MBImport.pl expects the file names to match the table names as specified in @FULL_TABLE_LIST (which includes schemas for non-musicbrainz-schema tables), this causes the import to fail at finding data files for those schemas.

The faulty change referenced above also has the side effect of causing tables with the same names in different schemas to clobber each other in the dumps. For example, the cover-art-archive dump's art_type file would contain rows from the event_art_archive.art_type table instead!

The only purpose of the $table_noschema change was to ensure that the file names inside dbmirror v2 packets didn't include the "dbmirror2." prefix. It wouldn't cause any issue if they did, but it was preferred to match how dbmirror v1 packets are dumped (whose tables are in the musicbrainz schema, so are not schema-qualified). I moved the re-mapping from dump_table to ExportAllTables where it's needed.

@mwiencek mwiencek changed the base branch from master to production May 25, 2022 15:35
Copy link
Member

@reosarevok reosarevok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes seem sensible, but do check the failing test, it seems related.

@mwiencek
Copy link
Member Author

I should note that this also fixes not being able to import the caa/eaa tables -- the files for these are expected to be schema-qualified in the dumps, and this fixes that.

@mwiencek mwiencek changed the title MBS-12400: fix art_type dump clobbering MBS-12400: fix non-musicbrainz-schema dumps May 25, 2022
@mwiencek
Copy link
Member Author

I updated the commit title and description to reflect that this is a general issue with dumping non-musicbrainz-schema tables. The art_type clobbering is really an obscure side-effect of that (and only discoverable if you try to import these files by hand).

Due to the `$table_noschema` addition in
0d38660, we were failing to include the
schema prefix in any dumped tables' file names.  Because MBImport.pl
expects the file names to match the table names as specified in
`@FULL_TABLE_LIST` (which includes schemas for non-musicbrainz-schema
tables), this causes the import to fail at finding data files for those
schemas.

The faulty change referenced above also has the side effect of causing
tables with the same names in different schemas to clobber each other in
the dumps.  For example, the cover-art-archive dump's art_type file
would contain rows from the event_art_archive.art_type table instead!

The only purpose of the `$table_noschema` change was to ensure that the
file names inside dbmirror v2 packets didn't include the "dbmirror2."
prefix.  It wouldn't cause any issue if they did, but it was preferred
to match how dbmirror v1 packets are dumped (whose tables are in the
musicbrainz schema, so are not schema-qualified).  I moved the
re-mapping from dump_table to ExportAllTables where it's needed.
@mwiencek mwiencek merged commit b753651 into metabrainz:production May 25, 2022
@mwiencek mwiencek deleted the mbs-12400 branch May 25, 2022 19:51
reosarevok added a commit that referenced this pull request Jun 7, 2022
* production:
  Update POT files using the production database
  Update translations from Transifex
  MBS-12437: Limit overflow-wrap even more (#2561)
  MBS-12400: fix non-musicbrainz-schema dumps (#2541)
  Sync incremental JSON dumps to trille
reosarevok added a commit that referenced this pull request Jun 13, 2022
* master:
  Update POT files using the production database
  Update translations from Transifex
  MBS-12395: Report for videos in mediums that shouldn't support video (#2562)
  MBS-12356: Correctly select + clean up Tidal store pages (#2515)
  Add comment to ensure test data is kept
  Save AC redirects before swapping AC uses
  Document and standardize Controller::Aliases tests
  Document and standardize Controller::EditAlias tests
  Document and standardize Controller::DeleteAlias tests
  Document and standardize Controller::AddAlias tests
  MBS-9188: Improve LinkedIn URL cleanup (#2553)
  MBS-12419: Block Genius.com links at release level (#2560)
  MBS-12417: Update Soundcloud cleanup to remove ? parameters (#2552)
  Avoid declaring my $tx twice in one test
  MBS-12351: Also trim space only disambiguations (#2510)
  MBS-12376: Don't show spammers on area pages (#2554)
  MBS-12447: Also show area-series rels on series page (#2563)
  MBS-12393: Change TOWER RECORDS to all-caps as per store Japanese usage (#2558)
  Simplify the allowed hosts shortener list
  MBS-12383: Block smart links: bfan.link
  MBS-12396: Block smart links: hyperfollow.com
  MBS-12401: Block smart links: hypeddit.com
  MBS-12350: Block smart links: bio.link
  MBS-12352: Block smart links: streamerlinks.com
  Add basic tests for the artist credit page
  Add IDs to sections of ArtistCreditIndex
  MBS-12354: Check if AC IDs are valid before passing to the DB
  MBS-12312: Convert edit.tt to React
  MBS-12312: Convert history.tt to React
  MBS-12312: Convert diff.tt to React
  MBS-12312: Convert revision.tt to React
  MBS-12312: Remove unused summary.tt
  MBS-12400: fix non-musicbrainz-schema dumps (#2541)
  Sync incremental JSON dumps to trille
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants