Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration PostgreSQL to MariaDB: Issues with DirectBlob feature #1149

Closed
StefanRother-OTOBO opened this issue Jul 30, 2021 · 10 comments
Closed
Assignees
Labels
bug Something isn't working as intended
Milestone

Comments

@StefanRother-OTOBO
Copy link
Contributor

Hi,

In the file Kernel/System/MigrateFromOTRS/CloneDB/Driver/postgresql.pm we check the binary blob. For PostgreSQL TEXT is specified here. I just migrated a system and there was text no binary? According to my research this is also the normal case. After I then took out the function, the data was fine after the migration.

When you look at this, can you please add another check that checks if there are values with NULL and in OTOBO this is not allowed? This has happened to me several times now with PostgreSQL and it is tedious when then the migration stops completely.

Thanks a lot!

Stefan

Get all binary columns and return a lookup hash with table and column name as keys.

sub BlobColumnsList {
my ( $Self, %Param ) = @_;

# check needed stuff
for my $Needed (qw(DBObject DBName Table)) {
    if ( !$Param{$Needed} ) { 
        $Kernel::OM->Get('Kernel::System::Log')->Log(
            Priority => 'error',
            Message  => "Need $Needed!",
        );  

        return;
    }   
}   

$Param{DBObject}->Prepare(
    SQL => <<'END_SQL',

SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_CATALOG = ?
AND TABLE_NAME = ?
AND DATA_TYPE = 'Text';
END_SQL
Bind => [ $Param{DBName}, $Param{Table} ],
) || return {};

my %Result;
while ( my ( $Column, $Type ) = $Param{DBObject}->FetchrowArray() ) { 
    $Result{$Column} = $Type;
}   

return \%Result;

}

@bschmalhofer
Copy link
Contributor

bschmalhofer commented Aug 2, 2021

This is fairly confusing. The type LONGBLOB in scripts/database/otobo-schema.xml is translated into longblob under MySQL and in text under PostgreSQL. There are at least three different cases:

  • The column holds UTF-8 with wide chars, e.g. YAML serialized data structures
  • The column holds binary data, that is base64 encoded data under PostgreSQL
  • The column holds base 65 encoded data both under MySQL und PostgreSQL

My hunch is that the second case needs to be handled differently than the other cases.

This is all related to the funny behavior of Kernel::System::DB::FetchrowArray that sets the UTF8-flag on whatever data it gets. Does not make much sense on numeric data and is simply wrong when the string data is not UTF-8. The latter case is rectified in the relevant modules, but still confusing.

@bschmalhofer
Copy link
Contributor

New findings:
In the initial commit of Kernel/System/MigrateFromOTRS/CloneDB/Backend.pm there was a commented out list of columns that required special handling:


    # OTRS stores binary data in some columns. On some database systems,
    #   these are handled differently (data is converted to base64-encoding before
    #   it is stored. Here is the list of these columns which need special treatment.

    # Create function BlobColumnsList to get this info.
    # TODO: Remove after testing
    #my %BlobColumns;
    #$BlobColumns{"article_data_mime_plain.body"} = 1;
    #$BlobColumns{"article_data_mime_attachment.content"} = 1;
    #$BlobColumns{"virtual_fs_db.content"} = 1;
    #$BlobColumns{"web_upload_cache.content"} = 1;
    #$BlobColumns{"standard_attachment.content"} = 1;
    #$BlobColumns{"faq_attachment.content"} = 1;
    #$BlobColumns{"change_template.content"} = 1;
    #$BlobColumns{"mail_queue.raw_message"} = 1;

This list was later removed.

I gather that such a list, containing all cases relevante for DirectBlob, is still necessary. For lack of a better idea let's pult the list into Kernel::System::MigrateFromOTRS::Base::DBDirectBlobColumns.

@bschmalhofer
Copy link
Contributor

See also: #601, #1006, and #800.

bschmalhofer added a commit that referenced this issue Aug 3, 2021
that are relevant to the DirectBlob feature.
bschmalhofer added a commit that referenced this issue Aug 3, 2021
Only specific columns are relevant for DirectBlob.
bschmalhofer added a commit that referenced this issue Aug 3, 2021
The trick for avoiding dropping and recreating the sequence
was not obvious.
bschmalhofer added a commit that referenced this issue Aug 3, 2021
@bschmalhofer
Copy link
Contributor

During testing I found out that PerlOption +Parent, #310, only works per VirtualHost.
I added code comments in the Apache config files, reminding people about this nuisance.

@bschmalhofer
Copy link
Contributor

Another issue. When migrating from OTRS(Pg) to OTOBO(MySQL) I see many of the following message in the logfile:

[Fri Aug 6 16:53:36 2021] -e: Use of uninitialized value in numeric gt (>) at /opt/otobo/Kernel/System/MigrateFromOTRS/CloneDB/Driver/Base.pm line 505.

This is likely nothing to worry about, as the migrated database seems to work fine. But let's investigate. The line in question is:

            next SOURCE_COLUMN unless $SourceColumnInfos->{LENGTH} > $TargetColumnInfos->{LENGTH};

Dumping the infos about $SourceColumnInfos and $TargetColumnInfos we get:


[Fri Aug  6 16:53:36 2021] -e: $VAR1 = {
[Fri Aug  6 16:53:36 2021] -e:           'IS_NULLABLE' => 'YES',
[Fri Aug  6 16:53:36 2021] -e:           'COLUMN' => 'a_from',
[Fri Aug  6 16:53:36 2021] -e:           'DATA_TYPE' => 'character varying',
[Fri Aug  6 16:53:36 2021] -e:           'LENGTH' => undef
[Fri Aug  6 16:53:36 2021] -e:         };
[Fri Aug  6 16:53:36 2021] -e: $VAR2 = {
[Fri Aug  6 16:53:36 2021] -e:           'LENGTH' => 16777215,
[Fri Aug  6 16:53:36 2021] -e:           'DATA_TYPE' => 'mediumtext',
[Fri Aug  6 16:53:36 2021] -e:           'COLUMN' => 'a_from',
[Fri Aug  6 16:53:36 2021] -e:           'IS_NULLABLE' => 'YES'
[Fri Aug  6 16:53:36 2021] -e:         };

This makes sense. The source is PostgreSQL with the datatype 'character varying' and the target is MySQL with the data type 'mediumtext'. The length on the MySQL side checks out. A mediumtext holds 16777215 bytes, about 16MB. But why is there no length on the PostgreSQL side?

The declaration of article_data_mime.a_from is in otobo-schema.xml:

Checking the functions _TypeTranslation() in mysql.pm and in postgresql.pm we get confirmation that the size 1_800_000 becomes MEDIUMTEXT on MySQL and VARCHAR on PostgreSQL. It just so happens that MEDIUMTEXT (MySQL) has a size restritction while VARCHAR (Pg) has no restriction.

MySQL:

    if ( $Tag->{Type} =~ /^VARCHAR$/i ) {
        if ( $Tag->{Size} > 16777215 ) {
            $Tag->{Type} = 'LONGTEXT';
        }
        elsif ( $Tag->{Size} > 65535 ) {
            $Tag->{Type} = 'MEDIUMTEXT';
        }
        elsif ( $Tag->{Size} > 255 ) {
            $Tag->{Type} = 'TEXT';
        }
        else {
            $Tag->{Type} = 'VARCHAR (' . $Tag->{Size} . ')';
        }
    }

PostgreSQL:

    elsif ( $Tag->{Type} =~ /^VARCHAR$/i ) {
        $Tag->{Type} = 'VARCHAR (' . $Tag->{Size} . ')';
        if ( $Tag->{Size} >= 10000 ) {
            $Tag->{Type} = 'VARCHAR';
        }
    }

This means that we have an interesting mix. There are length differences that happen basically by accident, because of differences in data types. On the other hand, there are columns that actually should be shortened.

I tend to take the most simple approach. Treat an undefined length as infinite. Another question is how this plays with encodings. The LENGTH appears to be in bytes but the shortening is done by calling SUBSTRING(). How can this possibly work?

@bschmalhofer bschmalhofer changed the title Migration Issue PostgreSQL to MariaDB Migration PostgreSQL to MariaDB: Issues with DirectBlob feature Aug 6, 2021
@bschmalhofer
Copy link
Contributor

bschmalhofer commented Aug 7, 2021

After merging #1160 I did a test migration from OTRS (Pg) to OTOBO (MySQL). The source database was:

  • populated with Dev::Tools::Database::RandomDataInsert
  • an ACL was added
  • the FAQ package was installed
  • an article with an attachment was added
  • the user profile was changed

Some notes on the relevant datatypes. In PostgreSQL the data type text is used. The size is practically unlimited, multi-byte characters, in our case UTF-8. In MySQL we have the data type longblob. The size is practically unlimited, binary.

Here are the findings from looking at source and target database. The tests should cover the relevant cases.

  • acl.config_match and acl.config_change OK no Base64 encoding on both sides
  • package_repository.content OK Base64 encoding in PG, no Base64 encoding in MySQL. table migration skipped
  • article_data_mime_attachment OK Base64 encoding in PG, no Base64 encoding in MySQL. table migrated

The output of the SQL statements used for testing is attached. ( With an incorrect year in the filename.)
report_testmigration_20200806a.txt

@bschmalhofer
Copy link
Contributor

bschmalhofer commented Aug 7, 2021

TODO:

  • Testmigration OTRS Pg to OTOBO MySQL
  • Testmigration OTRS Pg to OTOBO Pg

@bschmalhofer
Copy link
Contributor

Yet another finding: The user otobo cannot set the session_replication_role. She needs superuser privs for that. See https://otobo.de/de/forums/topic/otrs-6-mysql-migration-to-otobo-postgresql/.

@bschmalhofer
Copy link
Contributor

After giving the otobo user superadmin privileges, the migration from OTRS (pg) to OTOBO (Pg) executed find. Did the same checks as for the previous testmigration. The result look fine. See the attached file for the results of the SQL statements used for testing.
report_testmigration_20210809a.txt

bschmalhofer added a commit that referenced this issue Aug 9, 2021
bschmalhofer added a commit that referenced this issue Aug 9, 2021
bschmalhofer added a commit to RotherOSS/doc-otobo-installation that referenced this issue Aug 10, 2021
bschmalhofer added a commit to RotherOSS/doc-otobo-installation that referenced this issue Aug 10, 2021
bschmalhofer added a commit to RotherOSS/doc-otobo-installation that referenced this issue Aug 10, 2021
bschmalhofer added a commit to RotherOSS/doc-otobo-installation that referenced this issue Aug 10, 2021
The DB user otobo might need superuser privs.
bschmalhofer added a commit to RotherOSS/doc-otobo-installation that referenced this issue Aug 10, 2021
@bschmalhofer
Copy link
Contributor

The tests from last week looked fine. Closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as intended
Projects
None yet
Development

No branches or pull requests

2 participants