Migration PostgreSQL to MariaDB: Shortening Columns #1166

bschmalhofer · 2021-08-06T16:41:57Z

Another issue. When migrating from OTRS(Pg) to OTOBO(MySQL) I see many of the following message in the logfile:

[Fri Aug 6 16:53:36 2021] -e: Use of uninitialized value in numeric gt (>) at /opt/otobo/Kernel/System/MigrateFromOTRS/CloneDB/Driver/Base.pm line 505.

This is likely nothing to worry about, as the migrated database seems to work fine. But let's investigate. The line in question is:

            next SOURCE_COLUMN unless $SourceColumnInfos->{LENGTH} > $TargetColumnInfos->{LENGTH};

Dumping the infos about $SourceColumnInfos and $TargetColumnInfos we get:


[Fri Aug  6 16:53:36 2021] -e: $VAR1 = {
[Fri Aug  6 16:53:36 2021] -e:           'IS_NULLABLE' => 'YES',
[Fri Aug  6 16:53:36 2021] -e:           'COLUMN' => 'a_from',
[Fri Aug  6 16:53:36 2021] -e:           'DATA_TYPE' => 'character varying',
[Fri Aug  6 16:53:36 2021] -e:           'LENGTH' => undef
[Fri Aug  6 16:53:36 2021] -e:         };
[Fri Aug  6 16:53:36 2021] -e: $VAR2 = {
[Fri Aug  6 16:53:36 2021] -e:           'LENGTH' => 16777215,
[Fri Aug  6 16:53:36 2021] -e:           'DATA_TYPE' => 'mediumtext',
[Fri Aug  6 16:53:36 2021] -e:           'COLUMN' => 'a_from',
[Fri Aug  6 16:53:36 2021] -e:           'IS_NULLABLE' => 'YES'
[Fri Aug  6 16:53:36 2021] -e:         };

This makes sense. The source is PostgreSQL with the datatype 'character varying' and the target is MySQL with the data type 'mediumtext'. The length on the MySQL side checks out. A mediumtext holds 16777215 bytes, about 16MB. But why is there no length on the PostgreSQL side?

The declaration of article_data_mime.a_from is in otobo-schema.xml:

Checking the functions _TypeTranslation() in mysql.pm and in postgresql.pm we get confirmation that the size 1_800_000 becomes MEDIUMTEXT on MySQL and VARCHAR on PostgreSQL. It just so happens that MEDIUMTEXT (MySQL) has a size restritction while VARCHAR (Pg) has no restriction.

MySQL:

    if ( $Tag->{Type} =~ /^VARCHAR$/i ) {
        if ( $Tag->{Size} > 16777215 ) {
            $Tag->{Type} = 'LONGTEXT';
        }
        elsif ( $Tag->{Size} > 65535 ) {
            $Tag->{Type} = 'MEDIUMTEXT';
        }
        elsif ( $Tag->{Size} > 255 ) {
            $Tag->{Type} = 'TEXT';
        }
        else {
            $Tag->{Type} = 'VARCHAR (' . $Tag->{Size} . ')';
        }
    }

PostgreSQL:

    elsif ( $Tag->{Type} =~ /^VARCHAR$/i ) {
        $Tag->{Type} = 'VARCHAR (' . $Tag->{Size} . ')';
        if ( $Tag->{Size} >= 10000 ) {
            $Tag->{Type} = 'VARCHAR';
        }
    }

This means that we have an interesting mix. There are length differences that happen basically by accident, because of differences in data types. On the other hand, there are columns that actually should be shortened.

I tend to take the most simple approach. Treat an undefined length as infinite. Another question is how this plays with encodings. The LENGTH appears to be in bytes but the shortening is done by calling SUBSTRING(). How can this possibly work?

Originally posted by @bschmalhofer in #1149 (comment)

The text was updated successfully, but these errors were encountered:

bschmalhofer · 2021-08-06T16:47:38Z

See also #542 and #836

bschmalhofer · 2021-08-06T19:52:40Z

_TypeTranslation() in mysql.pm is in principle broken. When e.g. VARCHAR(60_000) is declared, then it would be mapped into a TEXT. But 60_000 4-Byte characters do not fit into a TEXT field. In practice it's not really relevant, the largest TEXT fleld requires bytes that accomodate 10_000 characters. For the longest MEDIUMTEXT field its 1_800_000 characters. There is no conflict.

For postgresql.pm everything is fine. All columns declared as VARCHAR in otobo-schema.xml are either mapped into the appropriate VARCHAR(n) type or mapped into VARCHAR. And VARCHAR has no size restriction in PostgreSQL.

So, I propose to:

fix _TypeTranslation() in Kernel/System/DB/mysql,pm , causing no changes when the present otobo-schema.xml is used
Do not shorted unless the types in source and target are both VARCHAR, or 'character varying' with a size up 255 characters.

The latter bullet point is more pragmatic than correct.

bschmalhofer · 2021-08-10T12:32:19Z

Moved the adaption of TypeTranslation() to a separate issue. This change is not trivial and thus should not be done in a patchlevel release.

from the code comments.

about which columns are shortened.

Issue #1166 shorten columns

bschmalhofer · 2021-08-10T14:10:28Z

Adapted the check for when columns are shortened. The warnings in the error log are gone now. The messages looked sane. Did a testmigration from OTRS (Pg) to OTBOO (MySQL). Some sanity checks on the migrated database looked find.

bschmalhofer · 2021-08-19T14:57:05Z

Let's close this issue, as the test last week looked fine.

bschmalhofer mentioned this issue Aug 10, 2021

Saner TypeTranslation() in Kernel/System/DB/mysql.pm #1169

Open

bschmalhofer self-assigned this Aug 10, 2021

bschmalhofer added this to the OTOBO 10.0.12 milestone Aug 10, 2021

bschmalhofer mentioned this issue Aug 10, 2021

ResetAutoIncrementField() was likely never called #1161

Closed

bschmalhofer added a commit that referenced this issue Aug 10, 2021

Issue #1166: remove remnants of batch insert

f33dbc1

from the code comments.

bschmalhofer added a commit that referenced this issue Aug 10, 2021

Issue #1166: stricter checks for when to shorten to 190 chars

48ce62c

bschmalhofer added a commit that referenced this issue Aug 10, 2021

Issue #1166: Mention to table in message

794080d

about which columns are shortened.

bschmalhofer added a commit that referenced this issue Aug 10, 2021

Merge pull request #1171 from RotherOSS/issue-#1166-shorten_columns

b1762a1

Issue #1166 shorten columns

bschmalhofer closed this as completed Aug 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migration PostgreSQL to MariaDB: Shortening Columns #1166

Migration PostgreSQL to MariaDB: Shortening Columns #1166

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 10, 2021

bschmalhofer commented Aug 10, 2021

bschmalhofer commented Aug 19, 2021

Migration PostgreSQL to MariaDB: Shortening Columns #1166

Migration PostgreSQL to MariaDB: Shortening Columns #1166

Comments

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 6, 2021

bschmalhofer commented Aug 10, 2021

bschmalhofer commented Aug 10, 2021

bschmalhofer commented Aug 19, 2021