Review the option Type => 'binary' in Kernel::System::Main::Dump() #694

Closed
2 tasks
bschmalhofer opened this issue Dec 30, 2020 · 5 comments
Labels
enhancement New feature or request

@bschmalhofer
Contributor

It's not obvious which problem the option 'binary' is solving. My guess is that it should allow copy & paste without the quoted wide characters.

The referenced bug https://rt.cpan.org/Public/Bug/Display.html?id=28607 has been rejected because Data::Dumper does what it should be doing.

My gut feeling is to remove the option 'binary' and make 'ascii' the default. This would also remove the need for the internal sub _Dump().

TODO:

  • investigate whether 'binary' is sensible
  • remove it if it is not sensible
@bschmalhofer bschmalhofer added the enhancement New feature or request label Dec 30, 2020
@bschmalhofer bschmalhofer added this to the OTOBO 10.1 milestone Dec 30, 2020
@wollmers
Contributor

Data::Dumper does not roundtrip:

helmut@mbp:~$ perl -e 'use utf8; binmode STDOUT, ":utf8"; use Data::Dumper; $s="fo\x{F6}"; print Dumper($s),"\n";'
$VAR1 = 'foö';

helmut@mbp:~$ perl -e 'use utf8; binmode STDOUT, ":utf8"; use Data::Dumper; $s="foö"; print Dumper($s),"\n";'
$VAR1 = "fo\x{f6}";

The claim in the POD of Kernel::System::Main::Dump() is either misleading or confuses ascii (7-bit) with binary (raw).

This may not be relevant if the same encoding is used for encoding and decoding.

Usage of ascii:

otobo$ grep -Ri -C10 ">Dump(" Kernel/ | grep ascii
Kernel//System/Main.pm-    dump only in ascii characters (> 128 will be marked as \x{..})
Kernel//System/Main.pm-    dump only in ascii characters (> 128 will be marked as \x{..})
Kernel//System/Main.pm-        'ascii', # ascii|binary - default is binary
Kernel//System/DynamicField/Driver/WebService.pm:    push @MD5Strings, $MainObject->Dump( \%Result, 'ascii' );
Kernel//System/Console/Command/Maint/CloudServices/ConnectionCheck.pm-        'ascii',

@bschmalhofer
Contributor Author

bschmalhofer commented Dec 30, 2020

Yes, that's more or less my question. I see nothing wrong with $VAR1 = "fo\x{f6}";. The reason for jumping through hoops to get $VAR1 = 'foö'; escapes me. The second version has the disadvantage of giving different results, depending on whether use utf8 is in effect or not.

Correction: as @wollmers pointed out, both versions give the same string. 'ö' is in the high bit range of iso-8859-1. 'ö' is represented both in iso-8859-1 and utf8 by the byte F6.

Correction of the correction: ö has the Unicode code point U+00F6 and is represented by the single byte F6 in iso-8859-1. In UTF-8 an ö is represented as two bytes:

 uni -8 ö
ö - U+000F6 - C3 B6 - LATIN SMALL LETTER O WITH DIAERESIS
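
The difference between code point and encoded bytes can also be checked mechanically. The following is a minimal sketch in Python rather than Perl, chosen only for illustration; the variable names are made up for this example and nothing here is part of OTOBO:

```python
# Illustration only: the code point of 'ö' versus its encoded bytes.
char = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS

code_point = ord(char)                  # Unicode code point, independent of any encoding
latin1_bytes = char.encode("latin-1")   # ISO-8859-1: a single byte
utf8_bytes = char.encode("utf-8")       # UTF-8: two bytes

print(f"U+{code_point:04X}")            # code point in the usual U+XXXX notation
print(latin1_bytes.hex(" ").upper())    # bytes of the ISO-8859-1 encoding
print(utf8_bytes.hex(" ").upper())      # bytes of the UTF-8 encoding
```

The code point and the ISO-8859-1 byte happen to share the value F6, which is what made the first correction look plausible; the UTF-8 form is the two-byte sequence shown by uni above.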

@wollmers
Contributor

For me the behaviour of Data::Dumper looks "random". Both $s="fo\x{F6}" and $s="foö" should interpolate to the same value, and both under use utf8;.

$ perl -e 'use utf8; binmode STDOUT, ":utf8";print "fo\x{F6} eq foö\n" if ("fo\x{F6}" eq "foö")'
foö eq foö
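
As a loose analogy only, the same point can be sketched in Python: an escaped and a literal spelling denote the same string, and an ASCII-only dump and a raw dump both evaluate back to it. Python's ascii() and repr() are stand-ins here, assumed to play roughly the roles of the 'ascii' and 'binary' dump types; they are not Data::Dumper and say nothing about its internals:

```python
import ast

# Two spellings of the same three-character string.
escaped = "fo\xf6"   # written with an escape sequence
literal = "foö"      # written with the literal character

same_string = escaped == literal

# Two textual dumps of that string.
ascii_dump = ascii(literal)  # escapes every non-ASCII character
raw_dump = repr(literal)     # keeps printable non-ASCII characters as they are

# Both dumps are valid string literals that evaluate back to the original value.
from_ascii = ast.literal_eval(ascii_dump)
from_raw = ast.literal_eval(raw_dump)

print(same_string, ascii_dump, raw_dump, from_ascii == from_raw == literal)
```

The ASCII-only dump has the practical advantage that its text does not depend on how the surrounding file or terminal is encoded.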

Personally I don't use Data::Dumper in serious software--too many bad experiences. During debugging it's ok.

@bschmalhofer bschmalhofer modified the milestones: OTOBO 10.1, Wishlist Mar 7, 2022
@bschmalhofer bschmalhofer changed the title Review the option 'binary' in Kernel::System::Main::Dump() Review the option Type => 'binary' in Kernel::System::Main::Dump() Mar 24, 2023
@bschmalhofer
Contributor Author

Let's keep this on the wishlist. Changing the behavior of Dump() might lead to side effects. New developments, however, should use Kernel::System::YAML or pass Type => 'ascii'.

bschmalhofer added a commit that referenced this issue Mar 24, 2023
Try to explain the effects of the extra arguments 'binary' and 'ascii'.
bschmalhofer added a commit that referenced this issue Mar 24, 2023
@bschmalhofer
Contributor Author

Closing, as the convention is that issues in the wishlist should be closed.
