Various fixups for a better control of the wrapping
Text:
 * Honor the (existing) --neverwrap option to handle every content verbatim.

Core:
 * Add a --wrap-po option to control how the po file is wrapped, and
   thus choose between nicely wrapped files that tend to produce git
   conflicts and ugly files that are easy to automatically deal with.
mquinson committed Apr 1, 2020
1 parent 7b4a6f3 commit 3ae858f
Showing 14 changed files with 395 additions and 60 deletions.
8 changes: 8 additions & 0 deletions NEWS
@@ -12,11 +12,19 @@ Markdown:
* Avoid translating Markdown fenced code block info string (GitHub's #194)
* List Markdown fenced code block info string as text type (GitHub's #195)
* Support YAML Front Matter (GitHub's #196). This requires YAML::Tiny.

Text:
* Honor the (existing) --neverwrap option to handle every content verbatim.

po4a tool:
* Pass --add-location=file to msgmerge when receiving option porefs.
(requires gettext >= 0.19 -- June 2014)

Core:
* Add a --wrap-po option to control how the po file is wrapped, and
thus choose between nicely wrapped files that tend to produce git
conflicts and ugly files that are easy to automatically deal with.

Documentation:
* Various cleanups by Golubev Alexander (GitHub's #190 & #191)

10 changes: 4 additions & 6 deletions lib/Locale/Po4a/Common.pm
@@ -18,8 +18,7 @@ Locale::Po4a::Common - common parts of the po4a scripts and utils
Locale::Po4a::Common contains common parts of the po4a scripts and some useful
functions used along the other modules.
In order to use Locale::Po4a programatically, one may want to disable
the use of Text::WrapI18N, by writing something like:
If needed, you can disable the use of Text::WrapI18N as such:
use Locale::Po4a::Common qw(nowrapi18n);
use Locale::Po4a::Text;
@@ -28,9 +27,8 @@ instead of:
use Locale::Po4a::Text;
Ordering is important here: as most Locale::Po4a modules themselves
load Locale::Po4a::Common, the first time this module is loaded
determines whether Text::WrapI18N is used.
The ordering is important here: as most Locale::Po4a modules load themselves
Locale::Po4a::Common, the first time this module is loaded determines whether Text::WrapI18N is used.
=cut

@@ -106,7 +104,7 @@ sub show_version {
print sprintf(gettext(
"%s version %s.\n".
"Written by Martin Quinson and Denis Barbier.\n\n".
"Copyright © 2002-2018 Software in the Public Interest, Inc.\n".
"Copyright © 2002-2020 Software in the Public Interest, Inc.\n".
"This is free software; see source code for copying\n".
"conditions. There is NO warranty; not even for\n".
"MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
74 changes: 49 additions & 25 deletions lib/Locale/Po4a/Po.pm
@@ -65,6 +65,22 @@ B<msgmerge>). This option will become the default in a future release, because
it is more sensible. The B<nowrap> option is available so that users who want
to keep the old behavior can do so.
=item B<--wrap-po> B<no>|B<newlines>|I<number> (default: 76)
Specify how the po file should be wrapped. This gives the choice between files
that are nicely wrapped but could lead to git conflicts, and files that are
easier to handle automatically, but harder to read for humans.
Historically, the gettext suite has reformatted the po files at the 77th column
for cosmetics. This option specifies the behavior of po4a. If set to a numerical
value, po4a will wrap the po file after this column and after newlines in the
content. If set to B<newlines>, po4a will only split the msgid and msgstr after
newlines in the content. If set to B<no>, po4a will not wrap the po file at all.
The wrapping of the reference comments is controlled by the B<--porefs> option.
Note that this option has no impact on how the msgid and msgstr are wrapped, ie
on how newlines are added to the content of these strings.
=item B<--msgid-bugs-address> I<email@address>
Set the report address for msgid bugs. By default, the created POT files
@@ -174,26 +190,31 @@ sub initialize {
my $time = time;
my $date = strftime("%Y-%m-%d %H:%M", localtime($time)) . timezone($time);
chomp $date;
# $options = ref($options) || $options;

$self->{options}{'porefs'}= 'full,nowrap';
$self->{options}{'msgid-bugs-address'}= undef;
$self->{options}{'copyright-holder'}= "Free Software Foundation, Inc.";
$self->{options}{'package-name'}= "PACKAGE";
$self->{options}{'package-version'}= "VERSION";
$self->{options}{'wrap-po'} = 76;
foreach my $opt (keys %$options) {
# print STDERR "$opt: ".(defined($options->{$opt})?$options->{$opt}:"(undef)")."\n";
if ($options->{$opt}) {
die wrap_mod("po4a::po",
dgettext ("po4a", "Unknown option: %s"), $opt)
die wrap_mod("po4a::po", dgettext ("po4a", "Unknown option: %s"), $opt)
unless exists $self->{options}{$opt};
$self->{options}{$opt} = $options->{$opt};
}
}
$self->{options}{'wrap-po'} =~ /^(no|newlines|\d+)$/ ||
die wrap_mod("po4a::po",
dgettext ("po4a", "Invalid value for option 'wrap-po' ('%s' is not 'no' nor 'newlines' nor a number)"),
$self->{options}{'wrap-po'});

$self->{options}{'porefs'} =~ /^(full|counter|noline|file|none|never)(,(no)?wrap)?$/ ||
die wrap_mod("po4a::po",
dgettext ("po4a",
"Invalid value for option 'porefs' ('%s' is ".
"not one of 'full', 'counter', 'file' or 'never')"),
"not one of 'full', 'counter', 'noline', 'file' or 'never', optionally followed by ',wrap' or ',nowrap')"),
$self->{options}{'porefs'});
$self->{options}{'porefs'} =~ s/noline/file/; # backward compat. 'file' used to be called 'noline'.
$self->{options}{'porefs'} =~ s/none/never/; # backward compat. 'never' used to be called 'none'.
@@ -440,7 +461,7 @@ sub write{
if defined($self->{header_comment}) && length($self->{header_comment});

print $fh "msgid \"\"\n";
print $fh "msgstr ".quote_text($self->{header})."\n\n";
print $fh "msgstr ".quote_text($self->{header}, $self->{options}{'wrap-po'})."\n\n";


my $buf_msgstr_plural; # Used to keep the first msgstr of plural forms
@@ -490,25 +511,25 @@ sub write{
if ($self->get_charset =~ /^utf-8$/i) {
my $msgstr = Encode::decode_utf8($self->{po}{$msgid}{'msgstr'});
$msgid = Encode::decode_utf8($msgid);
$output .= Encode::encode_utf8("msgid ".quote_text($msgid)."\n");
$buf_msgstr_plural = Encode::encode_utf8("msgstr[0] ".quote_text($msgstr)."\n");
$output .= Encode::encode_utf8("msgid ".quote_text($msgid, $self->{options}{'wrap-po'})."\n");
$buf_msgstr_plural = Encode::encode_utf8("msgstr[0] ".quote_text($msgstr, $self->{options}{'wrap-po'})."\n");
} else {
$output = "msgid ".quote_text($msgid)."\n";
$buf_msgstr_plural = "msgstr[0] ".quote_text($self->{po}{$msgid}{'msgstr'})."\n";
$output = "msgid ".quote_text($msgid, $self->{options}{'wrap-po'})."\n";
$buf_msgstr_plural = "msgstr[0] ".quote_text($self->{po}{$msgid}{'msgstr'}, $self->{options}{'wrap-po'})."\n";
}
} elsif ($self->{po}{$msgid}{'plural'} == 1) {
# TODO: there may be only one plural form
if ($self->get_charset =~ /^utf-8$/i) {
my $msgstr = Encode::decode_utf8($self->{po}{$msgid}{'msgstr'});
$msgid = Encode::decode_utf8($msgid);
$output = Encode::encode_utf8("msgid_plural ".quote_text($msgid)."\n");
$output = Encode::encode_utf8("msgid_plural ".quote_text($msgid, $self->{options}{'wrap-po'})."\n");
$output .= $buf_msgstr_plural;
$output .= Encode::encode_utf8("msgstr[1] ".quote_text($msgstr)."\n");
$output .= Encode::encode_utf8("msgstr[1] ".quote_text($msgstr, $self->{options}{'wrap-po'})."\n");
$buf_msgstr_plural = "";
} else {
$output = "msgid_plural ".quote_text($msgid)."\n";
$output = "msgid_plural ".quote_text($msgid, $self->{options}{'wrap-po'})."\n";
$output .= $buf_msgstr_plural;
$output .= "msgstr[1] ".quote_text($self->{po}{$msgid}{'msgstr'})."\n";
$output .= "msgstr[1] ".quote_text($self->{po}{$msgid}{'msgstr'}, $self->{options}{'wrap-po'})."\n";
}
} else {
die wrap_msg(dgettext("po4a","Can't write PO files with more than two plural forms."));
@@ -517,11 +538,11 @@ sub write{
if ($self->get_charset =~ /^utf-8$/i) {
my $msgstr = Encode::decode_utf8($self->{po}{$msgid}{'msgstr'});
$msgid = Encode::decode_utf8($msgid);
$output .= Encode::encode_utf8("msgid ".quote_text($msgid)."\n");
$output .= Encode::encode_utf8("msgstr ".quote_text($msgstr)."\n");
$output .= Encode::encode_utf8("msgid ".quote_text($msgid, $self->{options}{'wrap-po'})."\n");
$output .= Encode::encode_utf8("msgstr ".quote_text($msgstr, $self->{options}{'wrap-po'})."\n");
} else {
$output .= "msgid ".quote_text($msgid)."\n";
$output .= "msgstr ".quote_text($self->{po}{$msgid}{'msgstr'})."\n";
$output .= "msgid ".quote_text($msgid, $self->{options}{'wrap-po'})."\n";
$output .= "msgstr ".quote_text($self->{po}{$msgid}{'msgstr'}, $self->{options}{'wrap-po'})."\n";
}
}

@@ -1295,13 +1316,13 @@ sub push_raw {
if ( defined $msgstr
and defined $self->{po}{$msgid}{'msgstr'}
and $self->{po}{$msgid}{'msgstr'} ne $msgstr) {
my $txt=quote_text($msgid);
my $txt=quote_text($msgid, $self->{options}{'wrap-po'});
my ($first,$second)=
(format_comment(". ",$self->{po}{$msgid}{'reference'}).
quote_text($self->{po}{$msgid}{'msgstr'}),
quote_text($self->{po}{$msgid}{'msgstr'}, $self->{options}{'wrap-po'}),

format_comment(". ",$reference).
quote_text($msgstr));
quote_text($msgstr, $self->{options}{'wrap-po'}));

if ($keep_conflict) {
if ($self->{po}{$msgid}{'msgstr'} =~ m/^#-#-#-#-# .* #-#-#-#-#\\n/s) {
@@ -1572,14 +1593,17 @@ sub escape_text {
}

# put quotes around the string on each lines (without escaping it)
# It does also normalize the text (ie, make sure its representation is wraped
# It does also normalize the text (ie, make sure its representation is wrapped
# on the 80th char, but without changing the meaning of the string)
sub quote_text {
my $string = shift;
my $do_wrap = shift; # either 'no' or 'newlines', or column at which we should wrap

return '""' unless defined($string) && length($string);

print STDERR "\nquote [$string]====" if $debug{'quote'};
return "\"$string\"" if ($do_wrap eq 'no');

print STDERR "\nquote $do_wrap [$string]====" if $debug{'quote'};
# break lines on newlines, if any
# see unescape_text for an explanation on \G
$string =~ s/( # $1:
@@ -1588,7 +1612,8 @@ sub quote_text {
(\\\\)* # followed by any even number of '\'
\\n) # and followed by an escaped newline
/$1\n/sgx; # single string, match globally, allow comments
$string = wrap($string);

$string = wrap($string, $do_wrap) if ($do_wrap ne 'newlines');
my @string = split(/\n/,$string);
$string = join ("\"\n\"",@string);
$string = "\"$string\"";
@@ -1633,8 +1658,7 @@ sub canonize {
return $text;
}

# wraps the string. We don't use Text::Wrap since it mangles whitespace at
# the end of splited line
# wraps the string. We don't use Text::Wrap since it mangles whitespace at the end of the split line
sub wrap {
my $text=shift;
return "0" if ($text eq '0');
38 changes: 20 additions & 18 deletions lib/Locale/Po4a/Text.pm
@@ -156,11 +156,11 @@ my %control = ();

=item B<neverwrap>
Prevent po4a from wrapping lines.
Prevent po4a from wrapping any lines. This means that every content is handled verbatim, even simple paragraphs.
=cut

my $neverwrap = 0;
my $defaultwrap = 1;

my $parse_func = \&parse_fallback;

@@ -211,7 +211,7 @@ sub initialize {
}

if (defined $options{'neverwrap'}) {
$neverwrap = 1;
$defaultwrap = 0;
}

if (defined $options{'debianchangelog'}) {
@@ -247,14 +247,14 @@ sub parse_fallback {
# Break paragraphs on lines containing only spaces
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1 unless defined($self->{verbatim});
$wrapped_mode = $defaultwrap unless defined($self->{verbatim});
$self->pushline($line."\n");
undef $self->{controlkey};
} elsif ($line =~ /^-- $/) {
# Break paragraphs on email signature hint
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$self->pushline($line."\n");
} elsif ( $line =~ /^=+$/
or $line =~ /^_+$/
@@ -263,7 +263,7 @@ sub parse_fallback {
$paragraph .= $line."\n";
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
} elsif ($tabs eq "split" and $line =~ m/\t/ and $paragraph !~ m/\t/s) {
$wrapped_mode = 0;
do_paragraph($self,$paragraph,$wrapped_mode);
@@ -272,7 +272,7 @@ sub parse_fallback {
} elsif ($tabs eq "split" and $line !~ m/\t/ and $paragraph =~ m/\t/s) {
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph = "$line\n";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
} else {
if ($line =~ /^\s/) {
# A line starting by a space indicates a non-wrap
@@ -375,7 +375,7 @@ sub parse_control {
}
$self->pushline("$tag: $t\n");
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$self->{bullet} = "";
$self->{indent} = " ";
} elsif ($line eq " .") {
@@ -445,7 +445,7 @@ sub parse_markdown_bibliographic_information {
($nextline, $nextref) = $self->shiftline();
}
# Now the title should be complete, give it to translation.
my $t = $self->translate($title, $ref, "Pandoc title block", "wrap" => 1);
my $t = $self->translate($title, $ref, "Pandoc title block", "wrap" => $defaultwrap);
$t = Locale::Po4a::Po::wrap($t);
my $first_line = 1;
foreach my $translated_line (split /\n/, $t) {
@@ -699,7 +699,7 @@ sub parse_markdown {
# Add the newline again for the output
$self->pushline($t . "\n");
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$self->pushline(($level x length($t))."\n");
} elsif ($line =~ m/^(#{1,6})( +)(.*?)( +\1)?$/) {
my $titlelevel1 = $1;
@@ -715,7 +715,7 @@ sub parse_markdown {
"Title $titlelevel1",
"wrap" => 0);
$self->pushline($titlelevel1.$titlespaces.$t.$titlelevel2."\n");
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
} elsif ($line =~ /^[ ]{0,3}([*_-])\s*(?:\1\s*){2,}$/) {
# Horizontal rule
do_paragraph($self,$paragraph,$wrapped_mode);
@@ -750,7 +750,7 @@ sub parse_markdown {
# Avoid translating Markdown lines containing only markup
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$self->pushline("$line\n");
} elsif ($line =~ /^\s*\[\[\!\S[^\]]*\]\]\s*$/) { # sole macro
# Preserve some Markdown markup as a single line
@@ -762,7 +762,7 @@ sub parse_markdown {
# Markdown markup needing separation _before_ this line
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="$line\n";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
} else {
return parse_fallback($self,$line,$ref,$paragraph,$wrapped_mode,$expect_header,$end_of_paragraph);
}
@@ -773,7 +773,7 @@ sub parse {
my $self = shift;
my ($line,$ref);
my $paragraph="";
my $wrapped_mode = 1;
my $wrapped_mode = $defaultwrap;
my $expect_header = 1;
my $end_of_paragraph = 0;
($line,$ref)=$self->shiftline();
@@ -785,7 +785,7 @@ sub parse {
$file = $1;
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$expect_header = 1;
}

@@ -818,7 +818,7 @@ sub parse {
if ($end_of_paragraph) {
do_paragraph($self,$paragraph,$wrapped_mode);
$paragraph="";
$wrapped_mode = 1;
$wrapped_mode = $defaultwrap;
$end_of_paragraph = 0;
}
($line,$ref)=$self->shiftline();
@@ -833,7 +833,7 @@ sub do_paragraph {
my $type = shift || $self->{type} || "Plain text";
return if ($paragraph eq "");

$wrap = 0 if $neverwrap;
$wrap = 0 unless $defaultwrap;

# DEBUG
# my $b;
@@ -873,7 +873,7 @@ TEST_BULLET:
my $trans = $self->translate($text,
$self->{ref},
"Bullet: '$indent1$bullet'",
"wrap" => 1,
"wrap" => $defaultwrap,
"wrapcol" => - (length $indent2));
$trans =~ s/^/$indent1$bullet/s;
$trans =~ s/\n(.)/\n$indent2$1/sg;
@@ -926,6 +926,8 @@ Tested successfully on simple text files and NEWS.Debian files.
Copyright © 2005-2008 Nicolas FRANÇOIS <nicolas.francois@centraliens.net>.
Copyright © 2008-2009, 2018 Jonas Smedegaard <dr@jones.dk>.
Copyright © 2020 Martin Quinson <mquinson#debian.org>.
This program is free software; you may redistribute it and/or modify it
under the terms of GPL (see the COPYING file).
$

Fat-Zer Apr 1, 2020

Is this dollar sign intentional? Shouldn't there be a =cut here?

mquinson Apr 2, 2020

That's indeed a typo... Fixed, now.
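
For readers skimming the Po.pm hunks, here is a minimal sketch of how the three wrap-po modes ('no', 'newlines', or a column number) change the quoted output. It is not the code from this commit: `sketch_quote_text` is a hypothetical stand-in for `quote_text`, and the column mode uses a naive word wrap in place of po4a's own `wrap()` routine. Only the newline-splitting regex is taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for Locale::Po4a::Po::quote_text.
# $do_wrap is 'no', 'newlines', or the column at which to wrap.
sub sketch_quote_text {
    my ($string, $do_wrap) = @_;

    return '""' unless defined($string) && length($string);

    # 'no': emit the whole string on one line, untouched.
    return "\"$string\"" if $do_wrap eq 'no';

    # Insert a real newline after each escaped newline (the two
    # characters '\' and 'n'), unless the backslash is itself escaped.
    $string =~ s/((\G|[^\\])(\\\\)*\\n)/$1\n/sg;

    my @lines;
    for my $chunk (split /\n/, $string) {
        if ($do_wrap eq 'newlines') {
            # 'newlines': split only where the content has newlines.
            push @lines, $chunk;
            next;
        }
        # Numeric value: naive word wrap (an assumption of this sketch,
        # not po4a's algorithm). Break after a space once adding the
        # next word would exceed the column limit.
        my $line = '';
        for my $word (split /(?<= )/, $chunk) {
            if (length($line) && length($line) + length($word) > $do_wrap) {
                push @lines, $line;
                $line = '';
            }
            $line .= $word;
        }
        push @lines, $line if length $line;
    }
    return '"' . join("\"\n\"", @lines) . '"';
}

# Show the same msgid under each mode.
my $msgid = 'First line of text\nSecond line that is somewhat longer';
for my $mode ('no', 'newlines', 20) {
    print "wrap-po=$mode:\n", sketch_quote_text($msgid, $mode), "\n\n";
}
```

The early return for 'no' mirrors the one added to `quote_text` in the diff: because it happens before any newline is inserted, each msgid and msgstr stays on a single line, which is what makes those files easier to merge automatically.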