needs encoding conversion using LanguageEncoding in PPDs #1475

michaelrsweet · 2006-03-11T09:27:27Z

Version: 1.2b2
CUPS.org User: kmuto.debian

Some PPDs contain non-Latin strings, written in the encoding named by LanguageEncoding.
The CUPS interfaces use these strings as-is, which causes problems in some environments.

For example, most Japanese PPDs are written in Shift-JIS (LanguageEncoding: JIS83-RKSJ).
To show their option names and values in an EUC-JP or UTF-8 environment, CUPS needs to convert them.
Currently the web interface runs in UTF-8, and the encoding mismatch makes
the option parameters screen absolutely unreadable.

As far as I know, the following is needed to implement this:

CUPS has a Shift-JIS (JIS83-RKSJ) encoding map.
When CUPS shows a PPD name/variable, it converts it from LanguageEncoding to the local encoding.
When CUPS saves a PPD name/variable somewhere, it converts it from the local encoding to LanguageEncoding.
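
Before any conversion can happen, the encoding has to be read from the PPD. The following is a minimal sketch of that first step, assuming the keyword appears on one line as `*LanguageEncoding: value`. The enum and the function name `language_encoding_from_line` are hypothetical and are not part of the CUPS API; only the two values mentioned in this report are recognised.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoding tags for this sketch; not the CUPS enum. */
typedef enum
{
  ENC_UNKNOWN = 0,
  ENC_ISO_LATIN1,
  ENC_SHIFT_JIS
} ppd_text_encoding_t;

/*
 * Parse a PPD line such as "*LanguageEncoding: JIS83-RKSJ" and return the
 * encoding of the PPD's UI strings, or ENC_UNKNOWN for any other line or
 * any value this sketch does not know.
 */
static ppd_text_encoding_t
language_encoding_from_line(const char *line)
{
  static const char keyword[] = "*LanguageEncoding:";
  char              value[32];

  assert(line != NULL);

  if (strncmp(line, keyword, sizeof(keyword) - 1) != 0)
    return (ENC_UNKNOWN);

  /* %31s skips leading whitespace and stops at the next whitespace. */
  if (sscanf(line + sizeof(keyword) - 1, "%31s", value) != 1)
    return (ENC_UNKNOWN);

  if (strcmp(value, "ISOLatin1") == 0)
    return (ENC_ISO_LATIN1);
  if (strcmp(value, "JIS83-RKSJ") == 0)
    return (ENC_SHIFT_JIS);

  return (ENC_UNKNOWN);
}
```

A caller would run this over the PPD header lines and then pick the matching conversion table for every UI string that follows.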

michaelrsweet · 2006-03-11T11:36:10Z

CUPS.org User: mike

Reassigned to be pri-2 against 1.2; we should be able to do this automatically in ppdOpen2().

michaelrsweet · 2006-03-11T23:26:32Z

CUPS.org User: htl10

I am sorry, but the suggestion in the initial post is not entirely clear
about what is local, what is Japanese, what is UTF-8, etc. While
Japanese is the dominant non-Latin language in the IT world, the
problem is quite general. Without referring to specific languages or
encodings, here is what I understand and what I think needs to
be done:

There are three sources of non-ASCII text:
(1) PPDs from vendors, which sometimes contain localised strings.
(2) CUPS's internal processing in UTF-8, its presentation
of internal data to the web interface, and the user's input from the
web interface, which also naturally tends to be UTF-8.
(3) the user's terminal environment, when running the CUPS
command-line interfaces.

Sources (1) and (3) are not controlled by CUPS, so they have to
be converted on the fly (e.g. via libiconv in the case
of BSD derivatives, Mac OS X or Tru64, or via glibc's built-in iconv),
based on the user's environment in the case of (3) and on specific PPD
fields in the case of (1). For (2) there is actually a choice: either CUPS
honours the preferred encoding of the browser and converts its
internal data on the fly, or it feeds the browser UTF-8
regardless.

I think the initial poster missed a point: the preferred
encoding of the user's browser session may not be the same
as the encoding of the PPDs, so there can be two conversions
involved in any case. Simply changing CUPS to use
the PPD's encoding to dispatch web pages may not be the best route.

michaelrsweet · 2006-03-12T00:37:58Z

CUPS.org User: mike

htl10:

First, we only provide UTF-8. Web browsers must support UTF-8 to be standards-compliant, and CUPS only uses UTF-8 internally. The current behavior of not converting UI text strings is an error.

Second, the point is that we will transcode from the PPD encoding to UTF-8, so that we always have UTF-8 (your #1).

Finally, when displaying to the console, we transcode from UTF-8 to the locale-defined character set (your #3), although not by using any of the APIs you named since we can't depend on iconv being available or supporting UTF-8 (yes, really!)
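
To illustrate the PPD-to-UTF-8 step for the simplest case, here is a minimal sketch that transcodes an ISO-8859-1 (ISOLatin1) string to UTF-8 without iconv. It relies on the fact that ISO-8859-1 byte values equal the Unicode code points U+0000 to U+00FF. The function name `latin1_to_utf8` is hypothetical and this is not the CUPS implementation; Shift-JIS would additionally need a mapping table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Transcode a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or -1 if `dest` (of size `maxout`) is too small.
 */
static int
latin1_to_utf8(unsigned char *dest, const char *src, size_t maxout)
{
  size_t out = 0;

  assert(dest != NULL && src != NULL);

  while (*src)
  {
    /* Mask so bytes >= 0x80 are not sign-extended where char is signed. */
    unsigned ch = (unsigned)*src++ & 0xffU;

    if (ch < 0x80)
    {
      /* ASCII: one byte, leaving room for the terminator. */
      if (out + 1 >= maxout)
        return (-1);
      dest[out++] = (unsigned char)ch;
    }
    else
    {
      /* U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx. */
      if (out + 2 >= maxout)
        return (-1);
      dest[out++] = (unsigned char)(0xc0 | (ch >> 6));
      dest[out++] = (unsigned char)(0x80 | (ch & 0x3f));
    }
  }

  if (out >= maxout)
    return (-1);

  dest[out] = '\0';
  return ((int)out);
}
```

For example, the ISO-8859-1 byte 0xE9 (é) becomes the UTF-8 sequence 0xC3 0xA9, so a four-byte input such as "caf\xe9" produces five bytes of output.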

michaelrsweet · 2006-03-15T04:36:35Z

CUPS.org User: mike

Fixed in Subversion repository.

Please let me know if you run into any problems - seems to work OK for ISOLatin1 and ShiftJIS.

michaelrsweet · 2006-03-16T03:02:54Z

CUPS.org User: kmuto.debian

Thanks!

It's very close to a solution.

One problem with the code is that it does not apply the 0xff mask to the 2nd byte of a character. The following patch (contributed by Fumitoshi Ukai, a Debian
developer) fixes this. I checked that it works perfectly.

--- cups/transcode.c 2006-03-16 10:38:55.000000000 +0900
+++ cups/transcode.c 2006-03-16 11:56:51.000000000 +0900
@@ -966,7 +966,7 @@
 if (!*src)
   return (-1);

-legchar = (legchar << 8) | (cups_vbcs_t)*src++;
+legchar = (legchar << 8) | (cups_vbcs_t)(*src++ & 0xffU);

 /*
  * Convert unknown character to Replacement Character...

michaelrsweet · 2006-03-16T03:16:04Z

CUPS.org User: mike

Thanks, that did indeed fix things, although I also needed to apply the fix to the 3- and 4-byte code, too. The test program now works, too.
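
The reason the mask matters for every trailing byte can be shown with a small sketch. On platforms where `char` is signed, a byte of 0x80 or above is negative, and casting it straight to a wider unsigned type sign-extends it and sets all the upper bits. The code below is an illustration only, not the code in cups/transcode.c: `vbcs_t` is a hypothetical stand-in for `cups_vbcs_t`, both function names are made up, and `signed char` is used explicitly so the behaviour is the same everywhere.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long vbcs_t;   /* hypothetical stand-in for cups_vbcs_t */

/* Combine `count` (1 to 4) legacy bytes into one value, masking each byte. */
static vbcs_t
combine_masked(const signed char *src, size_t count)
{
  vbcs_t legchar = 0;

  assert(src != NULL && count >= 1 && count <= 4);

  while (count-- > 0)
    legchar = (legchar << 8) | (vbcs_t)(*src++ & 0xffU);

  return (legchar);
}

/* Same loop without the mask: negative bytes are sign-extended. */
static vbcs_t
combine_unmasked(const signed char *src, size_t count)
{
  vbcs_t legchar = 0;

  assert(src != NULL && count >= 1 && count <= 4);

  while (count-- > 0)
    legchar = (legchar << 8) | (vbcs_t)*src++;

  return (legchar);
}
```

With the Shift-JIS bytes 0x93 0xFA (the character 日), `combine_masked` yields 0x93FA, while `combine_unmasked` yields a value with all upper bits set, which can never match an entry in a conversion table. Because the same loop handles 2, 3 and 4 bytes, masking inside it covers all of the multi-byte cases at once.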

michaelrsweet closed this as completed Mar 16, 2006

michaelrsweet added the priority-low label Mar 17, 2016

michaelrsweet added this to the Stable milestone Mar 17, 2016

michaelrsweet mentioned this issue Mar 17, 2016

PPD encoding ignored in load_ppds #1503

Closed
