needs encoding conversion using LanguageEncoding in PPDs #1475

Closed
michaelrsweet opened this issue Mar 11, 2006 · 6 comments

@michaelrsweet
Collaborator

Version: 1.2b2
CUPS.org User: kmuto.debian

Some PPDs contain non-Latin strings, written in the encoding named by LanguageEncoding.
The CUPS interfaces use these strings as-is, which causes problems in some environments.

For example, most Japanese PPDs are written in Shift-JIS (LanguageEncoding: JIS83-RKSJ).
To show their option names/values in an EUC-JP or UTF-8 environment, CUPS needs to convert them.
Currently the Web interface runs in UTF-8 and shows a completely
unreadable option parameter screen because of the encoding mismatch.

As far as I know, the following steps are needed to implement this:

  1. CUPS provides a Shift-JIS (JIS83-RKSJ) encoding map.
  2. When CUPS shows a PPD name/variable, it converts from LanguageEncoding to the local encoding.
  3. When CUPS saves a PPD name/variable somewhere, it converts from the local encoding back to LanguageEncoding.
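
The first step above implies some lookup from the PPD's LanguageEncoding value to a character set. A minimal sketch of such a lookup follows. The function name `charset_for_language_encoding` and the charset labels are hypothetical and are not part of the CUPS API; the table is deliberately limited to the two values named in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a CUPS function): map a PPD LanguageEncoding
 * value to a conventional charset label. Returns NULL when the value is
 * not in the table. Only the encodings mentioned in this thread are listed.
 */
const char *
charset_for_language_encoding(const char *language_encoding)
{
  static const struct
  {
    const char *ppd_name;   /* value found in the PPD */
    const char *charset;    /* assumed charset label */
  } map[] =
  {
    { "ISOLatin1",  "ISO-8859-1" },
    { "JIS83-RKSJ", "Shift_JIS"  }
  };
  size_t i;

  assert(language_encoding != NULL);

  for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
    if (strcmp(language_encoding, map[i].ppd_name) == 0)
      return (map[i].charset);

  return (NULL);
}
```

A caller would treat a NULL result as "unknown encoding" and leave the strings untouched rather than guess.
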
@michaelrsweet
Collaborator Author

CUPS.org User: mike

Reassigned to be pri-2 against 1.2; we should be able to do this automatically in ppdOpen2().

@michaelrsweet
Collaborator Author

CUPS.org User: htl10

I am sorry, but the suggestion in the initial post is not entirely clear
about what is local, what is Japanese, what is UTF-8, and so on. While
Japanese is the dominant non-Latin language in the IT world, the
problem is quite general. Without referring to specific languages or
encodings, here is what I understand and what I think needs to
be done:

There are three sources of non-ASCII text:
(1) PPDs from vendors, which sometimes contain localised strings.
(2) CUPS's internal processing in UTF-8, and its presentation
of internal data to the web interface and the user's input from the
web interface, which also naturally tends to be UTF-8.
(3) the user's terminal environment, when running CUPS's
command-line interfaces.

Items (1) and (3) are not controlled by CUPS, so they have to
be converted on the fly (e.g. via libiconv on BSD derivatives,
Mac OS X, or Tru64, or via glibc's built-in iconv),
based on the user's environment in the case of (3) and on specific PPD
fields in the case of (1). For (2) there is actually a choice: whether CUPS
honours the preferred encoding of the browser and converts its
internal data on the fly, or feeds the browser UTF-8
regardless.

I think the initial poster missed a point: the preferred
encoding of the user's browser session may not be the same
as the encoding of the PPDs, so two conversions can be
involved in any case. Simply changing CUPS to use
the PPD's encoding to serve web pages may not be the best route.

@michaelrsweet
Collaborator Author

CUPS.org User: mike

htl10:

First, we only provide UTF-8. Web browsers must support UTF-8 to be standards-compliant, and CUPS only uses UTF-8 internally. The current behavior of not converting UI text strings is an error.

Second, the point is that we will transcode from the PPD encoding to UTF-8, so that we always have UTF-8 (your #1).

Finally, when displaying to the console, we transcode from UTF-8 to the locale-defined character set (your #3), although not by using any of the APIs you named since we can't depend on iconv being available or supporting UTF-8 (yes, really!)
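
To make the "transcode from the PPD encoding to UTF-8" step concrete, here is a minimal sketch for the simplest case, ISO-8859-1, where every byte maps directly to the Unicode code point of the same value. This is not the code in cups/transcode.c; the function name `latin1_to_utf8` and its return convention are assumptions made for illustration. Multi-byte encodings such as Shift-JIS additionally need a lookup table.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only (not the CUPS implementation): transcode a
 * NUL-terminated ISO-8859-1 string to UTF-8.
 *
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if the destination buffer is too small.
 */
int
latin1_to_utf8(char *dest, size_t destsize, const char *src)
{
  size_t used = 0;

  assert(dest != NULL && src != NULL);

  while (*src)
  {
    /* Read the byte as unsigned so values >= 0x80 are not sign-extended */
    unsigned char ch = (unsigned char)*src++;

    if (ch < 0x80)
    {
      /* ASCII passes through unchanged; keep room for the NUL */
      if (used + 1 >= destsize)
        return (-1);

      dest[used++] = (char)ch;
    }
    else
    {
      /* U+0080..U+00FF becomes a two-byte UTF-8 sequence */
      if (used + 2 >= destsize)
        return (-1);

      dest[used++] = (char)(0xC0 | (ch >> 6));
      dest[used++] = (char)(0x80 | (ch & 0x3F));
    }
  }

  if (used >= destsize)
    return (-1);

  dest[used] = '\0';

  return ((int)used);
}
```

For example, the four Latin-1 bytes of "caf\xE9" become the five UTF-8 bytes 63 61 66 C3 A9.
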

@michaelrsweet
Collaborator Author

CUPS.org User: mike

Fixed in Subversion repository.

Please let me know if you run into any problems - seems to work OK for ISOLatin1 and ShiftJIS.

@michaelrsweet
Collaborator Author

CUPS.org User: kmuto.debian

Thanks!

It's very close to a solution.

One problem with the code is that it doesn't apply a 0xff mask to the second byte. The following code (contributed by Fumitoshi Ukai, a Debian
developer) fixes this. I checked that it works perfectly.

--- cups/transcode.c 2006-03-16 10:38:55.000000000 +0900
+++ cups/transcode.c 2006-03-16 11:56:51.000000000 +0900
@@ -966,7 +966,7 @@
   if (!*src)
     return (-1);
 
-  legchar = (legchar << 8) | (cups_vbcs_t)*src++;
+  legchar = (legchar << 8) | (cups_vbcs_t)(*src++ & 0xffU);
 
  /*
   * Convert unknown character to Replacement Character...

@michaelrsweet
Collaborator Author

CUPS.org User: mike

Thanks, that did indeed fix things, although I also needed to apply the fix to the 3 and 4 byte code, too. The test program now works, too.
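
The reason the mask matters can be shown in isolation. When `char` is signed, a trail byte of 0x80 or above is negative, and widening it to an unsigned type sign-extends it, so the OR sets all of the upper bits and destroys the lead byte. The sketch below is not taken from cups/transcode.c: `vbcs_t` is a stand-in for `cups_vbcs_t` (assumed here to be `unsigned long`), the function names are hypothetical, and `signed char` is used explicitly so the behaviour does not depend on the platform's default `char` signedness.

```c
#include <assert.h>

/* Stand-in for cups_vbcs_t; assumed to be an unsigned long */
typedef unsigned long vbcs_t;

/*
 * Unmasked combine, as in the original line of the patch: a negative
 * trail byte is sign-extended when converted to vbcs_t, so every bit
 * above the low 8 ends up set.
 */
vbcs_t
combine_unmasked(vbcs_t lead, signed char trail)
{
  return ((lead << 8) | (vbcs_t)trail);
}

/*
 * Masked combine, as in the fixed line: "& 0xffU" keeps only the low
 * 8 bits of the trail byte before it is widened.
 */
vbcs_t
combine_masked(vbcs_t lead, signed char trail)
{
  return ((lead << 8) | (vbcs_t)(trail & 0xffU));
}
```

With lead byte 0x83 and trail byte 0x94 (which is -108 as a `signed char`), `combine_masked` yields 0x8394, while `combine_unmasked` yields a value whose upper bits are all ones. The same masking applies to each additional byte read for 3-byte and 4-byte characters.
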
