This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Full-width character block (CJK) breaks rendering #931

Closed
friedbunny opened this issue Feb 28, 2015 · 13 comments · Fixed by #946

Comments

@friedbunny
Contributor

The fontstack can't serve the full-width Latin characters that are unfortunately common in Chinese, Japanese, and Korean (CJK) text, so entire sections of the map fail to draw when they're used. This is especially problematic when using country-localized labels (i.e., name and not name_en) in a style.

EDIT: Kanji and other CJK character sets render without any problems, only this specific block fails.

Currently, attempting to use these full-width characters throws an exception when trying to load the remote glyph PBF:

[ERROR] ParseTile: Parsing [8/227/99] failed: [ERROR] failed to load glyphs: HTTP status code 400; 
Raw response: {"message":"Invalid Range"}; 
URL: https://api.tiles.mapbox.com/v4/fontstack/Open%20Sans%20Regular%2c%20Arial%20Unicode%20MS%20Regular/65280-65533.pbf?access_token=pk.eyJ1IjoiZnJpZWRidW5ueSIsImEiOiJUbFBmei1zIn0.lZVQjkwWlc07RS4oEQITqg

Here is an example of full-width numbers on a Japanese highway on OSM. The left section of the tunnel is correct (国道41号), while the right has full-width numbers (国道４１号).

Ultimately, the fontstack needs to be able to serve full-width latin characters or MBGL should replace them with regular latin, half-width characters. #930 only temporarily fixes the drawing issues by returning a common glyph range, which means the full-width characters don't get drawn. (Mapbox-GL-JS currently has a similar behavior.)

Cheers!
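
The second option above, replacing full-width Latin characters with their regular half-width counterparts, could be sketched roughly as follows. This is only an illustration, not code from MBGL: `toHalfwidth` and `normalizeFullwidth` are hypothetical names, and the sketch assumes labels are already available as UTF-32. It relies on the fact that the fullwidth ASCII variants U+FF01..U+FF5E sit at a fixed offset of 0xFEE0 from U+0021..U+007E.

```cpp
#include <string>

// Hypothetical helper (not part of mbgl): map one fullwidth ASCII variant
// (U+FF01..U+FF5E) to its ordinary ASCII counterpart (U+0021..U+007E).
// Every other code point is returned unchanged.
constexpr char32_t toHalfwidth(char32_t c) {
    return (c >= 0xFF01 && c <= 0xFF5E) ? static_cast<char32_t>(c - 0xFEE0) : c;
}

// Apply the mapping to a whole UTF-32 label (hypothetical wrapper).
inline std::u32string normalizeFullwidth(std::u32string label) {
    for (char32_t& c : label) {
        c = toHalfwidth(c);
    }
    return label;
}
```

With this mapping, the full-width digits in the highway label above (U+FF14 and U+FF11) would become plain `4` and `1`, while the kanji pass through untouched. Half-width katakana and Hangul are deliberately left alone here, since mapping those is a different problem.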

@friedbunny
Contributor Author

BTW: To reproduce here on native, just zoom around Japan for a while — you'll notice the above error and lots of unrendered blocks, even with the stock name_en.

@kkaefer
Member

kkaefer commented Mar 2, 2015

To clarify, does this only affect full-width latin characters, like the numbers used on highways, or does it affect all full-width characters? It seems like regular full-width (like 国道) are rendering just fine.

@friedbunny
Contributor Author

It only affects the characters and symbols in range 65280-65535 (i.e., at the end of Unicode's Basic Multilingual Plane), so far as I can tell, sorry. I've edited the original post to emphasize that kanji/etc aren't broken.

This "Halfwidth and Fullwidth Forms" block includes the full-width Latin alphabet, digits, basic symbols, and half-width Japanese and Korean syllabaries. The half-width Japanese and Korean aren't as common, relatively, but I'm sure somebody has tried to use them on OSM, too.
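
As a rough map of what lives in that block, here is a small classifier sketch. The names `FormKind` and `classifyForm` are made up for illustration, and the sub-range boundaries are my reading of the Unicode code chart for U+FF00..U+FFEF, so treat them as approximate (the Hangul sub-range in particular has unassigned gaps).

```cpp
// Hypothetical categories for code points in "Halfwidth and Fullwidth Forms".
enum class FormKind {
    Outside,          // not in U+FF00..U+FFEF
    FullwidthAscii,   // U+FF01..U+FF5E: fullwidth Latin letters, digits, punctuation
    HalfwidthKana,    // U+FF61..U+FF9F: halfwidth CJK punctuation and katakana
    HalfwidthHangul,  // U+FFA0..U+FFDC: halfwidth Hangul letters (with gaps)
    OtherInBlock      // remaining code points, e.g. fullwidth/halfwidth symbols
};

// Written as a single return expression so it stays valid C++11 constexpr.
constexpr FormKind classifyForm(char32_t c) {
    return (c < 0xFF00 || c > 0xFFEF)     ? FormKind::Outside
         : (c >= 0xFF01 && c <= 0xFF5E)   ? FormKind::FullwidthAscii
         : (c >= 0xFF61 && c <= 0xFF9F)   ? FormKind::HalfwidthKana
         : (c >= 0xFFA0 && c <= 0xFFDC)   ? FormKind::HalfwidthHangul
                                          : FormKind::OtherInBlock;
}
```

Note that the 256-glyph range 65280-65535 is slightly wider than this block: its last 16 code points (U+FFF0..U+FFFF) belong to the Specials block, which this sketch reports as `Outside`.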

@friedbunny friedbunny changed the title Full-width characters (CJK) break rendering Full-width character block (CJK) breaks rendering Mar 2, 2015
@jfirebaugh
Contributor

It looks like there are multiple issues here.

The range in the URL above -- 65280-65533 -- is indeed invalid from the fontstack API's perspective. It should be 65280-65535. So gl-native is somehow generating invalid ranges.

But, when I feed the correct range to the fontstack, it produces a 500 error. So there's probably a boundary condition error there too.
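
To make the "Invalid Range" part concrete, here is a guess at the kind of check involved. I don't know how the fontstack API actually validates ranges; `isAlignedGlyphRange` is a hypothetical function that simply assumes a valid range is a 256-aligned window of 256 code points inside the BMP.

```cpp
#include <cstdint>

// Hypothetical check (the real fontstack validation is not shown in this
// thread): a range is accepted only if it starts on a multiple of 256,
// spans exactly 256 code points, and ends at or below U+FFFF.
constexpr bool isAlignedGlyphRange(std::uint32_t start, std::uint32_t end) {
    return start % 256 == 0 && end == start + 255 && end <= 0xFFFF;
}
```

Under that assumption, 65280-65533 is rejected because it is two code points short, while 65280-65535 passes.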

@friedbunny
Contributor Author

mbgl/text/glyph.cpp purposely clips the range short:

// Note: this only works for the BMP
GlyphRange getGlyphRange(char32_t glyph) {
    unsigned start = (glyph/256) * 256;
    unsigned end = (start + 255);
    if (start > 65280) start = 65280;
    if (end > 65533) end = 65533;
    return { start, end };
}

@mikemorris
Contributor

Hah, this could very well be a typo - possible that the intended max extent should actually be 0xFFFF or 65535 to properly contain the Halfwidth and Fullwidth Forms and Specials blocks. We currently aren't making any effort to support blocks in the 0x10000 to 0xE007F range because of gaps, although perhaps we should?

Check out node-fontnik for the code that's actually pulling these glyphs from Freetype for more details @friedbunny.
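
If the 65533 cap really is a typo, a version of the range calculation that ends the last block at 0xFFFF might look like the sketch below. This is not the actual fix: `GlyphRangeSketch` and `glyphRangeFor` are stand-in names so as not to imply anything about the real `GlyphRange` type, and clamping code points above the BMP into the last block is an assumption carried over from the existing "only works for the BMP" behavior.

```cpp
#include <cstdint>

// Stand-in for the real GlyphRange type; the field names are assumptions.
struct GlyphRangeSketch {
    std::uint32_t first;
    std::uint32_t second;
};

// Clamp anything beyond the BMP to U+FFFF so it lands in the last block.
constexpr std::uint32_t clampToBmp(char32_t glyph) {
    return glyph > 0xFFFFu ? 0xFFFFu : static_cast<std::uint32_t>(glyph);
}

// Return the 256-aligned window containing the glyph. Because the start is
// always a multiple of 256 at or below 65280, the end never exceeds 65535
// and no extra cap is needed.
constexpr GlyphRangeSketch glyphRangeFor(char32_t glyph) {
    return { (clampToBmp(glyph) / 256) * 256,
             (clampToBmp(glyph) / 256) * 256 + 255 };
}
```

For a full-width digit such as U+FF14 this gives 65280-65535, which matches the range the fontstack is expected to serve.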

@mikemorris
Contributor

@jfirebaugh The 500 error is because a 65280-65535.pbf file doesn't exist. The actual block is 65280-65533.pbf, so the blocks are created invalid too, not just requested invalid. The fontstack API is just (correctly) catching the invalid range before it attempts to actually retrieve the block.

@mikemorris
Contributor

@friedbunny
Contributor Author

Thanks @mikemorris, I didn't know where fontstack came from, repo-wise. That looks extremely promising. 👍

@jfirebaugh
Contributor

@mikemorris So the block is supposed to be 65280-65535 everywhere, and both places where there's a special case for 65533 are wrong?

@mikemorris
Contributor

Yep, correct @jfirebaugh

@friedbunny
Contributor Author

@mikemorris Has mapbox/node-fontnik/pull/72 found its way into production? The fontstack request for this glyph range still returns ERROR 500: Internal Server Error.

@friedbunny
Contributor Author

The above fontstack fix has landed in production and #946 will fix everything here once it's merged, yay!

fixed
