Exploring the relationship between 16-bit Java chars and UTF-8 bytes. The APIs make it simple to convert between Java Strings and byte arrays by specifying the char encoding, but what if you're writing a font rendering engine? In this case, to properly support UTF-8, you need to map code points to glyphs.

bryanwagner/so-you-think-you-know-java-strings

so-you-think-you-know-java-strings

Exploring the relationship between 16-bit char types and UTF-8 bytes in Java. The APIs make it simple to convert between Java Strings and byte arrays by specifying the character encoding. For web development and text processing you usually don't need to think about character encoding details, but what if you're writing a font rendering engine? In this case, to properly support Unicode and UTF-8, you need to map code points to glyphs.
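As a minimal sketch of the String-to-bytes conversion described above (not code from this repository): the class and method names below are hypothetical, and only the standard `String.getBytes(Charset)` and `new String(byte[], Charset)` calls are used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class EncodingRoundTrip {

    // Encode a String to bytes, naming the character encoding explicitly.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes back into a String.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "h\u00E9llo"; // U+00E9 is 'e' with an acute accent
        byte[] utf8 = toUtf8(text);

        System.out.println(text.length());               // 5 chars
        System.out.println(utf8.length);                 // 6 bytes: U+00E9 takes two bytes in UTF-8
        System.out.println(fromUtf8(utf8).equals(text)); // true
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which can differ between machines.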

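Since Unicode code points range from U+0000 to U+10FFFF and need up to 21 bits, they do not all fit in a 16-bit Java char. Java Strings are UTF-16, so a Supplementary Character (a code point above U+FFFF) is stored as a pair of surrogate chars. To handle Unicode Supplementary Characters, Java added code point-based APIs in J2SE 5.0 (JDK 1.5). This demo was put together after reading the excellent article Supplementary Characters in the Java Platform.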
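To make the char versus code point distinction concrete, here is a small sketch, again with hypothetical class and method names rather than code taken from this repository. It walks a String by code point using `String.codePointAt` and `Character.charCount`, both part of the code point APIs added in JDK 1.5. A font renderer would look up a glyph for each resulting code point; that lookup is not shown here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CodePointDemo {

    // Collect the code points of a String, stepping over surrogate pairs.
    static int[] codePointsOf(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);     // reads one or two chars
            result[n++] = cp;
            i += Character.charCount(cp);  // 2 for supplementary characters, else 1
        }
        return result;
    }

    public static void main(String[] args) {
        // 'A' followed by U+1F600, written as its surrogate pair D83D DE00.
        String s = "A\uD83D\uDE00";

        System.out.println(s.length());                                // 3 chars
        System.out.println(s.codePointCount(0, s.length()));           // 2 code points
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 5 bytes (1 + 4)

        for (int cp : codePointsOf(s)) {
            System.out.printf("U+%04X%n", cp); // U+0041, then U+1F600
        }
    }
}
```

Iterating with `charAt` instead would yield the two surrogate halves as separate values, neither of which is a valid code point on its own.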
