Java's char data type is a 16-bit UTF-16 code unit. The Unicode standard
was later extended beyond 16 bits, so a single char can no longer hold
every character. The range of legal code points is now U+0000 to U+10FFFF.
Basic Multilingual Plane
The set of characters from U+0000 to U+FFFF is sometimes referred to as
the Basic Multilingual Plane (BMP).
Supplementary characters
Characters whose code points are greater than U+FFFF are called
supplementary characters.
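The two ranges defined above can be checked with the static helpers on java.lang.Character. The following is only a small sketch: the class name CodePointRangeSketch and the method classify are made up for illustration, while the Character calls are standard JDK methods.

```java
class CodePointRangeSketch {
    // Hypothetical helper: names the range that a code point falls into.
    static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid"; // outside U+0000..U+10FFFF
        }
        return Character.isSupplementaryCodePoint(codePoint) ? "supplementary" : "BMP";
    }

    public static void main(String[] args) {
        int[] samples = { 0x0041, 0xFFFF, 0x10000, 0x10FFFF, 0x110000 };
        for (int cp : samples) {
            System.out.printf("U+%04X -> %s%n", cp, classify(cp));
        }
        // charCount reports how many char values are needed to hold a code point.
        System.out.println(Character.charCount(0x0041));  // 1
        System.out.println(Character.charCount(0x10000)); // 2
    }
}
```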
The Java platform represents Unicode characters in UTF-16 format.
A supplementary character is represented as a pair of char values,
the first from the high-surrogates range (\uD800-\uDBFF) and the second
from the low-surrogates range (\uDC00-\uDFFF).
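To make the surrogate pair concrete, the sketch below splits one supplementary code point by hand and compares the result with Character.toChars. The arithmetic is the standard UTF-16 encoding rule, which the text above does not spell out. The class SurrogatePairSketch and the method toSurrogates are hypothetical names, and U+1F600 is just a sample value.

```java
class SurrogatePairSketch {
    // Hypothetical helper: splits a supplementary code point into its UTF-16 pair.
    static char[] toSurrogates(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                // leaves a 20-bit value
        char high = (char) (0xD800 + (offset >> 10));    // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600; // sample supplementary code point
        char[] manual = toSurrogates(codePoint);
        char[] library = Character.toChars(codePoint);
        System.out.printf("manual:  %04X %04X%n", (int) manual[0], (int) manual[1]);
        System.out.printf("library: %04X %04X%n", (int) library[0], (int) library[1]);
        // Recombining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(manual[0], manual[1]) == codePoint);
    }
}
```

For U+1F600 both lines print D83D DE00: the first value lies in the high-surrogates range and the second in the low-surrogates range.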
Note:
- The methods that only accept a char value cannot support supplementary characters.
- The methods that accept an int value support all Unicode characters, including supplementary characters.
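The difference between the two kinds of methods shows up as soon as a string contains a supplementary character. The sketch below uses U+10400 (a Deseret capital letter) as the sample; the class CharVersusIntSketch and its two helper methods are hypothetical names, while isLetter, charAt, codePointAt and codePointCount are standard JDK methods.

```java
class CharVersusIntSketch {
    // Hypothetical helper: tests the first UTF-16 code unit with the char overload.
    static boolean firstIsLetterByChar(String s) {
        return Character.isLetter(s.charAt(0));
    }

    // Hypothetical helper: tests the first code point with the int overload.
    static boolean firstIsLetterByCodePoint(String s) {
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x10400)); // one supplementary letter

        // char-based view: two code units, and a lone surrogate is not a letter
        System.out.println(s.length());              // 2
        System.out.println(firstIsLetterByChar(s));  // false

        // int-based view: one code point, correctly recognised as a letter
        System.out.println(s.codePointCount(0, s.length())); // 1
        System.out.println(firstIsLetterByCodePoint(s));     // true
    }
}
```

When text may contain supplementary characters, iterating with codePointAt (or String.codePoints() on Java 8 and later) avoids splitting a surrogate pair.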