Character literals?

jon80
Posts: 24
Joined: Thu May 28, 2009 10:03 am

Character literals?

Post by jon80 »

Why is the following incorrect syntax?

class CharactersInPlay
{

public static void main (String[] args)
{
char a = 'u\16848'; //I tried escaping i.e. 'u\\16848', and, also other characters
System.out.println(a);
}

}

References
Bamum Supplement Unicode chart at http://www.unicode.org/charts/PDF/U16800.pdf.
SCJP 1.6 Exam Prep by K. Sierra and B. Bates, p. 189.
Jon
ben_josephs
Posts: 2456
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

Java's chars are 16 bits (4 hex digits) wide, enough to hold the characters of the basic multilingual plane.

If you need to use wider ("supplementary") characters you'll have to use surrogate pairs.

Or ints.
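
A minimal sketch of both options, assuming the character wanted is U+16848 from the chart cited in the question. The class and method names are made up for illustration; `Character.toChars` and `StringBuilder.appendCodePoint` are standard library calls available since Java 5.

```java
public class SupplementaryCharDemo {

    // Option 1: keep the code point in an int and convert only when text is needed.
    static String fromCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        int codePoint = 0x16848; // too wide for a single 16-bit char

        // Option 2: a surrogate pair, i.e. two chars that together encode one code point.
        char[] pair = Character.toChars(codePoint);

        System.out.println(pair.length);                  // number of UTF-16 code units
        System.out.println(Integer.toHexString(pair[0])); // high surrogate
        System.out.println(Integer.toHexString(pair[1])); // low surrogate

        // Both routes should yield the same two-char String.
        System.out.println(fromCodePoint(codePoint).equals(new String(pair)));
    }
}
```

Which form is more convenient depends on the API being called: methods that take a `String` or `char[]` want the surrogate pair, while methods such as `Character.isLetter(int)` accept the code point directly.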
jon80
Posts: 24
Joined: Thu May 28, 2009 10:03 am

Post by jon80 »

ben_josephs wrote:Java's chars are 16 bits (4 hex digits) wide, enough to hold the characters of the basic multilingual plane.

If you need to use wider ("supplementary") characters you'll have to use surrogate pairs.

Or ints.
UTF-16 (16-bit Unicode Transformation Format) is a character encoding for Unicode capable of encoding 1,112,064 numbers (called code points) in the Unicode code space from 0 to 0x10FFFF. It produces a variable-length result of either one or two 16-bit code units per code point.
http://en.wikipedia.org/wiki/UTF-16/UCS ... U.2B10FFFF

A code point is a code value that is associated with a character in an encoding scheme. In the Unicode standard, code points are written in hexadecimal and prefixed with U+, such as U+0041 for the code point of the letter A. Unicode has code points that are grouped into 17 code planes.

The first code plane, called the basic multilingual plane, consists of the classic Unicode characters with code points U+0000 to U+FFFF. Sixteen additional planes, with code points U+10000 to U+10FFFF, hold the supplementary characters.

Core Java Vol. I (p. 43)
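
To make the "one or two 16-bit code units" part concrete, here is a rough sketch of the arithmetic UTF-16 uses to split a supplementary code point into its two code units. The class and method names are hypothetical; the result is cross-checked against the standard `Character.toChars`.

```java
import java.util.Arrays;

public class SurrogateMath {

    // Splits a supplementary code point (U+10000..U+10FFFF) into two UTF-16 code units.
    static char[] toSurrogatePair(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;               // leaves a 20-bit value
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF)); // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = toSurrogatePair(0x16848);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // prints "D81A DC48"

        // Cross-check against the library's own conversion.
        System.out.println(Arrays.equals(pair, Character.toChars(0x16848)));
    }
}
```

In real code the library call is the safer choice; the manual version is only there to show where the two values come from.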

Can you provide an example?
Jon
ben_josephs
Posts: 2456
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

Indeed. The page you refer to also explains
Code points from the other planes (called Supplementary Planes) are encoded in UTF-16 by pairs of 16-bit code units called a surrogate pair...

Did you read the whole of that page or the Oracle Java page it refers to?
jon80
Posts: 24
Joined: Thu May 28, 2009 10:03 am

Post by jon80 »

ben_josephs wrote:Indeed. The page you refer to also explains
Code points from the other planes (called Supplementary Planes) are encoded in UTF-16 by pairs of 16-bit code units called a surrogate pair...

Did you read the whole of that page or the Oracle Java page it refers to?
Yes, but I am down with coffee today and require an example, if possible.
Jon
ben_josephs
Posts: 2456
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

How you do it depends on the requirements of whatever function you're going to pass it to.

And I've never done it, so I don't have an example.
jon80
Posts: 24
Joined: Thu May 28, 2009 10:03 am

Post by jon80 »

ben_josephs wrote:How you do it depends on the requirements of whatever function you're going to pass it to.

And I've never done it, so I don't have an example.
Oh, I see. Well, I am reading the document at http://www.unicode.org/versions/Unicode6.1.0/ch02.pdf.
Jon
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Re: Character literals?

Post by MudGuard »

jon80 wrote:Why is the following incorrect syntax?
char a = 'u\16848'; //I tried escaping i.e. 'u\\16848', and, also
First of all, a Unicode escape starts with an escaped u, i.e. the backslash comes before the u:
'\u0064'

Code points above \uFFFF must be encoded in two parts (a surrogate pair).
For details see
http://java.sun.com/developer/technical ... lementary/

(Whether a Java char or a Character can hold Unicode characters above \uFFFF I don't know. I do know they can be used in a String.)
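
On the open question: a `char` is a single 16-bit UTF-16 code unit and `Character` wraps exactly one `char`, so neither can hold U+16848 by itself, but a `String` can. A small sketch, assuming U+16848 is the character the original post was after (the class and constant names are only for illustration):

```java
public class BamumLiteral {

    // U+16848 written as a surrogate pair inside a String literal.
    static final String BAMUM = "\uD81A\uDC48";

    public static void main(String[] args) {
        System.out.println(BAMUM.length());                             // 2 chars (UTF-16 code units)
        System.out.println(BAMUM.codePointCount(0, BAMUM.length()));    // 1 code point
        System.out.println(Integer.toHexString(BAMUM.codePointAt(0)));  // 16848
        System.out.println(Character.isSupplementaryCodePoint(BAMUM.codePointAt(0)));

        // Whether the glyph displays correctly depends on the console encoding and font.
        System.out.println(BAMUM);
    }
}
```

Each escape in the literal is four hex digits, which is why the original `'u\16848'` could not work even with the backslash moved in front of the u.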
jon80
Posts: 24
Joined: Thu May 28, 2009 10:03 am

Re: Character literals?

Post by jon80 »

Okay, I will have a look, thanks.
Jon