Java Character and String Data Type and Operations
In Java, the character data type, char, represents a single character. When a character is assigned to a variable, the character literal is enclosed in single quotation marks. Example:

    char myLetter = 'A';
    char numChar = '4';

The first statement assigns the character A to the char variable myLetter, while the second assigns the digit character 4 to the char variable numChar.

NB: A string literal is enclosed in double quotation marks (" "). A character literal is a single character enclosed in single quotation marks. So in code, "A" is a string and 'A' is a character.
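The difference between the two kinds of literal can be sketched as follows. The class name CharLiteralDemo and the helper sameSingleCharacter are made up for this illustration and are not part of the Java library.

```java
class CharLiteralDemo {
    // Hypothetical helper: true when s holds exactly one character equal to c.
    static boolean sameSingleCharacter(String s, char c) {
        return s.length() == 1 && s.charAt(0) == c;
    }

    public static void main(String[] args) {
        char myLetter = 'A';   // character literal: single quotation marks
        char numChar = '4';    // the digit character 4, not the number 4
        String text = "A";     // string literal: double quotation marks

        System.out.println(myLetter);                            // A
        System.out.println(numChar);                             // 4
        System.out.println(sameSingleCharacter(text, myLetter)); // true
    }
}
```

The string "A" and the character 'A' have different types, so they cannot be compared with == directly; the helper reads the string's first character with charAt(0) instead.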

Unicode and ASCII Code in Java
Computers use binary numbers internally (machine language), so a character is stored in the computer as a sequence of 0s and 1s. Converting a character to its binary representation is called encoding, and the encoding scheme defines how each character is encoded.

Java supports Unicode, an encoding scheme established by the Unicode Consortium to support the interchange, processing, and display of written texts in the world's diverse languages. Unicode was originally designed as a 16-bit character encoding, and the primitive data type char was intended to take advantage of this design by providing a simple data type that could hold any character. It turned out, however, that the 65,536 characters possible in a 16-bit encoding were not sufficient to represent all the characters in the world, so the Unicode standard was extended to allow up to 1,112,064 characters. Characters beyond the original 16-bit limit are called supplementary characters. Java has supported them since JDK 1.5, but a supplementary character does not fit in a single char variable; it is represented as a pair of char values. This section considers only the original 16-bit characters, each of which can be stored in a char variable.

A 16-bit Unicode character takes two bytes and is written as \u followed by four hexadecimal digits, running from '\u0000' to '\uFFFF'. For example, the word "welcome" is translated into Chinese using two characters, whose Unicodes are "\u6B22\u8FCE".

DisplayUnicode.java:

    import javax.swing.JOptionPane;

    public class DisplayUnicode {
        public static void main(String[] args) {
            // Message: "welcome" in Chinese followed by the Greek letters alpha, beta, gamma
            JOptionPane.showMessageDialog(null,
                "\u6B22\u8FCE \u03b1 \u03b2 \u03b3",
                "\u6B22\u8FCE Welcome",
                JOptionPane.INFORMATION_MESSAGE);
        }
    }

If no Chinese font is installed on the system, you will not be able to see the Chinese characters. The Unicodes for the Greek letters α, β, and γ are \u03b1, \u03b2, and \u03b3.

Most computers use ASCII (American Standard Code for Information Interchange), a 7-bit encoding scheme for representing all uppercase and lowercase letters, digits, punctuation marks, and control characters. Unicode includes the ASCII code, with '\u0000' to '\u007F' corresponding to the 128 ASCII characters.

You can use ASCII characters such as 'X', '1', and '$' in a Java program as well as Unicodes. For example, the following statements are equivalent:

    char letter = 'A';
    char letter = '\u0041'; // Character A's Unicode is 0041

Both statements assign the character A to the char variable letter.
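The equivalence of a plain character literal and its Unicode escape, and the ASCII range inside Unicode, can be checked with a small sketch. The class name UnicodeEscapeDemo and the helper isAscii are illustrative names assumed here, not library methods.

```java
class UnicodeEscapeDemo {
    // Hypothetical helper: true when c lies in the 128-character ASCII range.
    static boolean isAscii(char c) {
        return c <= '\u007F';
    }

    public static void main(String[] args) {
        char plain = 'A';
        char escaped = '\u0041';                 // same character, written as a Unicode escape

        System.out.println(plain == escaped);    // true
        System.out.println(isAscii('$'));        // true
        System.out.println(isAscii('\u6B22'));   // false: a Chinese character

        String welcome = "\u6B22\u8FCE";         // the two-character Chinese word for "welcome"
        System.out.println(welcome.length());    // 2
    }
}
```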

NB: The increment and decrement operators can also be used on char variables to get the next or preceding Unicode character. Example:

    char ch = 'a';
    System.out.println(++ch); // displays b
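A slightly fuller sketch of stepping through characters is given below. The class CharStepDemo and its methods next and previous are hypothetical names chosen for this example.

```java
class CharStepDemo {
    // Hypothetical helper: returns the character that follows c in Unicode order.
    static char next(char c) {
        return ++c;
    }

    // Hypothetical helper: returns the character that precedes c in Unicode order.
    static char previous(char c) {
        return --c;
    }

    public static void main(String[] args) {
        char ch = 'a';
        System.out.println(++ch);          // b
        System.out.println(--ch);          // a
        System.out.println(next('Y'));     // Z
        System.out.println(previous('5')); // 4
    }
}
```

Because ++ and -- are applied to a char variable, the result is still a char, so no explicit cast is needed.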

Escape Sequences for Special Characters
Java provides escape sequences that allow you to represent special characters. An escape sequence begins with the backslash character (\) followed by a character that has a special meaning to the Java compiler.

Let's print the quoted message below:

    Ahoy, there "Java is fun"

The print statement should be:

    System.out.println("Ahoy, there \"Java is fun\"");
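Beyond the double-quote escape used above, a few other common escape sequences (tab, newline, and backslash) can be tried with the sketch below. The class EscapeDemo and the helper quoted are names assumed for this illustration.

```java
class EscapeDemo {
    // Hypothetical helper: wraps text in double quotation marks using the \" escape.
    static String quoted(String text) {
        return "\"" + text + "\"";
    }

    public static void main(String[] args) {
        System.out.println("Ahoy, there " + quoted("Java is fun")); // Ahoy, there "Java is fun"
        System.out.println("Name\tScore");         // \t inserts a tab between the two words
        System.out.println("line one\nline two");  // \n starts a new line
        System.out.println("C:\\temp");            // \\ prints a single backslash
    }
}
```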

Casting Between char and Numeric Types
1. A char can be cast into any numeric type, and a numeric type can be cast into a char. When an integer is cast into a char, only its lower sixteen bits are used; the rest is ignored. Example:

    char xter = (char)0XAB0041; // the lower 16 bits, hex code 0041, are assigned to xter
    System.out.println(xter);   // xter is character A

2. When a floating-point value is cast into a char, the integral part of the floating-point value is cast into the char.

    char xter = (char)65.25;  // decimal 65 is assigned to xter
    System.out.println(xter); // xter is character A

When a char is cast into a numeric type, the character's Unicode is cast into the specified numeric type.

    int n = (int)'A';      // the Unicode of character A is assigned to n
    System.out.println(n); // n is 65

Implicit casting can be used only if the result of the casting fits into the target variable. Otherwise, explicit casting must be used. For example, the following statements are valid, because the Unicode of 'a' is 97, which fits into both a byte and an int:

    byte b = 'a';
    int i = 'a';

But the following casting is incorrect, because the Unicode \uFFF4 cannot fit into a byte:

    byte b = '\uFFF4';

To force the assignment, use explicit casting:

    byte b = (byte)'\uFFF4';

Any integer literal between 0 and FFFF in hexadecimal can be cast into a character implicitly. Any number not in this range must be cast into a char explicitly.

NB:
1. All numeric operators can be applied to char operands. A char operand is automatically cast into a number if the other operand is a number or a character. If the other operand is a string, the character is concatenated with the string. For example:

    int n = '2' + '3'; // (int)'2' is 50 and (int)'3' is 51
    System.out.println("n is " + n); // n is 101
    int j = 2 + 'a'; // (int)'a' is 97
    System.out.println("j is " + j); // j is 99
    System.out.println(j + " is the Unicode for character " + (char)j); // 99 is the Unicode for character c
    System.out.println("Chapter" + '2'); // Chapter2

2. The Unicodes for lowercase letters are consecutive integers starting from the Unicode for 'a', then 'b', 'c', and so on up to 'z'. The same is true for the uppercase letters. The Unicode for 'a' is greater than the Unicode for 'A', and 'a' - 'A' is the same as 'b' - 'B'. So for a lowercase letter bc, the corresponding uppercase letter is (char)('A' + (bc - 'a')).
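The casting rules and the uppercase formula above can be put together in one runnable sketch. The class CharCastDemo and the method toUpper are hypothetical names; the range check in toUpper is an added assumption so that characters other than 'a' to 'z' are returned unchanged.

```java
class CharCastDemo {
    // Hypothetical helper: converts a lowercase letter to uppercase using the
    // offset from 'a'. Characters outside 'a'..'z' are returned unchanged.
    static char toUpper(char bc) {
        if (bc < 'a' || bc > 'z') {
            return bc;
        }
        return (char) ('A' + (bc - 'a'));
    }

    public static void main(String[] args) {
        char fromInt = (char) 0xAB0041;  // only the lower 16 bits (0041) are kept
        char fromDouble = (char) 65.25;  // only the integral part, 65, is used
        int n = 'A';                     // implicit widening from char to int
        byte b = (byte) '\uFFF4';        // explicit cast keeps only the lower 8 bits

        System.out.println(fromInt);       // A
        System.out.println(fromDouble);    // A
        System.out.println(n);             // 65
        System.out.println(b);             // -12
        System.out.println('2' + '3');     // 101
        System.out.println("Chapter" + '2'); // Chapter2
        System.out.println(toUpper('b'));  // B
    }
}
```

The expression 'A' + (bc - 'a') has type int, which is why the explicit (char) cast is needed before the result can be returned as a character.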

The String Type
The limitation of the char type is that it represents only one character. To represent a string of characters, use the data type called String. Example:

    String message = "Welcome to Java";

The String type is actually a predefined class in the Java library, just like the System class and the JOptionPane class. The String type is not a primitive type. It is known as a reference type. Any Java class can be used as a reference type for a variable.

Java allows the concatenation of strings. The plus sign (+) is the concatenation operator if one of the operands is a string. If one of the operands is a non-string (e.g., a number), the non-string value is converted into a string and concatenated with the other string. Examples:

    // Three strings are concatenated
    String message = "Welcome " + "to " + "Java";

    // String Chapter is concatenated with number 4
    String disp = "Chapter" + 4; // disp becomes Chapter4

    // String Supplement is concatenated with character D
    String s1 = "Supplement" + 'D'; // s1 becomes SupplementD

If neither of the operands is a string, the plus sign (+) is the addition operator, which adds two numbers and returns the sum.

The shorthand += operator can also be used for string concatenation. For example, the following code appends the string " and Wow Java is fun" to the string "Welcome to Java" in message:

    message += " and Wow Java is fun";

So the new message is "Welcome to Java and Wow Java is fun".

Suppose that i = 1 and j = 2. What is the output of the following statement?

    System.out.println("i + j is " + i + j);

The output is "i + j is 12" because "i + j is " is concatenated with the value of i first, and the result is then concatenated with the value of j. To force i + j to be evaluated first, enclose i + j in parentheses, as follows:

    System.out.println("i + j is " + (i + j));
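The concatenation rules in this section can be tried out with the sketch below. The class ConcatDemo and the methods withoutParentheses and withParentheses are hypothetical names introduced only to contrast the two println statements discussed above.

```java
class ConcatDemo {
    // Hypothetical helper: + is applied left to right, so i and j are
    // appended to the string one after the other.
    static String withoutParentheses(int i, int j) {
        return "i + j is " + i + j;
    }

    // Hypothetical helper: the parentheses force the numeric addition
    // to happen before the concatenation.
    static String withParentheses(int i, int j) {
        return "i + j is " + (i + j);
    }

    public static void main(String[] args) {
        String message = "Welcome " + "to " + "Java";
        message += " and Wow Java is fun";
        System.out.println(message);            // Welcome to Java and Wow Java is fun

        System.out.println("Chapter" + 4);      // Chapter4
        System.out.println("Supplement" + 'D'); // SupplementD

        System.out.println(withoutParentheses(1, 2)); // i + j is 12
        System.out.println(withParentheses(1, 2));    // i + j is 3
    }
}
```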