Strings in terms of codepoints? Kill Java char type!

January 17, 2008

I wrote a message to dev@openjfx-compiler:
https://openjfx-compiler.dev.java.net/servlets/ReadMsg?list=dev&msgNo=1844
Here is a copy:
I think JavaFX should have a separate String class, one that is
automatically converted to Java’s String and back. Why?

Java 5 introduced the idea of code points, because a Unicode character
no longer fits into the 16-bit char type. So Java’s native string
encoding is no longer "one char per Unicode character" but UTF-16. This
means that a char is just a 16-bit code unit and is not a character.
The character (one visible sign) is a “code point”; see the method
public int codePointAt(int index) in
http://java.sun.com/javase/6/docs/api/java/lang/String.html#codePointAt(int)
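
To make the difference concrete, here is a small Java sketch (the class name CodePointDemo and its helper methods are only illustrative). It uses U+1D11E, MUSICAL SYMBOL G CLEF, as an example of a character that needs two chars:

```java
class CodePointDemo {
    // U+1D11E lies above U+FFFF, so UTF-16 stores it as a surrogate pair.
    static final String CLEF = new String(Character.toChars(0x1D11E));

    // Number of 16-bit char units in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "a" + CLEF;
        System.out.println(charCount(s));                            // 3
        System.out.println(codePointCount(s));                       // 2
        System.out.println(Integer.toHexString(s.codePointAt(1)));   // 1d11e
        System.out.println(Character.isHighSurrogate(s.charAt(1)));  // true
    }
}
```

So charAt(1) here returns only half of the character, while codePointAt(1) returns the whole code point as an int.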

So I propose a special String class for JavaFX:

package javafx.lang;
class String {
	length() : Integer; // Actually returns Java's codePointCount()
	charAt(i: Integer) : String; // Actually returns Java's code point
	... + all useful methods from Java's String
	... - all methods for code points
	... + maybe some methods to work with chars, but named so that
	      anybody understands they are auxiliary
}
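
As a rough illustration of what such a class could do underneath, here is a minimal Java sketch. The name CodePointString is hypothetical and this is not an existing JavaFX or Java class; it only shows how length() and charAt() could be mapped onto the code point methods that java.lang.String already has:

```java
// Hypothetical wrapper: indexes a Java String by code point, not by char.
final class CodePointString {
    private final String value;

    CodePointString(String value) {
        this.value = value;
    }

    // Length in code points, not in 16-bit chars.
    int length() {
        return value.codePointCount(0, value.length());
    }

    // The index-th code point, returned as a one-character string.
    String charAt(int index) {
        // Translate the code point index into a char index first.
        int start = value.offsetByCodePoints(0, index);
        int codePoint = value.codePointAt(start);
        return new String(Character.toChars(codePoint));
    }

    // Conversion back to a plain Java String.
    @Override
    public String toString() {
        return value;
    }
}
```

Note that in this sketch charAt() has to walk the string from the start, so it is linear in the index rather than constant time. A real implementation would have to decide whether that cost is acceptable.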

After the introduction of code points, existing Swing code should really
be rewritten to work in terms of code points. If you have an input
field for a username limited to 8 characters, the user expects to be
able to enter 8 real characters. But if you apply the limit of 8 to
java.lang.String.length(), the user can enter only 4 of the newer
Unicode characters from the supplementary set (above U+FFFF), because
each of them takes two chars. I think it would be good if JavaFX
users could forget about char and never need to know the difference
between length() and codePointCount().
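
The username example can be sketched in Java like this (UsernameLimit and its method names are made up for the illustration; a real Swing field would do the check in its document filter or validator):

```java
class UsernameLimit {
    static final int MAX_CHARACTERS = 8;

    // Limit applied to 16-bit chars: what String.length() gives you.
    static boolean fitsByCharCount(String name) {
        return name.length() <= MAX_CHARACTERS;
    }

    // Limit applied to code points: what the user counts as characters.
    static boolean fitsByCodePointCount(String name) {
        return name.codePointCount(0, name.length()) <= MAX_CHARACTERS;
    }

    // Builds a string of `count` copies of U+1D11E (two chars each).
    static String clefs(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.appendCodePoint(0x1D11E);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eight = clefs(8);
        System.out.println(eight.length());               // 16
        System.out.println(fitsByCharCount(eight));       // false
        System.out.println(fitsByCodePointCount(eight));  // true
        System.out.println(fitsByCharCount(clefs(4)));    // true
        System.out.println(fitsByCharCount(clefs(5)));    // false
    }
}
```

With the length() check, four such characters already fill the field; with the code point check, all eight are accepted.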

Currently I see that JavaFX only partially hides the presence of the Character type:

var ca = ClassA {
     s: "Hello!";
}

System.out.println(ca.s.length());
var c = ca.s.charAt(0);
System.out.println(c.getClass());

It prints:

6
class java.lang.Character

However, because the idea of JavaFX is to have only Integer, Number,
String and Boolean, I would expect c to be of type Integer. But this is
not so.

Also, maybe having a separate String class would allow non-nullable
string attributes, as we currently have for Integer, Number and Boolean?

Please consider!
