Can an empty Java string be created from a non-empty UTF-8 byte array?

Tags: java, string, utf-8

I'm trying to debug something, and I'm wondering whether the following code could ever return true:

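public boolean impossible(byte[] myBytes) throws java.io.UnsupportedEncodingException {
  // An empty input trivially yields an empty string, so rule it out first.
  if (myBytes.length == 0)
    return false;
  String string = new String(myBytes, "UTF-8");
  // Can this ever be true for a non-empty array?
  return string.length() == 0;
}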

Is there some value I can pass in that will make it return true? I've tried passing in just the first byte of a two-byte sequence, but it still produces a single-character string.

To clarify, this happened on a PowerPC chip, in Java 1.4 code compiled through GCJ to a native binary executable. This basically means that most bets are off. I'm mostly wondering whether Java's 'normal' behaviour or the Java spec makes any promises here.

Steve Armstrong asked Oct 18 '25 15:10



1 Answer

According to the Javadoc for java.lang.String, the behavior of new String(byte[], "UTF-8") is unspecified when the byte array contains bytes that are not valid in the given charset. If you want more predictability in your resulting string, use http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/CharsetDecoder.html.
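As a rough sketch of what that could look like: the class below configures a CharsetDecoder to report malformed input instead of silently substituting or dropping it, so a bad byte sequence surfaces as an exception rather than as a surprising string. The class name StrictUtf8 and its two methods are made up for illustration. Only the java.nio.charset calls are real API, and they have been available since Java 1.4. How a GCJ-compiled binary behaves may still differ, so treat this as a starting point for debugging, not a guarantee.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, not part of the JDK.
public class StrictUtf8 {

    // Decodes the bytes as UTF-8 and throws on malformed or unmappable
    // input instead of substituting a replacement character.
    public static String decode(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Convenience wrapper: true if the bytes decode cleanly as UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            decode(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

With this in place, the lone first byte of a two-byte sequence from the question (for example 0xC3 on its own) should be rejected with a CharacterCodingException, which makes it easier to tell whether the input or the runtime is at fault.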

Trey answered Oct 21 '25 05:10



