
Format String in Java regarding its length in bytes

Tags:

java

formatter

Let's say I have a String containing non-ASCII characters in Java, and I want to use String.format() so that the formatted String has a minimum width measured by the string's length in bytes rather than in characters.

String s = "æøå";
String.format(l, "%" + 10 + "s" , s);

This results in a string with 7 leading spaces, because the width of 10 is counted in characters and the string is 3 characters long.

But what I want is only 4 leading spaces, since the original string is 6 bytes in size.

This seems to be a common requirement, so I would like to ask whether there is an existing class that can achieve this, or whether I should implement the Formattable interface myself.

Quincy asked Apr 02 '26 16:04


2 Answers

A string doesn't have a number of bytes - it has a number of characters. The number of bytes it takes to represent a string depends on the encoding you use. I don't know of anything built-in to do what you want in terms of the padding (I don't think it is that common a requirement). You can ask a CharsetEncoder for the maximum and average number of bytes per character, but I don't see any way of getting the number of bytes for a particular string without basically doing the encoding:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

Charset cs = Charset.forName("UTF-8");
ByteBuffer buffer = cs.encode("foobar");
int lengthInBytes = buffer.remaining();
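
To make the difference between the encoder's estimates and the real length concrete, here is a minimal sketch. The class and method names (ByteLengthDemo, exactByteLength, maxByteLength) are made up for illustration, and the sample string is the question's "æøå" written with Unicode escapes so the source compiles regardless of file encoding. It assumes UTF-8; other charsets will give different numbers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
final class ByteLengthDemo {

    // Exact encoded length: the only way to get it is to actually encode.
    static int exactByteLength(String s, Charset cs) {
        ByteBuffer buffer = cs.encode(s);
        return buffer.remaining();
    }

    // Upper bound only: worst-case bytes per char times the number of chars.
    static int maxByteLength(String s, Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        return (int) Math.ceil(encoder.maxBytesPerChar() * s.length());
    }

    public static void main(String[] args) {
        String s = "\u00e6\u00f8\u00e5"; // the question's three-character sample
        System.out.println("chars:       " + s.length());
        System.out.println("exact bytes: " + exactByteLength(s, StandardCharsets.UTF_8));
        System.out.println("upper bound: " + maxByteLength(s, StandardCharsets.UTF_8));
    }
}
```

The upper bound is cheap but usually too large to pad with, which is why the exact count has to come from encoding.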

If you're going to encode the string anyway, you might want to just perform the encoding, work out how much padding is required, then write the encoded padding out, then write the already-encoded text. It really depends on what you're doing with the data.
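
As a rough sketch of that approach, assuming UTF-8 output and plain space padding: encode the text once, work out how many encoded spaces still fit, then write the padding followed by the text. BytePadder and padLeftToByteWidth are invented names for this example, and charsets that emit a byte order mark on every getBytes call (such as UTF-16) would need different handling.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
final class BytePadder {

    // Left-pads s with spaces so the encoded result is widthInBytes long.
    // Text that is already wider than the target is returned unpadded,
    // mirroring how String.format treats a minimum width.
    static byte[] padLeftToByteWidth(String s, int widthInBytes, Charset cs) {
        byte[] text = s.getBytes(cs);    // encode the text exactly once
        byte[] space = " ".getBytes(cs); // one encoded padding unit
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Write whole encoded spaces while they still fit in the target width.
        for (int used = text.length; used + space.length <= widthInBytes; used += space.length) {
            out.write(space, 0, space.length);
        }
        // Then write the already-encoded text.
        out.write(text, 0, text.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = "\u00e6\u00f8\u00e5"; // the question's sample, 6 bytes in UTF-8
        byte[] padded = padLeftToByteWidth(s, 10, StandardCharsets.UTF_8);
        System.out.println(padded.length + " bytes: ["
                + new String(padded, StandardCharsets.UTF_8) + "]");
    }
}
```

Returning a byte[] rather than a String keeps the result tied to the charset it was measured in; decoding it and re-encoding with a different charset would change the width again.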

Jon Skeet answered Apr 04 '26 06:04


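String s = "æøå";
// Format widths count chars, not bytes, so shrink the width by the extra bytes.
int size = s.getBytes(StandardCharsets.UTF_8).length; // java.nio.charset.StandardCharsets
int width = Math.max(1, 10 - (size - s.length()));    // 10 - (6 - 3) = 7, i.e. 4 leading spaces
String.format("%" + width + "s", s);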

stacker answered Apr 04 '26 05:04