
How do I translate strings using Java?

Tags:

java

string

I want a translation routine that allows me to translate any character to any other character or set of characters efficiently. The obvious way seems to be to use the value of a character from the input string as an index into a 256-entry translation array.

Given an initial array where each entry is set to its value, e.g. hex'37' would appear in the 56th entry (allowing 00 to be the first), the user could then substitute any characters required in the translate string.

Example 1: I want to map a string with "A" for alphabetic characters, "N" for numeric characters, "B" for space characters and "X" for anything else. Thus "SL5 3QW" becomes "AANBNAA".

Example 2: I want to translate some characters, such as "œ" (x'9D') to "oe" (x'6F65'), "ß" to "ss", "å" to "a", etc.

How do I get a numeric value from a character in the input string to use it as an index into the translate array?

It's easy with function CODE in Excel and straightforward in IBM assembler, but I can't track down a method in Java.

Steve asked Sep 14 '25 14:09



1 Answer

This is a bit off topic, but if you want to do a comprehensive job of character translation, you cannot simply use String.charAt(int). Unicode code points above 65535 (U+FFFF) are represented in a Java String as two consecutive char values, a surrogate pair, so charAt would hand you each half separately.
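To make the two-char point concrete, here is a small sketch. The class name SurrogateDemo and the helper describe are made up for illustration, and U+1D11E (MUSICAL SYMBOL G CLEF) is just one convenient code point above 65535.

```java
class SurrogateDemo {
    // Hypothetical helper: reports how many char values and how many
    // code points a string contains.
    static String describe(String s) {
        return "chars=" + s.length()
                + " codePoints=" + s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Build a one-code-point string from U+1D11E.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println(describe(clef));                           // chars=2 codePoints=1
        System.out.println(Integer.toHexString(clef.charAt(0)));      // d834 (high surrogate only)
        System.out.println(Integer.toHexString(clef.codePointAt(0))); // 1d11e (the full code point)
    }
}
```

Note that both charAt and codePointAt give you a value usable as a number: a char widens to int implicitly, and codePointAt returns an int directly.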

The clean way to deal with this is to use String.codePointAt(int) to extract each code point, and String.offsetByCodePoints(int, int) to step through the code point positions.
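A minimal sketch of that loop, applied to the two examples from the question. The class CodePointTranslator, its TABLE map and the methods translate and classify are names invented here, and using a HashMap keyed by code point instead of a 256-entry array is my assumption, chosen because code points can run far beyond 255.

```java
import java.util.HashMap;
import java.util.Map;

class CodePointTranslator {
    // Hypothetical substitution table keyed by code point.
    // The three entries are the replacements mentioned in the question.
    private static final Map<Integer, String> TABLE = new HashMap<>();
    static {
        TABLE.put(0x0153, "oe"); // U+0153 LATIN SMALL LIGATURE OE
        TABLE.put(0x00DF, "ss"); // U+00DF LATIN SMALL LETTER SHARP S
        TABLE.put(0x00E5, "a");  // U+00E5 LATIN SMALL LETTER A WITH RING ABOVE
    }

    // Replaces every code point found in TABLE; all others are copied unchanged.
    static String translate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);      // numeric value of the code point at i
            String replacement = TABLE.get(cp);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
            i = input.offsetByCodePoints(i, 1); // advance by one code point (1 or 2 chars)
        }
        return out.toString();
    }

    // Maps letters to 'A', digits to 'N', whitespace to 'B', anything else to 'X'.
    static String classify(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            if (Character.isLetter(cp)) {
                out.append('A');
            } else if (Character.isDigit(cp)) {
                out.append('N');
            } else if (Character.isWhitespace(cp)) {
                out.append('B');
            } else {
                out.append('X');
            }
            i = input.offsetByCodePoints(i, 1);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("stra\u00DFe")); // strasse
        System.out.println(classify("SL5 3QW"));      // AANBNAA
    }
}
```

If your input is guaranteed to stay within a single-byte range, an int value from codePointAt can index a plain String[256] in the same way; the map simply avoids having to bound the input.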

Stephen C answered Sep 17 '25 04:09
