I am using a regex to write an input validator for a text box where I only want alphabetical characters. I was wondering whether [A-z] and [a-zA-Z] are equivalent, or whether there are differences performance-wise.
I keep reading [a-zA-Z] on my searches and no mention of [A-z].
I am using Java's String.matches(regex).
Using character sets
For example, the regular expression "[A-Za-z]" matches any single uppercase or lowercase letter. In a character set, a hyphen indicates a range of characters; for example, [A-Z] will match any one capital letter.
The character class [a-zA-Z] matches any character from a to z or A to Z.
[A-z] will match ASCII characters in the range from A to z, while [a-zA-Z] will match ASCII characters in the range from A to Z and in the range from a to z. At first glance, this might seem equivalent -- however, if you look at this table of ASCII characters, you'll see that A-z includes several other characters. Specifically, they are [, \, ], ^, _, and ` (which you clearly don't want).
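To make the difference concrete for the String.matches case in the question, here is a minimal sketch. The class name LetterValidator and both helper methods are hypothetical names chosen for illustration, and the "+" quantifier is an assumption that the text box should contain one or more letters.

```java
class LetterValidator {
    // Hypothetical helper: true only if the input is one or more ASCII letters.
    static boolean isAlphabetic(String input) {
        return input.matches("[a-zA-Z]+");
    }

    // The looser pattern from the question, kept only for comparison.
    static boolean matchesLoosePattern(String input) {
        return input.matches("[A-z]+");
    }

    public static void main(String[] args) {
        // Sample inputs containing some of the extra characters between Z and a.
        String[] samples = {"Hello", "snake_case", "[tag]", "caret^"};
        for (String s : samples) {
            System.out.println(s + " -> [a-zA-Z]+: " + isAlphabetic(s)
                    + ", [A-z]+: " + matchesLoosePattern(s));
        }
    }
}
```

With these samples, only "Hello" should pass the strict pattern, while the loose pattern should accept all four, because the underscore, brackets, and caret fall inside the A to z range.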
When you take a look at the ASCII table, you will see the following:
A = 65
Z = 90
a = 97
z = 122
So, [A-z] will match every char from 65 to 122. This includes these characters (91 -> 96) as well:
[\]^_`
This means [A-Za-z] will match only the alphabet, without the extra characters above.
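If you would rather check this than trust the table, the gap between the two classes can be listed programmatically. This is only a sketch: the class name RangeGap and the method extraCharacters are made up for illustration, and it simply walks every char from 'A' to 'z' and keeps the ones that [A-z] accepts but [a-zA-Z] rejects.

```java
class RangeGap {
    // Collects every character that "[A-z]" accepts but "[a-zA-Z]" rejects.
    static String extraCharacters() {
        StringBuilder extras = new StringBuilder();
        for (char c = 'A'; c <= 'z'; c++) {
            String s = String.valueOf(c);
            if (s.matches("[A-z]") && !s.matches("[a-zA-Z]")) {
                extras.append(c);
            }
        }
        return extras.toString();
    }

    public static void main(String[] args) {
        // Print each extra character next to its numeric code.
        for (char c : extraCharacters().toCharArray()) {
            System.out.println((int) c + " = " + c);
        }
    }
}
```

The collected characters should be exactly the six with codes 91 through 96.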