I need a function
function getCharType(c)
  local i = string.byte(c) -- works only for 1 byte chars
  if (i >= 48) and (i <= 57) then return 1 end -- '0'..'9'
  if (i >= 97) and (i <= 122) then return 2 end -- 'a'..'z'
  return 0
end
which should return
2 - if c is a letter
1 - if c is a digit
0 - if c is a symbol (anything else)
c itself will already be a lowercase character: charType = getCharType(string.lower(Character)). Support for Unicode characters would be welcome.
With the code above, getCharType("ö") returns 0.
To find out whether a non-ASCII character is an uppercase or lowercase letter or a number, you need Unicode data. Module:Unicode data on Wikipedia has a function like this that uses Module:Unicode data/category (data for the General Category of Unicode characters).
Here's an adaptation of the lookup_category function from Module:Unicode data. I haven't included the Unicode data (Module:Unicode data/category); you will have to copy it from the link above.
local category_data -- set this variable to the table in Module:Unicode data/category above
local floor = math.floor

local function binary_range_search(code_point, ranges)
  local low, mid, high
  low, high = 1, #ranges
  while low <= high do
    mid = floor((low + high) / 2)
    local range = ranges[mid]
    if code_point < range[1] then
      high = mid - 1
    elseif code_point <= range[2] then
      return range
    else
      low = mid + 1
    end
  end
  return nil
end

function get_category(code_point)
  if category_data.singles[code_point] then
    return category_data.singles[code_point]
  else
    local range = binary_range_search(code_point, category_data.ranges)
    return range and range[3] or "Cn"
  end
end
The function get_category takes a code point (a number) and returns the name of the General Category. I guess the categories you are interested in are Nd (number, decimal digit) and the categories that begin with L (letter).
You will need a function that converts a character to a code point. If the file is encoded in UTF-8 and you are using Lua 5.3, you can use the utf8.codepoint function: get_category(utf8.codepoint('ö')) will result in 'Ll'. You can convert category codes to the number value that your function above uses:
function category_to_number(category)
  if category == "Nd" then
    return 1
  elseif category:sub(1, 1) == "L" then
    return 2
  else
    return 0
  end
end
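Putting the pieces together might look roughly like the sketch below. It is not the Wikipedia module: sample_ranges and get_category_stub are hypothetical stand-ins (a tiny hand-picked table and a linear scan) so that the example runs on its own. In real use you would call get_category backed by the full Module:Unicode data/category table instead. It assumes Lua 5.3 or later and a UTF-8 encoded source file.

```lua
-- Hypothetical stand-in data: a few {first, last, category} ranges only.
-- The real Module:Unicode data/category table is far larger.
local sample_ranges = {
  { 0x30, 0x39, "Nd" }, -- ASCII digits 0-9
  { 0x41, 0x5A, "Lu" }, -- A-Z
  { 0x61, 0x7A, "Ll" }, -- a-z
  { 0xDF, 0xF6, "Ll" }, -- ß through ö
}

-- Stand-in for get_category: linear scan over the sample ranges.
-- Anything not listed falls back to "Cn", as in the original lookup.
local function get_category_stub(code_point)
  for _, range in ipairs(sample_ranges) do
    if code_point >= range[1] and code_point <= range[2] then
      return range[3]
    end
  end
  return "Cn"
end

-- Map a General Category name to the 0/1/2 scheme from the question.
local function category_to_number(category)
  if category == "Nd" then
    return 1
  elseif category:sub(1, 1) == "L" then
    return 2
  else
    return 0
  end
end

-- c is a string holding one UTF-8 encoded character.
function getCharType(c)
  return category_to_number(get_category_stub(utf8.codepoint(c)))
end

print(getCharType("ö"), getCharType("7"), getCharType("!")) -- 2  1  0
```

Note that the stub reports "Cn" for "!" only because the sample table does not list punctuation; the result is still 0 either way, since any category other than Nd or L* maps to 0.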
This works only with ASCII characters (not Unicode):
function getCharType(c)
  return #c:rep(3):match(".%w?%a?") - 1
end
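To see why this one-liner works, it may help to trace it: c is repeated three times, "." always consumes the first copy, "%w?" consumes the second copy only if c is alphanumeric, and "%a?" consumes the third copy only if c is a letter. The match is therefore 3, 2, or 1 characters long, and subtracting 1 gives 2, 1, or 0. A small check, assuming the default "C" locale (the %w and %a classes depend on the current locale):

```lua
function getCharType(c)
  -- "aaa" matches fully (3), "77" stops before %a (2), "!" stops after "." (1)
  return #c:rep(3):match(".%w?%a?") - 1
end

for _, c in ipairs({ "a", "7", "!" }) do
  print(c, getCharType(c))
end
-- prints (tab-separated): a 2, then 7 1, then ! 0
```

Uppercase ASCII letters also return 2 here, since %a matches both cases.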