What is the preferred way of determining if a string contains non-Roman/non-English (e.g., ないでさ) characters?
You could use regex/grep to check for hex values of characters outside the range of printable ASCII characters:
x <- 'ないでさ'
grep("[^\x20-\x7F]", x)
#[1] 1
grep("[^\x20-\x7F]", "Normal text")
#integer(0)
If you wanted to allow non-printing ("control") characters to be considered "English", you could extend the range of the character class in the first argument to grep to start at "\x01". See ?regex for more information on using character classes. See ?Quotes for more information about how to specify characters as Unicode, hexadecimal, or octal values.
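For instance, a minimal sketch of that extended class (treating everything from "\x01" through "\x7F" as "English" is an assumption; adjust the range to taste):
# control characters such as tab and newline no longer trigger a match
grep("[^\x01-\x7F]", x)
#[1] 1
grep("[^\x01-\x7F]", "Tabs\tand newlines\n pass here")
#integer(0)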
The R.oo package has conversion functions that may be useful:
library(R.oo)
?intToChar
?charToInt
The fact that Henrik Bengtsson saw fit to include these in his package suggests to me that there is no handy method for this in base/default R. He's a long-time useR/guRu.
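That said, here is a sketch of a comparable check using base R's utf8ToInt(), which converts a UTF-8 string to a vector of integer code points (the cutoff of 127, the top of the ASCII range, is an assumption chosen to match the examples above):
# any code point above 127 means the string is not pure ASCII
any(utf8ToInt(x) > 127)
#[1] TRUE
any(utf8ToInt("Normal text") > 127)
#[1] FALSE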
Seeing the other answer prompted this effort, which seems straightforward:
is.na(iconv(c(x, "OrdinaryASCII"), "", "ASCII"))
#[1]  TRUE FALSE
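Wrapped up as a small vectorised helper (the function name is just an illustration):
# TRUE for elements that cannot be converted to ASCII
hasNonASCII <- function(s) is.na(iconv(s, "", "ASCII"))
hasNonASCII(c(x, "OrdinaryASCII"))
#[1]  TRUE FALSE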