I was working on a toy project and tried using some Unicode variable names to match a paper I was attempting to implement.
The following code works fine on R 3.4.3 on Windows (RStudio version 1.1.456) and R 3.5.1 on OSX:
> µ <- function(ß, n) ß * n
> µ(2, 3)
[1] 6
This code gives the following error, with α typed as ALT+224:
> α <- 2
Error: unexpected input in "\"
The file was saved as UTF-8, so this is surprising to me.
make.names is consistent with the results above:
> make.names('µ')
[1] "µ"
> make.names('α')
[1] "a"
What is the rule for non-ASCII letters? Why are µ (mu) and ß (scharfes S) OK but α (alpha) isn't?
Edit: Output of sessionInfo()
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3 yaml_2.2.0
Edit2: It seems like Sys.setlocale should be the answer, but here is what happens when I try this:
> Sys.setlocale("LC_ALL", 'en_US.UTF-8')
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
OS reports request to set locale to "en_US.UTF-8" cannot be honored
Working with Ben Bolker, we determined that the issue was that the current session was using the Windows-1252 character encoding, which covers only a small set of non-ASCII characters: it includes µ and ß but not α. This is despite the fact that RStudio saved the file as UTF-8.
Changing the locale of a running R session to a UTF-8 one does not seem to be possible, at least on Windows, where Sys.setlocale gives the warning shown in the question.
I have a partial solution. If you are given a file like this and want to run it with interactive access to the results, the following will mostly work (variable names will be translated to Windows-1252):
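A quick way to check which characters the session encoding can hold is sketched below. It assumes the iconv build behind R accepts the name "CP1252" and returns NA for input it cannot convert (the documented default); the vector names mu, eszett and alpha are only labels I made up for the example. I would expect µ and ß to come back as representable and α not, though I have not tried this on every platform.

```r
# Characters from the question, written as escapes so this file stays ASCII.
chars <- c(mu = "\u00b5", eszett = "\u00df", alpha = "\u03b1")

# iconv() returns NA (by default) for input it cannot convert to the target.
converted <- iconv(chars, from = "UTF-8", to = "CP1252")

# TRUE where the character exists in Windows-1252, FALSE otherwise.
representable <- setNames(!is.na(converted), names(chars))
print(representable)

# Show what R thinks the current session's locale encoding is.
print(l10n_info())
```

On the Windows session from the question, l10n_info() should report that the session is not UTF-8 and show the 1252 code page.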
> source('utf-8-file.r', encoding='UTF-8')
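To make that reproducible, here is a minimal sketch that writes the µ example from the question to a temporary UTF-8 file and sources it. The temporary file, the environment env and the can_run guard are my own additions for illustration. The guard assumes that a UTF-8 or Latin-1 style session (Windows-1252 counts as the latter for R) can represent µ and ß, and it skips the call otherwise.

```r
# Write the example from the question to a temporary file as raw UTF-8 bytes.
script <- tempfile(fileext = ".R")
code <- enc2utf8("\u00b5 <- function(\u00df, n) \u00df * n")
writeLines(code, script, useBytes = TRUE)

# Only try to run it if the session encoding can represent these letters.
info <- l10n_info()
can_run <- isTRUE(info[["UTF-8"]]) || isTRUE(info[["Latin-1"]])

result <- NA
if (can_run) {
  env <- new.env()
  # Declare the file as UTF-8 so source() re-encodes it for the session.
  source(script, local = env, encoding = "UTF-8")
  # Look the function up by name and call it as in the question.
  result <- get("\u00b5", envir = env)(2, 3)
}
print(result)
```

A file that uses α would still not work this way in a Windows-1252 session, since that letter has no exact equivalent to be translated to.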
I would be very excited to see a better solution, one which allows editing and running the file and entering such snippets into the console of RStudio on Windows.