I am searching raw Twitter snippets using R but keep running into problems with non-standard, non-alphanumeric characters such as "🏄".
I would like to remove every character that is not in [abcdefghijklmnopqrstuvwxyz0123456789] using gsub.
Can gsub be told to replace the characters that are NOT in [abcdefghijklmnopqrstuvwxyz0123456789]?
You could simply negate your pattern with [^ ...]:
x <- "abcde🏄fgh"
gsub("[^A-Za-z0-9]", "", x)
# [1] "abcdefgh"
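Please note that the POSIX class [:alnum:] is locale-dependent and can match non-ASCII letters and digits as well as the ASCII ones. That's why gsub("[^[:alnum:]]", "", x) may leave some of your special characters in place, while the explicit ranges A-Za-z0-9 do not.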
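The same negated class works on a whole character vector, since gsub is vectorized. The following is a minimal sketch, assuming the tweets are already loaded as strings; the `tweets` vector and its contents are made up for illustration. It also shows a variant that lowercases first, which matches the exact set listed in the question (a-z and 0-9 only).

```r
# Hypothetical sample data standing in for the raw tweet text.
tweets <- c("abcde\U0001F3C4fgh", "Surf's up 2024!", "caf\u00e9")

# Keep only ASCII letters and digits, as in the answer above.
# Note that spaces and punctuation are removed too.
clean <- gsub("[^A-Za-z0-9]", "", tweets)

# Variant: lowercase first, then keep only a-z and 0-9
# (the character set given in the question).
clean_lower <- gsub("[^a-z0-9]", "", tolower(tweets))

print(clean)
print(clean_lower)
```

If spaces should survive so that words stay separated, adding a space inside the brackets, as in "[^A-Za-z0-9 ]", would be one way to do it.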