Given an input vector (iv):
iv <- c(.10,.15,"hello","."," . ",". ")
I'm using:
out <- sub(regexp,NA,iv)
I want an output vector like this:
.10,.15,"hello",NA,NA,NA
but I don't know how to form the regexp to get what I need. Thanks in advance.
What you're looking for is a negative lookahead in regular expressions. You want to check for a . that is not followed by a digit (0-9) and replace any element containing one with NA. If this logic is what you want, then it can be implemented in one line as follows:
gsub("\\.(?![0-9])", NA, iv, perl = TRUE)
# [1] "0.1" "0.15" "hello" NA NA NA
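Logic: search for a dot that is not followed by a digit. Because the replacement is NA, every element that contains such a dot becomes NA as a whole, not just the dot itself. Lookaheads need perl = TRUE. Note that c() coerces the numbers to character, which is why .10 and .15 print as "0.1" and "0.15".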
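One caveat worth checking before relying on the lookahead: it flags any dot that is not followed by a digit, so a sentence-ending period matches too. A small sketch using grepl() to show which elements would be hit (the vector extra is made up for illustration and is not from the question):

```r
iv <- c(.10, .15, "hello", ".", " . ", ". ")

# A dot that is NOT followed by a digit (negative lookahead, needs perl = TRUE)
dot_no_digit <- "\\.(?![0-9])"

hits <- grepl(dot_no_digit, iv, perl = TRUE)
out <- iv
out[hits] <- NA
print(out)

# Hypothetical extra strings: trailing or non-numeric dots are flagged as well
extra <- c("3.14", "The end.", "v1.x")
extra_hits <- grepl(dot_no_digit, extra, perl = TRUE)
print(extra_hits)
```

If strings like "The end." can occur in your data and should be kept, the pattern would need to be stricter than this.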
If you want to replace the values with NA, you will want to use some form of the assignment operator.
A simple approach:
iv[gsub(" ", "", iv)=="."] <- NA
Quick explanation:
If the strings to replace were all the same (i.e., "."), then you could simply call
iv[iv == "."] <- NA
However, in order to catch all the extra spaces, you can either search for the myriad "." combinations, making sure to exclude .10, .15, etc., or you can remove all the spaces first, which leaves you with the simpler situation where you can use ==.
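To see why this works, it can help to run the two steps separately. A small sketch, where stripped and is_dot are just illustrative names:

```r
iv <- c(.10, .15, "hello", ".", " . ", ". ")

# Step 1: drop every space, so ".", " . " and ". " all collapse to "."
stripped <- gsub(" ", "", iv)
print(stripped)

# Step 2: a plain equality test now finds the lone dots
is_dot <- stripped == "."
print(is_dot)

# Step 3: use the logical vector to index the ORIGINAL vector
iv[is_dot] <- NA
print(iv)
```

Note that gsub(" ", "", iv) only removes literal spaces. If your data may contain tabs or other whitespace, that is an extra case this approach does not cover as written.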
Incidentally, if you want to search for a literal period with a regex in R, you need to escape it twice: once for the regex engine (\.), where an unescaped . matches any character, and once more for the R string literal, giving "\\.".
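A quick way to convince yourself of the escaping rules, using a made-up test vector x:

```r
x <- c(".", "a", "1.5")

# Unescaped: "." is a regex wildcard, so every non-empty string matches
any_char <- grepl(".", x)

# Escaped for the regex, then escaped again for the R string literal
literal_dot <- grepl("\\.", x)

# Alternative: fixed = TRUE turns off regex interpretation entirely
fixed_dot <- grepl(".", x, fixed = TRUE)

# The R string "\\." is two characters long: a backslash and a period
pattern_length <- nchar("\\.")

print(any_char)
print(literal_dot)
print(fixed_dot)
print(pattern_length)
```

fixed = TRUE is often the simpler choice when the pattern contains no regex syntax at all, though it cannot be combined with lookaheads.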
Edit: Note that the assignment above does not permanently remove the spaces from iv. Take a look at gsub(" ", "", iv) == ".": it returns a logical vector (TRUE/FALSE), which is in turn used to index iv. Apart from the elements set to NA, iv remains unchanged.
Edit 2: If you want the changes to be saved to a different vector, leaving iv itself untouched, you can use the following:
out <- iv
out[gsub(" ", "", iv)=="."] <- NA
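If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name na_if_lone_dot is made up, and it swaps the space-stripping for an anchored regex so that tabs and other whitespace around the dot are handled too, which is an assumption beyond what the question asked for.

```r
# Hypothetical helper: return a copy of x with "lone dot" elements set to NA.
# A lone dot is a single period with optional whitespace on either side.
na_if_lone_dot <- function(x) {
  out <- x
  out[grepl("^[[:space:]]*\\.[[:space:]]*$", x)] <- NA
  out
}

iv <- c(.10, .15, "hello", ".", " . ", ". ", "\t.")
out <- na_if_lone_dot(iv)

print(out)
print(iv)  # the input vector is left as it was
```

Because the pattern is anchored with ^ and $, strings such as "hello." or "0.15" are left alone.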