
Using if/else statement to insert a decimal for a column based on starting letter and string length of the row using R

I have a data frame "df" and want to apply if/else conditions to insert a decimal point into every value of column "A":

A         B
E0505   123
890      43
4505     56 

Rules to apply:

  1. If the code starts with "E" and its length is > 4: insert the decimal point between characters 4 and 5.
  2. If the code doesn't start with "E" and its length is > 3: insert the decimal point between characters 3 and 4.
  3. If the length of the code is <= 3: return the code unchanged.

Final output:

A          B
E050.5   123
890       43
450.5     56

I have tried this (str_length is from the stringr package), but I am not sure how to include the condition on whether the value starts with E:

ifelse(str_length(df$A)>3, as.character(paste0(substring(df$A, 1, 3),".", substring(df$A, 4))), as.character(df$A))
thecoder Avatar asked Dec 07 '25 05:12



1 Answer

You can use sub with a regular expression:

df$A <- sub("((?:^E.|^[^E]).{2})(.+)", "\\1.\\2", df$A)

df
#       A   B
#1 E050.5 123
#2    890  43
#3  450.5  56

The pattern ((?:^E.|^[^E]).{2})(.+) matches two kinds of strings:

  • case 1: the string starts with E followed by 4 or more characters, in which case the first 4 characters and the rest are captured as two separate groups and . is inserted between them;
  • case 2: the string does not start with E but has 4 or more characters, in which case the first 3 characters and the rest are captured as two separate groups and . is inserted between them.

Strings that start with E and have fewer than 5 characters in total, or that do not start with E and have fewer than 4 characters in total, are not matched and are left unmodified.
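
To make the matched and unmatched cases concrete, here is a small check of the same sub call on a few boundary values. The vector name codes and the values "E050" and "E12345" are made up for illustration and are not from the question; the other three values are the question's column A.

```r
# Hypothetical test values: the three codes from the question plus two
# boundary cases ("E050" has only 4 characters, "E12345" has 6).
codes <- c("E0505", "E050", "890", "4505", "E12345")

# Same pattern and replacement as in the answer above.
result <- sub("((?:^E.|^[^E]).{2})(.+)", "\\1.\\2", codes)

print(result)
# "E050.5" "E050" "890" "450.5" "E123.45"
```

"E050" and "890" are too short for the trailing (.+) group to match anything, so sub returns them unchanged.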


To ignore the case of the leading E: df$A <- sub("((?:^[Ee].|^[^Ee]).{2})(.+)", "\\1.\\2", df$A)
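
If you would rather keep the explicit if/else structure from the question, the three rules can also be written with base R's ifelse, startsWith, nchar and substring. This is only a sketch: the function name insert_decimal is hypothetical, and it assumes that codes starting with "E" that have exactly 4 characters should be left unchanged (the rules do not cover that case; the regular expression above also leaves them alone).

```r
# Hypothetical helper implementing the three rules with ifelse.
insert_decimal <- function(x) {
  x <- as.character(x)
  # Split position: after character 4 for codes starting with "E",
  # after character 3 otherwise.
  pos <- ifelse(startsWith(x, "E"), 4L, 3L)
  # Insert "." only when the code is longer than the split position;
  # shorter codes are returned as they are.
  ifelse(nchar(x) > pos,
         paste0(substring(x, 1, pos), ".", substring(x, pos + 1)),
         x)
}

# Rebuild the example data frame from the question and apply the helper.
df <- data.frame(A = c("E0505", "890", "4505"),
                 B = c(123, 43, 56),
                 stringsAsFactors = FALSE)
df$A <- insert_decimal(df$A)
print(df)
```

This version is longer than the sub call, but each rule maps to a visible condition, which can be easier to adjust if the rules change.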

Psidom Avatar answered Dec 08 '25 18:12
