I have the following data:
id <- c("case1", "case19", "case88", "case77")
vec <- c("One_20 (19)",
         "tWo_20 (290)",
         "Three_38 (399)",
         NA)
df <- data.frame(id, vec)
> df
      id            vec
1  case1    One_20 (19)
2 case19   tWo_20 (290)
3 case88 Three_38 (399)
4 case77           <NA>
I want to separate the vec column into two variables, txt and num. I would prefer to use tidyr, like this:
df |> tidyr::separate_wider_regex(vec, 
                                   c(txt = "[A-Za-z]+", num = "\\d+"),
                                   too_few = "align_start")
# A tibble: 4 × 3
  id     txt   num  
  <chr>  <chr> <chr>
1 case1  One   NA   
2 case19 tWo   NA   
3 case88 Three NA   
4 case77 NA    NA  
However, this is not what I want. I expect the following output:
      id      txt num
1  case1   One_20  19
2 case19   tWo_20 290
3 case88 Three_38 399
4 case77     <NA>  NA
I am making a mistake in the regex part. How can I correct it so that I get the expected table as output?
One way in base R, using sub() and strsplit():
cbind(df['id'], {
  l = strsplit(sub('^(.*) \\((.*)\\)$', '\\1 \\2', df$vec), ' ')
  lapply(l, `length<-`, max(lengths(l))) |>
    do.call(what = 'rbind')
  }) |> setNames(c('id', 'txt', 'num'))
      id      txt  num
1  case1   One_20   19
2 case19   tWo_20  290
3 case88 Three_38  399
4 case77     <NA> <NA>
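A shorter base R variant is sketched below, assuming utils::strcapture() (available in recent R versions) is acceptable. It is not part of the answer above; the pattern and the name parsed are illustrative choices. The proto data frame sets the column names and types, so num should come back as an integer column, and the NA input is expected to yield a row of NA values.

```r
# Rebuild the example data so the snippet is self-contained
df <- data.frame(
  id  = c("case1", "case19", "case88", "case77"),
  vec = c("One_20 (19)", "tWo_20 (290)", "Three_38 (399)", NA)
)

# Group 1: everything before " (", group 2: the digits inside the parentheses.
# proto supplies the output column names and the type of each capture group.
parsed <- strcapture(
  "^(.*) \\((\\d+)\\)$",
  df$vec,
  proto = data.frame(txt = character(), num = integer())
)

# Attach the parsed columns to the id column
res <- cbind(df["id"], parsed)
res
```

Compared with the sub()/strsplit() approach, this avoids padding the list elements by hand, at the cost of spelling out the capture groups.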
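The patterns are matched one after another from the start of the string, so "[A-Za-z]+" stops at the underscore and "\\d+" then has nothing to match. Match the whole string instead, using unnamed patterns for the pieces you want to drop. Try: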
> df |>
+     tidyr::separate_wider_regex(vec,
+         c(txt = "\\w+", "\\s+\\(", num = "\\d+", "\\)"),
+         too_few = "align_start"
+     )
# A tibble: 4 × 3
  id     txt      num  
  <chr>  <chr>    <chr>
1 case1  One_20   19
2 case19 tWo_20   290
3 case88 Three_38 399
4 case77 NA       NA
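The expected table in the question prints num like a number, while the result above has num as a character column. A minimal sketch of one way to finish the job, assuming tidyr 1.3.0 or later is installed (the name res is only illustrative):

```r
library(tidyr)

# Rebuild the example data so the snippet is self-contained
df <- data.frame(
  id  = c("case1", "case19", "case88", "case77"),
  vec = c("One_20 (19)", "tWo_20 (290)", "Three_38 (399)", NA)
)

res <- df |>
  separate_wider_regex(
    vec,
    # Named patterns become columns; unnamed ones are matched and dropped
    c(txt = "\\w+", "\\s+\\(", num = "\\d+", "\\)"),
    too_few = "align_start"
  )

# The extracted pieces are character strings; convert num explicitly
res$num <- as.integer(res$num)
res
```

The conversion is kept as a separate step so that the regex part stays identical to the call shown above.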