Separating an alphanumeric string using tidyr separate_wider_regex

Question

I have the following data:

id <- c("case1", "case19", "case88", "case77")
vec <- c("One_20 (19)",
         "tWo_20 (290)",
         "Three_38 (399)",
         NA)

df <- data.frame(id, vec)

> df
      id            vec
1  case1    One_20 (19)
2 case19   tWo_20 (290)
3 case88 Three_38 (399)
4 case77           <NA>

I want to separate the vec column into two variables, txt and num. I would prefer to use tidyr, like this:

df |> tidyr::separate_wider_regex(vec, 
                                   c(txt = "[A-Za-z]+", num = "\\d+"),
                                   too_few = "align_start")
# A tibble: 4 × 3
  id     txt   num  
  <chr>  <chr> <chr>
1 case1  One   NA   
2 case19 tWo   NA   
3 case88 Three NA   
4 case77 NA    NA

However, this is not what I want. I expect the following:

      id      txt num
1  case1   One_20  19
2 case19   tWo_20 290
3 case88 Three_38 399
4 case77     <NA>  NA

I am making mistakes in the regex part. How can I correct them so that I get the expected table as output?

Friede · Accepted Answer

A way in base R using sub():

cbind(df['id'], {
  l = strsplit(sub('^(.*) \\((.*)\\)$', '\\1 \\2', df$vec), ' ')
  lapply(l, `length<-`, max(lengths(l))) |>
    do.call(what = 'rbind')
  }) |> setNames(c('id', 'txt', 'num'))

      id      txt  num
1  case1   One_20   19
2 case19   tWo_20  290
3 case88 Three_38  399
4 case77     <NA> <NA>
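
The expected table in the question shows num as a number rather than a string. As a hedged alternative sketch (not part of the answer above), base R's utils::strcapture() can apply the same kind of pattern and convert the captured groups to the column types given by a prototype data frame. The names parts and out are illustrative only.

```r
df <- data.frame(
  id  = c("case1", "case19", "case88", "case77"),
  vec = c("One_20 (19)", "tWo_20 (290)", "Three_38 (399)", NA)
)

# Two capture groups: the text before " (", then the digits inside "(...)".
# The prototype data frame names the output columns and sets their types.
parts <- strcapture(
  "^(.*) \\((\\d+)\\)$",
  df$vec,
  proto = data.frame(txt = character(), num = integer())
)

# Rows that do not match (including NA input) come back as NA in every column.
out <- cbind(df["id"], parts)
print(out)
```

Here num is an integer column, and the NA row for case77 stays NA in both txt and num.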

ThomasIsCoding · Answer

Try

> library(tidyr)
> df %>%
+     separate_wider_regex(vec,
+         c(txt = "\\w+", "\\s+\\(", num = "\\d+", "\\)"),
+         too_few = "align_start"
+     )
# A tibble: 4 × 3
  id     txt      num  
  <chr>  <chr>    <chr>
1 case1  One_20   19
2 case19 tWo_20   290
3 case88 Three_38 399
4 case77 NA       NA
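
For readers wondering why the pattern in the question stopped at "One": [A-Za-z]+ does not match the underscore or the digits, so the next piece, \d+, had nothing to match at "_20 (19)". The sketch below is the same call as above, written out self-contained with one comment per pattern piece. The name res is illustrative only.

```r
library(tidyr)

df <- data.frame(
  id  = c("case1", "case19", "case88", "case77"),
  vec = c("One_20 (19)", "tWo_20 (290)", "Three_38 (399)", NA)
)

res <- separate_wider_regex(
  df, vec,
  patterns = c(
    txt = "\\w+",   # letters, digits and underscore, e.g. "One_20"
    "\\s+\\(",      # unnamed piece: whitespace and "(" are matched, then dropped
    num = "\\d+",   # the digits inside the parentheses
    "\\)"           # unnamed piece: the closing ")" is matched, then dropped
  ),
  too_few = "align_start"
)
print(res)
```

Named pieces become columns and unnamed pieces are discarded, which is how the parentheses are removed without a separate cleaning step. Both new columns are character; as.integer(res$num) would convert num if a numeric column is needed.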

Tags: string, regex, dataframe, r, tidyr

JontroPothon
