Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Separating a single column into 3 columns, keeping original column (R)

Tags:

r

I have a unique character column which is in "DD HH:MM" I am trying to separate the column into 3 new numeric columns. My issue with the separate function is that is replaces the original column and I can't figure out how to pass multiple criteria through SEP =

Sample of an observation in "DD HH:MM" format

Please note that not every observation has "DD"

My current code looks like this:

separate(numeric_report,Planned.Duration..dd.hh.mm., into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"),sep = ":")

Any advice would be greatly appreciated!

like image 992
Scott217 Avatar asked Oct 14 '25 15:10

Scott217


2 Answers

You can use remove = FALSE to stop from dropping original columns and you can use pass multiple sep argument using |.

tidyr::separate(data, Planned.Duration..dd.hh.mm., 
                 into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"), 
                 sep = ":|\\s", remove = FALSE, fill = "left", convert = TRUE)


#      Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1                        1:03           NA             1               3
#2                     2 00:38            2             0              38
#3                        0:49           NA             0              49

data

data <- data.frame(Planned.Duration..dd.hh.mm. = 
         c('1:03', '2 00:38','0:49'), stringsAsFactors = FALSE)
like image 52
Ronak Shah Avatar answered Oct 17 '25 07:10

Ronak Shah


We can use separate

library(dplyr)
library(tidyr)
data %>% 
    separate( Planned.Duration..dd.hh.mm., 
             into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"), 
             sep = "[: ]+", remove = FALSE, fill = "left", convert = TRUE)

#  Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1                        1:03           NA             1               3
#2                     2 00:38            2             0              38
#3                        0:49           NA             0              49

We can also use extract

data %>% 
    tidyr::extract(Planned.Duration..dd.hh.mm., 
       into = c("Planned Days" ,"Planned Hours" , "Planned Minutes"), 
            "^(\\d+)? ?(\\d+):(\\d+)$", remove = FALSE, convert = TRUE)

#   Planned.Duration..dd.hh.mm. Planned Days Planned Hours Planned Minutes
#1                        1:03           NA             1               3
#2                     2 00:38            2             0              38
#3                        0:49           NA             0              49

data

data <- data.frame(Planned.Duration..dd.hh.mm. = 
         c('1:03', '2 00:38','0:49'), stringsAsFactors = FALSE)
like image 27
akrun Avatar answered Oct 17 '25 08:10

akrun



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!