Tidying arrays of numbers in R data frame

Tags: r, tidyr, tidyverse
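How can I tidy the following data frame, where each row of the values column holds a bracketed, comma-separated list of numbers stored as a string?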

data.frame(a  = c(1,2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))

  a          values
1 1 [1.1, 1.2, 1.3]
2 2      [2.1, 2.2]

The result should have one row per number, with a repeated as needed:

data.frame(a  = c(1,1,1,2,2), values = c(1.1, 1.2, 1.3, 2.1, 2.2))

  a values
1 1    1.1
2 1    1.2
3 1    1.3
4 2    2.1
5 2    2.2
Sasha asked Oct 25 '25 03:10

1 Answer

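We can extract the numbers with str_extract_all, which returns a list column with one character vector per row, then unnest that column and convert the resulting strings to numbers: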

library(dplyr)
library(stringr)
library(tidyr)
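df1 %>%
    # collect every run of digits and dots into a list column
    mutate(values = str_extract_all(values, "[0-9.]+")) %>%
    # expand the list column to one row per extracted number
    unnest(values) %>%
    # convert the extracted strings to numeric
    type.convert(as.is = TRUE)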

-output

# A tibble: 5 × 2
      a values
  <int>  <dbl>
1     1    1.1
2     1    1.2
3     1    1.3
4     2    2.1
5     2    2.2
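
For readers who want to see the same extract-then-expand idea without the tidyverse, here is a minimal base R sketch. It is an illustration, not part of the original answer: the names nums and result are introduced only for this example, and the pattern is the same "[0-9.]+" used above, so it assumes non-negative numbers written without exponents.

```r
# Same example data as in the question
df1 <- data.frame(a = c(1, 2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))

# gregexpr() locates every run of digits and dots;
# regmatches() returns one character vector of matches per row
nums <- regmatches(df1$values, gregexpr("[0-9.]+", df1$values))

# Repeat each value of `a` once per extracted number, then flatten the list
result <- data.frame(
  a = rep(df1$a, lengths(nums)),
  values = as.numeric(unlist(nums))
)

print(result)
```

Here lengths(nums) gives the number of matches in each row, which is what lets rep() line up a with the flattened values.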

Another option, since each string is also a valid Python list literal, is to evaluate it with reticulate::py_eval and then unnest the resulting list column:

library(reticulate)
df1 %>%
    rowwise() %>%
    mutate(values = list(py_eval(values))) %>%
    unnest(values)

-output

# A tibble: 5 × 2
      a values
  <dbl>  <dbl>
1     1    1.1
2     1    1.2
3     1    1.3
4     2    2.1
5     2    2.2
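
The strings in this example also happen to be valid JSON arrays, so a similar parse-then-expand approach may work without a Python installation. The sketch below is an assumption-laden alternative rather than part of the original answer: it swaps in the jsonlite package (which must be installed) for reticulate, and the names parsed and result are introduced only for illustration.

```r
library(jsonlite)

# Same example data as in the question
df1 <- data.frame(a = c(1, 2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))

# Parse each string as a JSON array; a flat array of numbers
# is simplified by fromJSON() to a plain numeric vector
parsed <- lapply(df1$values, fromJSON)

# Repeat each value of `a` once per parsed number, then flatten the list
result <- data.frame(
  a = rep(df1$a, lengths(parsed)),
  values = unlist(parsed)
)

print(result)
```

This only holds while every entry is well-formed JSON; strings containing Python-only syntax would still need py_eval.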

data

df1 <- data.frame(a  = c(1,2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))
akrun answered Oct 26 '25 17:10
