How can I tidy up the following data frame?
data.frame(a = c(1,2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))
a values
1 [1.1, 1.2, 1.3]
2 [2.1, 2.2]
The result should be
data.frame(a = c(1,1,1,2,2), values = c(1.1, 1.2, 1.3, 2.1, 2.2))
a values
1 1.1
1 1.2
1 1.3
2 2.1
2 2.2
We can extract the numbers with str_extract_all, which returns a list column, then unnest that column and convert the extracted strings to numeric with type.convert
library(dplyr)
library(stringr)
library(tidyr)
df1 %>%
  mutate(values = str_extract_all(values, "[0-9.]+")) %>%
  unnest(values) %>%
  type.convert(as.is = TRUE)
-output
# A tibble: 5 × 2
a values
<int> <dbl>
1 1 1.1
2 1 1.2
3 1 1.3
4 2 2.1
5 2 2.2
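To make the intermediate step visible, here is a minimal sketch of the same idea without the pipeline. It assumes only stringr is installed; the names `values`, `extracted`, `nums`, and `result` are illustrative and not part of the original answer.

```r
library(stringr)

# The same strings as in the `values` column of df1
values <- c("[1.1, 1.2, 1.3]", "[2.1, 2.2]")

# str_extract_all() returns a list with one character vector per input string
extracted <- str_extract_all(values, "[0-9.]+")

# Convert each vector of matches from character to numeric
nums <- lapply(extracted, as.numeric)

# Repeat each `a` value once per extracted number, then flatten the list
a <- c(1, 2)
result <- data.frame(a = rep(a, lengths(nums)), values = unlist(nums))
```

This is roughly what unnest does with the list column: each row is repeated once per element of its list entry. Note that the pattern "[0-9.]+" would not capture a leading minus sign, so it is only suitable when all values are non-negative.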
Another option is to evaluate each string as a Python object with reticulate::py_eval and then unnest the resulting list column
library(reticulate)
df1 %>%
  rowwise() %>%
  mutate(values = list(py_eval(values))) %>%
  unnest(values)
-output
# A tibble: 5 × 2
a values
<dbl> <dbl>
1 1 1.1
2 1 1.2
3 1 1.3
4 2 2.1
5 2 2.2
The examples above use the following data
df1 <- data.frame(a = c(1,2), values = c("[1.1, 1.2, 1.3]", "[2.1, 2.2]"))