I am in R. I want to extract just the numbers from df1.
I have for example:
df1 <- data.frame(
  column1 = c("Any[12, 15, 20]", "Any[22, 23, 30]"),
  column2 = c("Any[4, 17]", "Any[]"),
  stringsAsFactors = FALSE
)
And I want a new df that takes the integers within the brackets, multiplies them by the row number, and keeps the column name each value came from.
e.g. new_df could look like
Time | Channel |
---|---|
12 | column1 |
15 | column1 |
20 | column1 |
44 | column1 |
46 | column1 |
60 | column1 |
8 | column2 |
34 | column2 |
I do not need to preserve any "NA" values, e.g. when Any[] is empty. Does anyone know whether this is possible? I have enormous amounts of data in this format, so I cannot do much of it manually. Cheers!
I already tried:
new_df$Time <- as.integer(df1$column1)
and that just gave blanks.
I also tried:
new_df$Time <- str_extract_all(new_df$Time, "\\d+") %>%
  lapply(function(x) as.integer(x)) %>%
  sapply(function(x) ifelse(length(x) > 0, x, NA))
which only then returned the first integer within each bracket. e.g.
Time | Channel |
---|---|
12 | column1 |
44 | column1 |
8 | column2 |
This should work. Note that parse_number() will issue a warning for entries that contain no number (such as Any[]), for which it returns NA. You could wrap the call in suppressWarnings() to silence it.
library(dplyr)
library(tidyr)
library(readr)
df1 |>
  mutate(rn = row_number()) |>
  pivot_longer(-rn, names_to = "channel", values_to = "time") |>
  separate_longer_delim(time, delim = ",") |>
  mutate(time = parse_number(time) * rn) |>
  arrange(channel, rn) |>
  select(-rn) |>
  filter(!is.na(time))
# # A tibble: 8 × 2
# channel time
# <chr> <dbl>
# 1 column1 12
# 2 column1 15
# 3 column1 20
# 4 column1 44
# 5 column1 46
# 6 column1 60
# 7 column2 4
# 8 column2 17
# Warning message:
# There was 1 warning in `mutate()`.
# ℹ In argument: `time = parse_number(time)`.
# Caused by warning:
# ! 1 parsing failure.
# row col expected actual
# 9 -- a number Any[]
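If you would rather not trigger the warning at all, one possible variation is to pull the digits out with stringr::str_extract_all() (the function you already tried) and let tidyr::unnest() drop the empty entries. This is only a sketch: it assumes the values are non-negative integers, since the pattern "\\d+" ignores minus signs and decimal points, and the names rn, raw and new_df are just placeholders I picked.

```r
library(dplyr)
library(tidyr)
library(stringr)

df1 <- data.frame(
  column1 = c("Any[12, 15, 20]", "Any[22, 23, 30]"),
  column2 = c("Any[4, 17]", "Any[]"),
  stringsAsFactors = FALSE
)

new_df <- df1 |>
  # remember the original row number before reshaping
  mutate(rn = row_number()) |>
  # one row per (row number, column name) pair
  pivot_longer(-rn, names_to = "Channel", values_to = "raw") |>
  # list column: one character vector of digit runs per cell,
  # of length zero for "Any[]"
  mutate(Time = str_extract_all(raw, "\\d+")) |>
  # one row per extracted number; rows with an empty vector are
  # dropped because keep_empty defaults to FALSE
  unnest(Time) |>
  # assumption: non-negative integers only
  mutate(Time = as.integer(Time) * rn) |>
  arrange(Channel, rn) |>
  select(Time, Channel)
```

On the example data this should give the same eight rows as above, with no NA values to filter out and no parsing warning, because the empty Any[] entry never reaches the integer conversion.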