So i have a time series with the following format:
| Date(numeric, %Y.%m) | value |
|---|---|
| 1951.01 | 12 |
| 1951.02 | 13 |
I'm trying to separate the date column into two columns like that:
| year | month | value |
|---|---|---|
| 1951 | 01 | 12 |
| 1951 | 02 | 13 |
I've tried using the separate() function from tidyr and it kinda worked. However, for some reason it is dropping the 0 at month 10, like so:
data$month
... 8 9 1 11 ...
I suspect it has something to do with coercing it to character(?). I've tried using substr() and it didnt work either, same problem. Is there a better way of doing this?
My code:
data %>%
separate(Date, into = c("year","month"))
** Edit
I think It is definitely because I'm coercing the numeric date to a character.
as.character(1951.10)
[1] "1951.1"
Reproducible sample data:
df <- structure(list(Date = c(1951.01, 1951.02, 1951.1),
value = c(12,13, 14)), row.names = c(NA, -3L),
class = c("tbl_df", "tbl","data.frame"))
If you have numeric values in column Date, you should first to convert it into character and preserve two decimals. Here you can use sprintf to make it. Then you split the strings by ..
Try the code below
df %>%
mutate(Date = sprintf("%.2f", Date)) %>%
separate(Date, c("Year", "Month"), "\\.")
which gives
Year Month value
<chr> <chr> <dbl>
1 1951 01 12
2 1951 02 13
3 1951 10 14
Data
> dput(df)
structure(list(Date = c(1951.01, 1951.02, 1951.1), value = c(12,
13, 14)), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With