I have many files with the format
datetime variable value
Date Time name xxx.y
e.g.
datetime variable value
2011-01-01 01:00:00 precipitation 10.0
2011-01-01 02:00:00 precipitation 5.0
2011-01-01 03:00:00 precipitation 0.0
I am using readr::read_table
to read them.
However the problem is that the header has three columns but the data has four.
For instance, reading the data
ex_txt="datetime variable value
2011-01-01 01:00:00 precipitation 10.0
2011-01-01 02:00:00 precipitation 5.0
2011-01-01 03:00:00 precipitation 0.0"
with
read_table(ex_txt)
returns get an error since read_table
finds three labels in the first row and then data rows have four columns.
There is an obvious work-around to skip the first row, and then use a sequence of pipes to get the data in the correct form.
For instance the following solves my problem
read_table(ex_txt,skip=1,col_names = c("date","time","variable","value")) %>%
mutate(datetime=lubridate::ydm_hms(paste(date,time))) %>%
select(datetime,variable,value)
However, as I don't like the skip=1
part, I am curious to know if anyone has a way to make read_table
read the three headers and then combine the first two columns (Date & Time)
into one DateTime column. I have been experimenting with col_types but to no avail.
Thanks
Halldór
An option using readr::read_fwf
, since the datetime at the beginning should always be the same length, ending at position 19. Position 21 should then be the beginning of the rest of the line.
library(readr)
library(tidyr)
read_fwf(ex_txt, fwf_cols(datetime = 19, X = 21), skip = 1) %>%
separate_wider_delim(X, " ", names = c("variable", "value"))
# A tibble: 3 × 3
datetime variable value
<dttm> <chr> <chr>
1 2011-01-01 01:00:00 precipitation 10.0
2 2011-01-01 02:00:00 precipitation 5.0
3 2011-01-01 03:00:00 precipitation 0.0
With base R try surrounding the space delimited text with '
ex_txt <- "datetime variable value
'2011-01-01 01:00:00' precipitation 10.0
'2011-01-01 02:00:00' precipitation 5.0
'2011-01-01 03:00:00' precipitation 0.0"
and reading with argument text
read.table(text = ex_txt, header = T)
datetime variable value
1 2011-01-01 01:00:00 precipitation 10
2 2011-01-01 02:00:00 precipitation 5
3 2011-01-01 03:00:00 precipitation 0
Note that you still have to do things like as.POSIXct
for column datetime
In case the text can't be manipulated easily by hand try gsub
, e.g.
ex_txt <- "datetime variable value
2011-01-01 01:00:00 precipitation 10.0
2011-01-01 02:00:00 precipitation 5.0
2011-01-01 03:00:00 precipitation 0.0"
read.table(text = gsub(":(\\d{2}) ", ":\\1' ",
gsub("\n20", "\n'20", ex_txt)), header = T)
datetime variable value
1 2011-01-01 01:00:00 precipitation 10
2 2011-01-01 02:00:00 precipitation 5
3 2011-01-01 03:00:00 precipitation 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With