I have been using the following script, which worked well for many years, but I think an update of the XML package in R has left it only partially working.
download.file("http://forecast.weather.gov/MapClick.php?lat=47.9733&lon=-121.6413&FcstType=digitalDWML","temp.xml") #This is for the lake surface
data1 <- xmlParse("temp.xml")
temp_path <- "//temperature[@type='hourly']/value"
precip_path <- "//hourly-qpf[@type='floating']/value"
df1 <- data.frame(
latitude=data1[["number(//point/@latitude)"]],
longitude=data1[["number(//point/@longitude)"]],
hourly_temperature=as.integer(sapply(data1[temp_path], as, "integer")),
hourly_precip=as.numeric(sapply(data1[precip_path],as,"double"))
)
df1$date1<- seq(Sys.time(), by="hour", length.out = length(df1$hourly_temperature))
The hourly temperature part still parses correctly, but the precipitation part no longer does. I have tried a number of different options, but apparently I am bad with XML. Any help would be really appreciated!
As noted in the comments, the problem is not the XML package but the XML data. The precipitation section ends with missing values (shown below), and your sapply + as(..., "double") call fails on them during type conversion.
<value xsi:nil="true"/>
<value xsi:nil="true"/>
<value xsi:nil="true"/>
<value xsi:nil="true"/>
</hourly-qpf>
Instead, consider XML's xpathSApply + xmlValue, which extract each node's text content so that as.numeric can convert it, turning the empty nil entries into NA. (There is also no need for download.file; you can call readLines on the URL directly.)
library(XML)
url <- "http://forecast.weather.gov/MapClick.php?lat=47.9733&lon=-121.6413&FcstType=digitalDWML"
data1 <- xmlParse(readLines(url))
temp_path <- "//temperature[@type='hourly']/value"
precip_path <- "//hourly-qpf[@type='floating']/value"
df1 <- transform(
  data.frame(
    latitude = data1[["number(//point/@latitude)"]],
    longitude = data1[["number(//point/@longitude)"]],
    hourly_temperature = as.integer(xpathSApply(data1, temp_path, xmlValue)),
    hourly_precip = as.numeric(xpathSApply(data1, precip_path, xmlValue))
  ),
  date1 = seq(Sys.time(), by = "hour", length.out = length(hourly_temperature))
)
Output
str(df1)
# 'data.frame': 168 obs. of 5 variables:
# $ latitude : num 48 48 48 48 48 ...
# $ longitude : num -122 -122 -122 -122 -122 ...
# $ hourly_temperature: int 26 28 30 32 32 34 35 35 35 35 ...
# $ hourly_precip : num 0 0 0 0 0 0 0 0 0 0 ...
# $ date1 : POSIXct, format: "2021-04-11 11:15:00" "2021-04-11 12:15:00" "2021-04-11 13:15:00" "2021-04-11 14:15:00" ...
head(df1)
# latitude longitude hourly_temperature hourly_precip date1
# 1 47.98 -121.64 26 0 2021-04-11 11:15:00
# 2 47.98 -121.64 28 0 2021-04-11 12:15:00
# 3 47.98 -121.64 30 0 2021-04-11 13:15:00
# 4 47.98 -121.64 32 0 2021-04-11 14:15:00
# 5 47.98 -121.64 32 0 2021-04-11 15:15:00
# 6 47.98 -121.64 34 0 2021-04-11 16:15:00
tail(df1)
# latitude longitude hourly_temperature hourly_precip date1
# 163 47.98 -121.64 39 0 2021-04-18 05:15:00
# 164 47.98 -121.64 38 0 2021-04-18 06:15:00
# 165 47.98 -121.64 38 NA 2021-04-18 07:15:00
# 166 47.98 -121.64 39 NA 2021-04-18 08:15:00
# 167 47.98 -121.64 41 NA 2021-04-18 09:15:00
# 168 47.98 -121.64 43 NA 2021-04-18 10:15:00
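To check the approach without hitting the live feed, here is a minimal sketch that runs the same xpathSApply + xmlValue + as.numeric chain on a small inline document. The XML string and the names xml_text, doc, raw_text and precip are made up for illustration; only the element names and the xsi:nil attribute mirror the snippet above. It assumes, consistent with the tail(df1) output, that xmlValue returns an empty string for the nil elements.

```r
library(XML)

# Hypothetical miniature of the feed: two real values followed by two nil entries.
# The xsi prefix has to be declared, otherwise the parser complains about it.
xml_text <- '<dwml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <hourly-qpf type="floating">
    <value>0.01</value>
    <value>0</value>
    <value xsi:nil="true"/>
    <value xsi:nil="true"/>
  </hourly-qpf>
</dwml>'

doc <- xmlParse(xml_text, asText = TRUE)

# Pull the text content of every matching node as a character vector.
raw_text <- xpathSApply(doc, "//hourly-qpf[@type='floating']/value", xmlValue)

# Convert to numeric; entries with empty text become NA.
precip <- as.numeric(raw_text)
print(precip)
```

If the last two entries print as NA while the vector keeps its full length of four, the precipitation column will line up with the temperature column in the data frame, which is what the output above shows for rows 165 to 168.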