I'm using xarray to read single point data from an openDAP server, then I convert the xarray object to dataframe. This works fine. I would like to read multiple points in a single call, but I don't which is the best approach to do so.
This is the code I'm using for a single point:
import pandas as pd
import xarray as xr
url = 'http://nomads.ncep.noaa.gov:9090/dods/gfs_0p25/gfs20161111/gfs_0p25_00z'
lats = [40.1,40.5,42.3]
lons = [1.02,1.24,1.84]
vars = ['dswrfsfc', 'tmp2m', 'pressfc']
ds = xr.open_dataset(url)
data_single = ds.sel(lon=lons[0], lat=lats[0], method='nearest')
ts_dataframe_single = data_single[vars].to_dataframe()
For reading multiple points I do:
data = ds.sel(lon=lons, lat=lats, method='nearest')
ts_dataframe = data[vars].to_dataframe()
And this is the output of data.coords
:
data.coords
Out[10]:
Coordinates:
* time (time) datetime64[ns] 2016-11-11 2016-11-11T03:00:00 ...
* lev (lev) float64 1e+03 975.0 950.0 925.0 900.0 850.0 800.0 750.0 ...
* lat (lat) float64 40.0 40.5 42.25
* lon (lon) float64 1.0 1.25 1.75
When I convert to a dataframe the resulting object contains a mix of time and coordinates in the timestamp. This is how it looks:
My questions are:
Thanks in advance!
I think you want sel_points
instead of sel
. So, something like this:
data = ds.sel_points(lon=lons, lat=lats, method='nearest')
ts_dataframe = data[vars].to_dataframe()
The memory issue you are seeing is that the output is a lat x lon grid of pixel values, when you are really only interested in the grid's diagonal entries, corresponding to each lat, lon coordinate pair. This is very computationally expensive because it is searching for pixel values of n^2 points instead of n points.
'sel_points()' is deprecated because it can be accomplished using sel/isel (see below).
Instead, you can to set lat and lon as their own DataArrays:
lats = xr.DataArray(lats, dims='z') #'z' is an arbitrary name placeholder
lons = xr.DataArray(lons, dims='z')
data = ds.sel(lat = lats, lon = lons, method = 'nearest')
#VariableName = 'lev', so you can also do:
data = ds.lev.sel(lat = lats, lon = lons, method = 'nearest')
See this related post for more.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With