Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Read multiple coordinates with xarray

I'm using xarray to read single point data from an openDAP server, then I convert the xarray object to dataframe. This works fine. I would like to read multiple points in a single call, but I don't which is the best approach to do so.

This is the code I'm using for a single point:

import pandas as pd
import xarray as xr

url = 'http://nomads.ncep.noaa.gov:9090/dods/gfs_0p25/gfs20161111/gfs_0p25_00z'
lats =  [40.1,40.5,42.3]
lons =  [1.02,1.24,1.84]
vars = ['dswrfsfc', 'tmp2m', 'pressfc']

ds = xr.open_dataset(url)

data_single  = ds.sel(lon=lons[0], lat=lats[0], method='nearest')    
ts_dataframe_single = data_single[vars].to_dataframe()

For reading multiple points I do:

data  = ds.sel(lon=lons, lat=lats, method='nearest')
ts_dataframe = data[vars].to_dataframe()

And this is the output of data.coords:

data.coords
Out[10]: 
Coordinates:
  * time     (time) datetime64[ns] 2016-11-11 2016-11-11T03:00:00 ...
  * lev      (lev) float64 1e+03 975.0 950.0 925.0 900.0 850.0 800.0 750.0 ...
  * lat      (lat) float64 40.0 40.5 42.25
  * lon      (lon) float64 1.0 1.25 1.75

When I convert to a dataframe the resulting object contains a mix of time and coordinates in the timestamp. This is how it looks:

dataframe containg multiple points

My questions are:

  • Is this the best way to retrieve multiple point with xarray?
  • How do I extract the data from a single point of the resulting dataframe?

Thanks in advance!

like image 980
Jordi Vidal de LLobatera Avatar asked Sep 03 '25 15:09

Jordi Vidal de LLobatera


2 Answers

I think you want sel_points instead of sel. So, something like this:

data  = ds.sel_points(lon=lons, lat=lats, method='nearest')
ts_dataframe = data[vars].to_dataframe()
like image 82
jhamman Avatar answered Sep 05 '25 16:09

jhamman


The memory issue you are seeing is that the output is a lat x lon grid of pixel values, when you are really only interested in the grid's diagonal entries, corresponding to each lat, lon coordinate pair. This is very computationally expensive because it is searching for pixel values of n^2 points instead of n points.

'sel_points()' is deprecated because it can be accomplished using sel/isel (see below).

Instead, you can to set lat and lon as their own DataArrays:

lats = xr.DataArray(lats, dims='z') #'z' is an arbitrary name placeholder
lons = xr.DataArray(lons, dims='z')
data = ds.sel(lat = lats, lon = lons, method = 'nearest')
#VariableName = 'lev', so you can also do: 
data = ds.lev.sel(lat = lats, lon = lons, method = 'nearest')

See this related post for more.

like image 37
Justina Pinch Avatar answered Sep 05 '25 15:09

Justina Pinch