So I am trying to index a NetCDF file to get stream flow rate data in a certain grid cell. The NetCDF file I am using has the following characteristics:
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
CDI: Climate Data Interface version 1.6.4 (http://code.zmaw.de/projects/cdi)
Conventions: CF-1.4
dimensions(sizes): lon(3600), lat(1800), time(31)
variables(dimensions): float64 lon(lon), float64 lat(lat), float64 time(time), float32 dis(time,lat,lon)
I have 35+ years of this data, and I am trying to extract the data from an individual grid cell and build a time series to compare it to a different model's forecasts. The code I am currently using to extract data from a grid cell is below.
from netCDF4 import Dataset
import numpy as np
root_grp = Dataset(r'C:\Users\wadear\Desktop\ERAIland_daily_dis_198001.nc')
dis = root_grp.variables['dis']
lat = np.round(root_grp.variables['lat'][:], decimals=2).tolist()
lon = np.round(root_grp.variables['lon'][:], decimals=2).tolist()
time = root_grp.variables['time'].shape[0]
lat_index = lat.index(27.95)
lon_index = lon.index(83.55)
for i in range(time):
    print(dis[i][lat_index][lon_index])
Right now this feels really slow. Running it over a 35+ year timespan will take a long time, and the cost will add up further once I do this for multiple grid cells.
Is there a tool to speed up this process with faster I/O or indexing?
Thanks!
You should get a big time saving if you remove the loop over time and access the entire time series at once, i.e.
dis[:,lat_index,lon_index]
Further speed gains can be obtained by chunking the data along the time dimension; look up the documentation for nccopy. Chunking is only available in the netCDF-4 format, so your NETCDF3 files would need to be converted as part of the copy. If you need to access the time series repeatedly, this is worth doing. You may wish to concatenate some of your NetCDF files before chunking, e.g. monthly -> annual. This can be done with the ncrcat utility (part of NCO).
See also Chunking Data: Why it Matters.
Why not simply extract the point with CDO first and then read in the point data?
cdo remapnn,lon=83.55/lat=27.95 input.nc point_output.nc
On Ubuntu, if you don't have CDO installed, you can install it with
sudo apt-get install cdo 