 

Manipulate variables in netcdf files and write them again

I have several netCDF files. Each nc file has several variables, but I am only interested in two of them: "Soil_Moisture" and "Soil_Moisture_Dqx".

I would like to filter "Soil_Moisture" based on "Soil_Moisture_Dqx": I want to replace values in "Soil_Moisture" with NA wherever the corresponding "Soil_Moisture_Dqx" pixels have values greater than 0.04.

Here are the files to download:

1- I tried this loop, but when I typed f[1] or f[2] I got something weird, which means that my loop is incorrect. I would be grateful for any help getting my loop corrected.

a <- list.files("C:\\3 nc files", "*.DBL", full.names = TRUE)

for(i in 1:length(a)){
  f = open.ncdf(a[i])
  A1 = get.var.ncdf(nc=f, varid="Soil_Moisture", verbose=TRUE)
  A1 * -0.000030518509475997   ## scale factor
  A2 = get.var.ncdf(nc=f, varid="Soil_Moisture_Dqx", verbose=TRUE)
  A2 * -0.0000152592547379985  ## scale factor
  A1[A2 > 0.04] = NA           ## here is the main calculation I need
}

2- Can anybody tell me how to write the filtered variables back to the files?

Jonsson Sali asked Sep 06 '25 04:09



2 Answers

Missing values are special values in netCDF files that are taken to indicate that the data is "missing". So you need to use set.missval.ncdf to set this value.

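library(ncdf)

# the pattern argument is a regular expression, not a shell glob
a <- list.files("C:\\3 nc files", "\\.DBL$", full.names = TRUE)

SM_NAME <- "Soil_Moisture"
SM_SDX_NAME <- "Soil_Moisture_Dqx"

lapply(a, function(filename){
  nc <- open.ncdf(filename, write=TRUE)            # open the file for writing
  SM <- get.var.ncdf(nc=nc, varid=SM_NAME)
  SM_dqx <- get.var.ncdf(nc=nc, varid=SM_SDX_NAME)
  SM[SM_dqx > 0.4] <- NA                           # the question asks for 0.04; adjust as needed
  newMissVal <- 999.9
  set.missval.ncdf(nc, SM_NAME, newMissVal)        # NA is stored in the file as this value
  put.var.ncdf(nc, SM_NAME, SM)                    # write the filtered variable back
  close.ncdf(nc)
})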

EDIT: add some checks

It is interesting here to count how many points will be tagged as missing.

Without applying the odd scale factor we have:

lapply(a, function(filename){
  nc <- open.ncdf(filename)                        # read-only is enough for counting
  SM_dqx <- get.var.ncdf(nc=nc, varid=SM_SDX_NAME)
  close.ncdf(nc)
  table(SM_dqx > 0.4)                              # counts of FALSE / TRUE
})

[[1]]
[1] 810347     91

[[2]]
[1] 810286    152

[[3]]
[1] 810287    151

[[4]]
[1] 810355     83
agstudy answered Sep 09 '25 03:09



This can also be accomplished from the command line using CDO.

As I understand it, both variables are contained in your input file (which I will call "datafile.nc"; you will presumably want to do the following in a loop over the list of files). First of all, we extract those two variables into two separate files:

cdo selvar,Soil_Moisture     datafile.nc soil_moisture.nc
cdo selvar,Soil_Moisture_Dqx datafile.nc dqx.nc

Now we will define a mask file that contains 1 where dqx<0.04 and a missing value (NAN) where dqx>=0.04:

cdo setctomiss,0 -ltc,0.04 dqx.nc mask.nc

The ltc operator is "less than constant" (you may want lec for <= instead); setctomiss,0 replaces all the zeros with NAN.

Now we multiply these together with CDO. Since NAN*C=NAN and 1*C=C, this gives you a netCDF file with your desired field:

cdo mul mask.nc soil_moisture.nc masked_soil_moisture.nc

You can actually combine those last two commands if you like, and avoid the I/O of writing the mask file:

cdo mul -setctomiss,0 -ltc,0.04 dqx.nc soil_moisture.nc masked_soil_moisture.nc 

But it is easier to explain the steps separately :-)

You can put the whole thing in a loop over files easily in bash.
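
As a rough illustration of such a loop, here is a minimal sketch that wraps the commands above in a bash function. The function name mask_soil_moisture, the *.nc input pattern, the masked_ output prefix and the directory arguments are all assumptions made for the example and are not from the original post; only the cdo calls themselves are taken from the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: apply the selvar / ltc / setctomiss / mul steps to every file.
# Assumptions (hypothetical, not from the post): inputs are <in_dir>/*.nc and
# results are written to <out_dir>/masked_<name>.nc.

mask_soil_moisture() {
    local in_dir="$1" out_dir="$2" threshold="${3:-0.04}"
    local f base work

    work="$(mktemp -d)" || return 1      # scratch space for the extracted variables
    mkdir -p "$out_dir"

    for f in "$in_dir"/*.nc; do
        [ -e "$f" ] || continue          # nothing matched the pattern: skip
        base="$(basename "$f" .nc)"

        # Extract the two variables, then build the mask and multiply in one call.
        cdo selvar,Soil_Moisture     "$f" "$work/soil_moisture.nc" &&
        cdo selvar,Soil_Moisture_Dqx "$f" "$work/dqx.nc" &&
        cdo mul -setctomiss,0 -ltc,"$threshold" "$work/dqx.nc" \
            "$work/soil_moisture.nc" "$out_dir/masked_${base}.nc" ||
            echo "cdo failed for $f" >&2
    done

    rm -rf "$work"
}
```

It could then be called as, for example, mask_soil_moisture ./input ./output (both directory names are placeholders). The optional third argument overrides the 0.04 threshold.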

Adrian Tompkins answered Sep 09 '25 01:09
