I am currently doing a large amount of data analysis in Fortran. I have been using R to plot most of my results, as Fortran is ill-suited for visualization. Up until now, the data sets have been two-dimensional and rather small, so I've gotten away with routines that write the data to be plotted and various plot parameters to a .CSV file, and then use a system call to run an R script that reads the file and generates the required plot.
However, I find myself now dealing with somewhat larger 3D data sets, and I do not know if I can feasibly continue in this manner (notably, sending and properly reading in a 3D array via .CSV is rather more difficult, and takes up a lot of excess memory which is a problem given the size of the data sets).
Does anyone know an efficient way of sending data from Fortran to R? The only utility I found for this (RFortran) is Windows-only, and my work computer is a Mac. I know that R possesses a rudimentary Fortran interface, but I am calling R from Fortran, not vice versa. Moreover, given the number of plot parameters I am sending (axis labels, plot titles, axis units and limits, etc., many of which are optional and have default values in the routines I'm currently using), I am not sure that it has the features I require.
I would go for writing NetCDF files from Fortran. These files can contain large amounts of multi-dimensional data, and there are good bindings for creating NetCDF files from within Fortran (the format is used a lot in climate models). In addition, R has excellent support for working with NetCDF files in the form of the ncdf package (ncdf4 in more recent R installations). It is, for example, very easy to read only a small portion of the data cube into memory (only some timesteps, or some geographic region). Finally, NetCDF works across all platforms.
In terms of workflow, I would let the Fortran program generate NetCDF files plus some graphics parameters in a separate file (data.nc and data.plt, for example), and then call R as a post-processing step. That way you do not need to interface R and Fortran directly. The entire workflow could be managed by a separate script (e.g. Python) that calls the Fortran model, makes a list of the NetCDF/.plt files, and creates the plots.
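A driver along those lines could look roughly like the sketch below. It is only an illustration: the executable name ./model, the R script name plot.R, and the convention that every data.nc has a matching data.plt next to it are assumptions, not anything fixed by NetCDF or R. The run argument is there only so the logic can be exercised without actually launching the programs.

```python
import subprocess
from pathlib import Path


def find_jobs(outdir):
    """Pair each NetCDF file with the plot-parameter file of the same stem."""
    jobs = []
    for nc in sorted(Path(outdir).glob("*.nc")):
        plt = nc.with_suffix(".plt")
        if plt.exists():  # skip data files that have no plot parameters
            jobs.append((nc, plt))
    return jobs


def plot_command(nc, plt, script="plot.R"):
    """Command line for one plot; plot.R is a hypothetical R script name."""
    return ["Rscript", script, str(nc), str(plt)]


def run_workflow(model_cmd, outdir, run=subprocess.run):
    """Run the Fortran model, then call R once per (data, parameters) pair."""
    run(model_cmd, check=True)
    jobs = find_jobs(outdir)
    for nc, plt in jobs:
        run(plot_command(nc, plt), check=True)
    return jobs
```

Something like run_workflow(["./model"], "output") would then run the model and plot every pair found in the output directory.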
So, it turns out that sending arrays between Fortran and R via unformatted files is trivially easy. Both store arrays in column-major order, so one needs to do no more than write one unformatted file containing the array and another containing the array shape and size information, and then read the data directly into an array of the proper size and shape in R.
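To see why no reordering is needed, here is a small sketch (in Python only because it is quick to run; the function name is made up for illustration) of where the element with 1-based indices (i1, ..., in) ends up in a flat column-major file. The first index varies fastest, which is the order in which Fortran writes a whole array and the order R assumes when it applies dim().

```python
def column_major_offset(index, shape):
    """0-based position in the flat data of the element at 1-based `index`.

    The first index varies fastest: the stride of dimension k is the
    product of the sizes of all dimensions before it.
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same length")
    offset, stride = 0, 1
    for i, s in zip(index, shape):
        if not 1 <= i <= s:
            raise IndexError(f"index {i} outside 1..{s}")
        offset += (i - 1) * stride
        stride *= s
    return offset
```

For a 2 x 3 array, element (1, 2) sits at offset 2 and element (2, 3) at offset 5, i.e. the file holds the first column, then the second, then the third.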
Sample code for an n-dimensional array of integers, a, with dimension i having size s(i).
Fortran-side (access must be set to "stream"; otherwise every write is wrapped in record markers, which show up as extra bytes in the file):
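! Header file: number of dimensions, then the size of each dimension.
! status="replace" discards any file left over from an earlier run.
open(unit=1, file="testheader.dat", form="unformatted", access="stream", status="replace")
! Data file: the raw array elements, in column-major order.
open(unit=2, file="testdata.dat", form="unformatted", access="stream", status="replace")
write(1) n
do i = 1, n
   write(1) s(i)
enddo
write(2) a
close(1)
close(2)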
R-side (be sure that you have endianness correct, or this will fail miserably):
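# Open both files in binary read mode
testheader <- file("testheader.dat", "rb")
testdata <- file("testdata.dat", "rb")
# endian must match the byte order the Fortran program wrote
dims <- readBin(testheader, integer(), endian="big")            # number of dimensions
sizes <- readBin(testheader, integer(), n=dims, endian="big")   # size of each dimension
a <- readBin(testdata, integer(), n=prod(sizes), endian="big")  # flat, column-major data
dim(a) <- sizes  # reshape only; no reordering needed
close(testheader)
close(testdata)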
You can put the header and data in the same file if you want.
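For the single-file variant, one way to check the byte layout independently of both programs is a small Python sketch like the one below. It assumes default 4-byte integers and the layout n, then s(1) through s(n), then the elements of a, all written with stream access. The function names are hypothetical, and the endian argument (">" for big, "<" for little) has to match whatever wrote the file.

```python
import math
import struct


def write_combined(path, shape, values, endian=">"):
    """Write n, the n dimension sizes, then the flat data as 4-byte integers."""
    if len(values) != math.prod(shape):
        raise ValueError("number of values does not match shape")
    n = len(shape)
    with open(path, "wb") as f:
        f.write(struct.pack(f"{endian}i", n))
        f.write(struct.pack(f"{endian}{n}i", *shape))
        f.write(struct.pack(f"{endian}{len(values)}i", *values))


def read_combined(path, endian=">"):
    """Read the header first, then exactly prod(shape) data values."""
    with open(path, "rb") as f:
        (n,) = struct.unpack(f"{endian}i", f.read(4))
        shape = struct.unpack(f"{endian}{n}i", f.read(4 * n))
        count = math.prod(shape)
        values = struct.unpack(f"{endian}{count}i", f.read(4 * count))
    return shape, values
```

The R side then needs only one connection: read n, read the sizes, and read prod(sizes) values from the same connection, one readBin call after another.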