 

How to work with large .jld data files in Julia

I have some files in the .jld Julia file format containing multidimensional arrays. On my drive, the files take up around 60 GB in total. I want to concatenate some of them together using hcat() and then do further calculations and plots from these data files.

However, just reading these files either takes a very long time or fails with an "Out of Memory" error, so I'm not sure how to work with them. My machine has 8 GB of RAM, and I am loading the data from an external HDD. (I also generated the data from simulations and wrote it directly to this HDD, and there were no errors then.)

How do I deal with files this large?

asked Oct 26 '25 by newtothis

1 Answer

In order to work with data larger than your memory, you need to use disk-backed data structures. Julia supports this functionality via memory mapping with the Mmap standard library (see https://docs.julialang.org/en/v1/stdlib/Mmap/).
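As a minimal sketch of the low-level route, Mmap.mmap maps a file directly into a Julia array, so the OS pages data in and out on demand instead of holding everything in RAM. The file name and dimensions here are placeholders:

```julia
using Mmap

dims = (100, 100, 100)                    # placeholder dimensions
io = open("big_array.bin", "w+")          # placeholder file name
A = mmap(io, Array{Float64,3}, dims)      # disk-backed; pages loaded on demand
A[1, 1, 1] = 42.0                         # writes go to the mapped file
Mmap.sync!(A)                             # flush changes to disk
close(io)
```

Only the pages you actually touch occupy physical memory, which is what makes it possible to index into an array far larger than 8 GB of RAM.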

Fortunately, a higher-level interface is also available via SharedArrays (see https://docs.julialang.org/en/v1/stdlib/SharedArrays/).

Hence, you can do:

using SharedArrays
a = SharedArray{Float64}("c:\\temp\\file.dat", (100, 100, 100))  # file-backed array

Now you have a disk-backed array. You can copy your JLD data into it and perform the aggregation.
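To keep peak memory low during the copy, you can read each JLD file slice by slice rather than loading whole arrays, since JLD datasets (HDF5-based) support partial reads via indexing on the dataset handle. A hypothetical sketch, where the file names, the dataset name "data", the per-file dimensions, and the output path are all assumptions:

```julia
# Concatenate two JLD arrays along dim 2 (as hcat would) into a
# file-backed SharedArray, one slab at a time.
using JLD, SharedArrays

files = ["run1.jld", "run2.jld"]           # assumed input files
nx, ny, nz = 100, 100, 50                  # assumed per-file dimensions
out = SharedArray{Float64}("c:\\temp\\combined.dat", (nx, ny * length(files), nz))

for (i, fname) in enumerate(files)
    jldopen(fname, "r") do f
        d = f["data"]                      # dataset handle; nothing loaded yet
        for k in 1:nz                      # read one 2-D slice at a time
            out[:, (i-1)*ny+1 : i*ny, k] = d[:, :, k]
        end
    end
end
```

Once the data is in the SharedArray, further calculations and plotting can index into it directly; only the portions you touch are pulled into RAM.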

answered Oct 29 '25 by Przemyslaw Szufel

