I have a program that wants to be called from the command line many times, but involves reading a large pickle file, so each call can be potentially expensive. Is there any way that I can make cPickle just mmap the file into memory rather than read it in its entirety?
You probably don't even need to do this explicitly, since your OS's disk cache will already do a damn good job of keeping the file in memory between calls.
Any poor performance might actually be related to the cost of deserialization and not the cost of reading it off the disk. You can test this by creating a temporary ram disk and putting the file there.
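A rough way to check this without setting up a ram disk is to time the raw read and the unpickling separately. This is only a sketch: it uses Python 3's `pickle` (on Python 2, `cPickle` would be the equivalent), and the function name `time_read_and_load` is made up for illustration.

```python
import pickle
import time


def time_read_and_load(path):
    """Return (read_seconds, unpickle_seconds, obj) for the pickle at `path`."""
    # Step 1: pull the raw bytes off disk (or out of the OS disk cache).
    start = time.perf_counter()
    with open(path, "rb") as f:
        raw = f.read()
    read_seconds = time.perf_counter() - start

    # Step 2: deserialize from memory only, so no I/O is included here.
    start = time.perf_counter()
    obj = pickle.loads(raw)
    unpickle_seconds = time.perf_counter() - start

    return read_seconds, unpickle_seconds, obj
```

If the second number dominates on a repeated run, the disk is not your problem and mmap would not help much.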
And the way to remove the cost of deserialization is to move the loading of the file to a separate python process and call it like a service. Building a quick-and-dirty REST service in python is super-easy and super-useful in these cases.
If you'd rather do this with a raw socket, take a look at the socket docs. The echo server example there is a good place to start from.
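Something along these lines, adapted from the echo server pattern, loads the pickle once and then answers lookups over a local socket. Treat it as a sketch under assumptions: it assumes the pickled object is a dict, the one-key-per-connection protocol is invented for the example, and the names `load_data`, `handle_request`, `serve`, `query` and port 50007 are all hypothetical. It is written for Python 3 (`pickle` instead of `cPickle`).

```python
import pickle
import socket


def load_data(path):
    """Pay the deserialization cost once, when the service starts."""
    with open(path, "rb") as f:
        return pickle.load(f)


def handle_request(data, request):
    """Made-up protocol: request is a UTF-8 key, reply is repr() of the value."""
    key = request.decode("utf-8").strip()
    return repr(data.get(key)).encode("utf-8")  # assumes `data` is a dict


def serve(data, host="127.0.0.1", port=50007):
    """Answer one lookup per connection, forever, from the in-memory data."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, _addr = srv.accept()
            with conn:
                request = conn.recv(1024)  # keys are assumed to be short
                conn.sendall(handle_request(data, request))


def query(key, host="127.0.0.1", port=50007):
    """Client side: what each command-line invocation would call."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(key.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")
```

You would start the service once with `serve(load_data("data.pkl"))` (the file name is a placeholder) and have the command-line program call `query("some_key")` instead of loading the pickle itself.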