I have tens of thousands of custom modules (compiled to '.so') that I'd like to use in Python. The modules will be used sequentially (one after the other, not at the same time).
Normally, the code would look something like this:
# list with all the paths to all modules  
listPathsToModules = [.....]
# loop through the list of all modules 
for i in xrange(len(listPathsToModules)):
    # get the path to the currently processed module 
    pathToModule = listPathsToModules[i]
    # import the module
    import pathToModule
    # run a function in 'pathToModule' and get the results
    pathToModule.MyFunction( arg1, arg2, arg3 )
Running this, here is what I find:
the avg. time it takes to import one module: 0.0024625 [sec]
the avg. time it takes to run the module's function: 1.63727e-05 [sec]
In other words, importing a module takes about 150 times longer than running the function in it!
Is there anything that can be done to reduce the time it takes to load a module in Python? What steps would you take to optimize this, given the need to load and run many (assume tens of thousands of) modules?
I would first question whether import is really the technique you want to be using to access thousands of code fragments - the full import process is quite expensive, and loading (non-shared) dynamic modules at all isn't particularly cheap, either.
Second, the code as you have written it clearly isn't what you're actually doing. The import statement doesn't accept strings at runtime; you would have to be using importlib.import_module() or calling __import__() directly.
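For illustration, here is a minimal sketch of what the loop might look like with importlib.import_module(). It assumes the modules are importable by name (that is, their directory is already on sys.path) and that each one exposes a function with the same name. The helper run_modules is hypothetical, and the standard-library modules in the demo are only stand-ins for the compiled ones.

```python
import importlib


def run_modules(module_names, func_name, *args):
    """Import each module by name and call one function in it."""
    results = []
    for name in module_names:
        # import_module() takes the module name as a string at runtime,
        # which the plain `import` statement cannot do.
        module = importlib.import_module(name)
        func = getattr(module, func_name)
        results.append(func(*args))
    return results


if __name__ == "__main__":
    # Stand-ins from the standard library; the real list would hold the
    # names of the compiled modules (without the '.so' suffix).
    print(run_modules(["math", "cmath"], "sqrt", 4.0))
```

Note that import_module() expects module names, not file paths, so a list of paths would first have to be reduced to names.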
Finally, the first step in optimising this would be to ensure that the first directory on sys.path is the one that contains all these files. You may also want to run Python with the -vv flag to dial up the verbosity on the import attempts. Be warned that this is going to get very noisy if you're doing that many imports.
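As a rough sketch of the sys.path suggestion (Python 3, standard library only), the two helpers below put a directory at the front of sys.path and time a single import. The names prepend_to_sys_path and timed_import are hypothetical, and whether this helps in practice depends on how many sys.path entries were being searched before the right one.

```python
import importlib
import sys
import time


def prepend_to_sys_path(directory):
    """Make `directory` the first entry searched by the import system."""
    if directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)
    # Drop cached directory listings so files added since the last
    # import attempt are seen.
    importlib.invalidate_caches()


def timed_import(module_name):
    """Import a module by name and return (module, seconds elapsed)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return module, elapsed
```

A module that has already been imported is served from sys.modules, so only the first timed_import() call for a given name measures the real load cost.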