I am a bit frustrated about not being able to solve this seemingly simple problem:
I have a function that takes some time to load data:
import time

def import_data(id):
    time.sleep(5)  # simulates a slow load
    return 'data' + str(id)
A DataModel class calls this function and manages two datasets.
class DataModel():
    def __init__(self):
        self._data_1 = import_data(1)
        self._data_2 = import_data(2)

    def retrieve_data_1(self):
        return self._data_1

    def retrieve_data_2(self):
        return self._data_2
Now, the main UI creates the DataModel, calling both import_data functions, which blocks it.
def main_ui():
    # This takes 5 seconds for each dataset and blocks the main UI thread
    dm = DataModel()
    # Other stuff is happening. This time could be used to load data in the background
    time.sleep(2)
    # Retrieve the first dataset
    data_1 = dm.retrieve_data_1()
    # User interaction. This time could be used to load even larger datasets
    time.sleep(10)
    # Retrieve the second dataset
    data_2 = dm.retrieve_data_2()
I want the datasets to be loaded in the background to reduce the time the UI is blocked. My idea would be to implement it like this pseudocode:
class DataModel():
    def __init__(self):
        self._data_1 = Thread(import_data(1)).start()
        self._data_2 = Thread(import_data(2)).start()

    def retrieve_data_1(self):
        return self._data_1.wait_for_result()

    def retrieve_data_2(self):
        return self._data_2.wait_for_result()
The import_data functions are called in separate threads and return Future objects.
The retrieve_data functions either block the main thread waiting for the Future to evaluate or return its result instantly.
Is there an easy way to implement this in Python 3.x with threading and/or asyncio? Thanks in advance!
(Edit: syntax correction)
Use the concurrent.futures module, which is designed for exactly this kind of usage:
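import concurrent.futures

_pool = concurrent.futures.ThreadPoolExecutor()

class DataModel():
    def __init__(self):
        # submit() schedules the call and returns a Future immediately
        self._data_1 = _pool.submit(import_data, 1)
        self._data_2 = _pool.submit(import_data, 2)

    def retrieve_data_1(self):
        # result() blocks until the Future is done, or returns at once if it already is
        return self._data_1.result()

    def retrieve_data_2(self):
        return self._data_2.result()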
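To check that the constructor no longer blocks, here is a minimal self-contained sketch of the same approach. The `DELAY` constant (a shortened stand-in for the 5-second load) and the timing variables are my own additions for illustration and are not part of the question's code.

```python
import concurrent.futures
import time

DELAY = 0.5  # assumption: shortened stand-in for the 5-second load

def import_data(id):
    time.sleep(DELAY)
    return 'data' + str(id)

_pool = concurrent.futures.ThreadPoolExecutor()

class DataModel():
    def __init__(self):
        # Both loads start in worker threads; the constructor returns right away
        self._data_1 = _pool.submit(import_data, 1)
        self._data_2 = _pool.submit(import_data, 2)

    def retrieve_data_1(self):
        return self._data_1.result()

    def retrieve_data_2(self):
        return self._data_2.result()

def main_ui():
    start = time.perf_counter()
    dm = DataModel()
    construct_time = time.perf_counter() - start  # time spent in the constructor
    data_1 = dm.retrieve_data_1()  # blocks only until the first load finishes
    data_2 = dm.retrieve_data_2()  # the second load ran concurrently
    total_time = time.perf_counter() - start
    return data_1, data_2, construct_time, total_time
```

With both loads running concurrently, the total wait should be close to one `DELAY` rather than two, and the constructor itself should take a negligible amount of time.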
If your functions are global (defined at module level) and your data is serializable (picklable), you can even seamlessly switch from ThreadPoolExecutor to ProcessPoolExecutor and benefit from true (process-based) parallelism.
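One way to keep that switch a one-line change is to pass the executor into the model instead of using a module-level pool. The following is only a sketch under that assumption: the `executor` constructor parameter and the `load_with` helper are hypothetical names of mine, and the sleep is shortened.

```python
import concurrent.futures
import time

def import_data(id):
    # Defined at module level so it can be pickled and sent to a worker process
    time.sleep(0.1)  # assumption: shortened stand-in for the slow load
    return 'data' + str(id)

class DataModel():
    def __init__(self, executor):
        # Works with any concurrent.futures.Executor implementation
        self._data_1 = executor.submit(import_data, 1)
        self._data_2 = executor.submit(import_data, 2)

    def retrieve_data_1(self):
        return self._data_1.result()

    def retrieve_data_2(self):
        return self._data_2.result()

def load_with(executor_cls):
    # Hypothetical helper: run both loads on the given executor class
    with executor_cls(max_workers=2) as executor:
        dm = DataModel(executor)
        return dm.retrieve_data_1(), dm.retrieve_data_2()
```

Calling `load_with(concurrent.futures.ThreadPoolExecutor)` uses threads. Passing `concurrent.futures.ProcessPoolExecutor` instead should give the same results from worker processes, but that call belongs under an `if __name__ == "__main__":` guard, because some platforms start workers by re-importing the main module.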