Here is the problem:
1) Suppose I have some measurement data (say 1 Msample read from my electronics) that I need to run through a processing chain.
2) This processing chain consists of different operations, which can be swapped, omitted, or given different parameters. A typical example would be to take the data, first pass them through a lookup table (LUT), then do an exponential fit, then multiply by some calibration factors.
3) As I do not know which algorithm is the best, I'd like to evaluate the best possible implementation at each stage (for example, the LUTs can be produced in 5 ways and I want to see which one is the best).
4) I'd like to daisy-chain those functions so that I construct a 'class' containing the top-level algorithm and owning (i.e. pointing to) a child class containing the lower-level algorithm.
I was thinking of using a doubly linked list and building a sequence like:
myCaptureClass.addDataTreatment(pmCalibrationFactor(opt, pmExponentialFit (opt, pmLUT (opt))))
where myCaptureClass is the class responsible for data taking. After the data have been taken, it should also trigger the top-level data processing module (pm). The processing would first descend to the bottom child (LUT) and treat the data there, then the middle one (exponential fit), then the top one (calibration factors), and return the data to the capture class, which would return them to the requester.
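To make the nesting concrete, here is a minimal sketch of the eager version described above. The base class ProcessingModule, its process/apply methods, and the pmOffset stage (a trivial stand-in for the exponential fit) are hypothetical names used only for illustration; 'opt' is taken to be a single configuration value per stage.

```python
class ProcessingModule:
    """Hypothetical base class: one stage that optionally owns a child stage."""

    def __init__(self, opt=None, child=None):
        self.opt = opt      # per-stage configuration (the 'opt' in the call above)
        self.child = child  # lower-level stage, run before this one

    def process(self, data):
        # Descend to the bottom-most child first, then work back up.
        if self.child is not None:
            data = self.child.process(data)
        # Eager: every stage builds a complete intermediate list.
        return [self.apply(x) for x in data]

    def apply(self, x):
        raise NotImplementedError


class pmLUT(ProcessingModule):
    def apply(self, x):
        return self.opt[x]  # opt is the lookup table itself


class pmOffset(ProcessingModule):
    # Assumption: a trivial stand-in for the exponential fit stage.
    def apply(self, x):
        return x - self.opt


class pmCalibrationFactor(ProcessingModule):
    def apply(self, x):
        return x * self.opt


chain = pmCalibrationFactor(2.0, pmOffset(1, pmLUT([10, 20, 30, 40])))
result = chain.process((0, 3, 1))
print(result)  # [18.0, 78.0, 38.0]
```

Each call to process allocates a full list per stage, which is where the memory cost comes from.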
Now this has several issues:
1) Everywhere on the net it is said that one should not use doubly linked lists in Python.
2) This seems highly inefficient to me because the data vectors are huge. I would therefore prefer a solution using generator functions, but I'm not sure how to provide the 'plugin-like' mechanism.
Could someone give me a hint on how to solve this 'plugin-style' with generators, so that I do not need to process a vector of X megabytes at once and the data are instead processed 'on request', as is proper with a generator function?
thanks a lot
david
An addendum to the problem:
It seems that I did not express myself precisely. Hence: the data are generated by an external HW card plugged into a VME crate. They are 'fetched' in a single block transfer into a Python tuple, which is stored in myCaptureClass.
The set of operations is in fact applied to stream data, represented by this tuple. Even the exponential fit is a stream operation (it is a set of variable-state filters applied to each sample).
The parameter 'opt' that I mistakenly showed was meant to express that each of those data processing classes comes with some configuration data, which modify the behaviour of the method used to operate on the data.
The goal is to introduce into myCaptureClass a daisy-chained class (rather than a function) which, when the user asks for data, is used to process the 'raw' data into their final form.
In order to save memory I thought it might be a good idea to use a generator function to provide the data.
From this perspective, the closest match to what I want to do seems to be the code from bukzor. I'd prefer a class implementation instead of a function, but I guess that is just the cosmetic matter of implementing the call operator in the particular class that realizes the data operation.
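A minimal sketch of what such a class-based stage could look like: the configuration lives on the instance, the per-sample filter state lives inside the generator, and __call__ makes the object usable like a function. ExponentialSmoother is only an assumed stand-in (a first-order smoothing filter), not the real exponential fit, and Scale and the sample values are likewise hypothetical.

```python
class ExponentialSmoother:
    """Hypothetical stateful stage; a stand-in for the real fit filters."""

    def __init__(self, alpha):
        self.alpha = alpha  # configuration carried by the stage

    def __call__(self, samples):
        state = None  # filter state, fresh for every stream
        for x in samples:
            if state is None:
                state = x
            else:
                state = self.alpha * x + (1 - self.alpha) * state
            yield state


class Scale:
    """Hypothetical stateless stage, e.g. a calibration factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, samples):
        for x in samples:
            yield x * self.factor


raw = (0.0, 4.0, 4.0, 0.0)  # stands in for the tuple fetched from the card
stages = [ExponentialSmoother(alpha=0.5), Scale(factor=10)]

stream = iter(raw)
for stage in stages:
    stream = stage(stream)  # only wraps generators; nothing is computed yet

smoothed = list(stream)  # samples are pulled through all stages here
print(smoothed)  # [0.0, 20.0, 30.0, 15.0]
```

Because the state is a local variable of the generator, the same stage object can be reused on a new capture without carrying over old filter state.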
This is how I imagine you would do this. I expect this is incomplete, since I don't fully understand your problem statement. Please let me know what I've done wrong :)
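from functools import partial

class ProcessingPipeline(object):
    def __init__(self, *functions, **kwargs):
        self.functions = functions
        self.data = kwargs.get('data')

    def __call__(self, data):
        # bind the data; nothing is processed until the result is iterated
        return ProcessingPipeline(*self.functions, data=data)

    def __iter__(self):
        # wrap the data in each stage in turn; iter() guarantees an iterator
        # is returned even when the pipeline has no stages
        data = iter(self.data)
        for func in self.functions:
            data = func(data)
        return data

# a few (very simple) operators, of different kinds

# a class whose configuration is stored on the instance
class Multiplier(object):
    def __init__(self, by):
        self.by = by

    def __call__(self, data):
        for x in data:
            yield x * self.by

# a plain generator function, configured below with functools.partial
def add(data, y):
    for x in data:
        yield x + y

by2 = Multiplier(by=2)
sub1 = partial(add, y=-1)
square = lambda data: (x * x for x in data)  # a generator expression

pp = ProcessingPipeline(square, sub1, by2)

# print() with a single argument works in both Python 2 and Python 3
print(list(pp(range(10))))
print(list(pp(range(-3, 4))))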
Output:
$ python how-to-implement-daisychaining-of-pluggable-function-in-python.py
[-2, 0, 6, 16, 30, 48, 70, 96, 126, 160]
[16, 6, 0, -2, 0, 6, 16]
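To compare alternative implementations of one stage, as the question asks for the LUTs, the same chaining idea lets a single stage be swapped while the rest stays fixed. This is only a sketch: run_pipeline is a stripped-down equivalent of ProcessingPipeline, and the two LUT variants, the calibration factor, and the data are made up for illustration. itertools.islice shows that only the samples actually requested get processed.

```python
from itertools import islice


def run_pipeline(data, *stages):
    """Chain generator stages lazily (same idea as ProcessingPipeline)."""
    for stage in stages:
        data = stage(data)
    return data


# Two hypothetical ways to realize the same lookup stage.
TABLE = [x * x for x in range(16)]


def lut_table(data):
    return (TABLE[x] for x in data)  # precomputed table


def lut_computed(data):
    return (x * x for x in data)  # computed on the fly


def calibrate(data):
    return (x * 0.5 for x in data)  # assumed calibration factor


raw = tuple(range(16))
results = {}
for name, lut in [("table", lut_table), ("computed", lut_computed)]:
    # islice pulls only the first 4 samples through the chain
    results[name] = list(islice(run_pipeline(raw, lut, calibrate), 4))

print(results["table"])     # [0.0, 0.5, 2.0, 4.5]
print(results["computed"])  # [0.0, 0.5, 2.0, 4.5]
```

Timing or scoring each variant would go inside the loop; the other stages do not need to know which LUT was plugged in.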