I have a single stream of data that must be processed as quickly as possible. The single stream contains data from up to 200 sources. Not all the sources produce the same amount of data and the rate can vary.
As an initial attempt I decided to create 10 (sort of based on the server spec, dual quad core), long running Tasks. Each Task would read from a BlockCollection. Before starting I created a map so that as data is received on the inbound stream I know which BlockingCollection to add that sources data to.
The problem, I think, is that I don't know upfront which source will produce the most data and indeed this can change over time I saw that some collections were very empty, while others were receiving many more updates.
If I have 8 hardware threads available and I've created about 10 queues and Tasks aren't bound to a thread (again not sure if this is true with TaskCreationOptions.LongRunning), then even if one queue is not busy the other busy queue can't make use of the spare thread as in theory I could end up processing a piece of data out of sequence.
Would I be better just creating a Task and Blocking collection for each source, then the TPL can make best use of the threads available as the data is at its most segregated?
My other alternative was to somehow workout on past stats and various external/human info how best to spread the sources amongst a finite set of BlockingCollections/Tasks and then adjust the mapping over time.
I hope I've explained my scenario well enough.
I'm using a class that encapsulates the BlockingCollection and Task
I have what could be visualised as 40+ streams interlaced which if split be processed at the same time (as long as each stream is kept in it's own sequence), but there are many more streams than available hardware threads.
To try and clarify what I'm looking for. I'm currently spliting sources into sub groups, effectively, and allocating each group it's own queue. My question is really: How many groups to create? If I have 200 sources, should I create 200 groups (which is then 200 Tasks and Blocking collections) and then let the TPL run around like a mad man allocating threads where it can as each Task gets it's cpu time. Or am I better off allocating 1 group per underlying hardware thread?
I would personally leverage TPL Dataflow here and just define an ActionBlock<T> that reprsents your work and link a BufferBlock<T> "in front" of it to prevent over saturation by the various producers. Then all you do is post to the BufferBlock<T> from your various sources (producers) and make sure you've load tested/configured your block options (BoundedCapacity, MaxDegreeOfParallelism, MaxMessagesPerTask, etc.) accordingly and let TPL Dataflow work its magic. Takes all the heavy lifting out of your hands.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With