I just had to write a program that does matrix multiplication using threads, where there's a thread for every multiplication.
Now I'm wondering a few things. Are there really any advantages to using threads to multiply a 3x2 matrix by a 2x3 matrix? For something that small, isn't sequential code still efficient? If I'm wrong, what are the advantages or disadvantages of threading something so small? The complication just seems too great for so little work.
On the other hand, would a 10000x10000 matrix benefit from using threads? I would assume so, since locality comes into play, but I'm still wrapping my head around when multithreading is more efficient and when it isn't.
Thanks!
Generally, you never want multiple threads to update values in the same cache line; that causes false sharing and kills performance. You also want to use the SIMD units within each thread. Both are typically achieved by processing data in blocks (look up the terms "register blocking" and "cache blocking"). Also, ideally, you want to create only as many threads as the hardware concurrency allows, to avoid expensive context switching. For data parallelism (such as matrix multiplication), this is easier. For task parallelism, thread pools are typically employed.
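To make the points above concrete, here is a minimal sketch, not a tuned implementation. It assumes a hypothetical row-major `Matrix` type and hypothetical helpers `multiply_rows` and `multiply_threaded`. Each thread gets one contiguous band of output rows, so threads write to disjoint memory (at most the rows at a band boundary can share a cache line), and the thread count is capped at `std::thread::hardware_concurrency()`. The i-k-j loop order keeps the inner loop over contiguous memory, which gives the compiler a chance to vectorize it. Real cache or register blocking would go further than this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Row-major dense matrix; a hypothetical helper type for this sketch.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes rows [row_begin, row_end) of c = a * b.
// The inner loop walks one row of b and one row of c contiguously.
void multiply_rows(const Matrix& a, const Matrix& b, Matrix& c,
                   std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i)
        for (std::size_t k = 0; k < a.cols; ++k) {
            const double aik = a.at(i, k);
            for (std::size_t j = 0; j < b.cols; ++j)
                c.at(i, j) += aik * b.at(k, j);
        }
}

// Splits the output rows into contiguous bands, one band per thread.
Matrix multiply_threaded(const Matrix& a, const Matrix& b) {
    assert(a.cols == b.rows);
    Matrix c(a.rows, b.cols);
    // hardware_concurrency() may return 0 when unknown; fall back to 1.
    const std::size_t hw = std::max(1u, std::thread::hardware_concurrency());
    // Never start more threads than there are rows to hand out.
    const std::size_t n_threads = std::max<std::size_t>(1, std::min(hw, a.rows));
    const std::size_t chunk = (a.rows + n_threads - 1) / n_threads;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < a.rows; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, a.rows);
        workers.emplace_back([&a, &b, &c, begin, end] {
            multiply_rows(a, b, c, begin, end);
        });
    }
    for (auto& w : workers) w.join();
    return c;
}
```

Calling `multiply_threaded(a, b)` on a 3x2 and a 2x3 matrix still gives the right answer, but it starts up to three threads to do 18 multiplications, which is exactly the overhead the question is worried about.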
For small matrices like 3x2, multithreading will definitely be much slower than sequential processing, because creating and joining a thread costs far more than the handful of multiplications involved. For larger matrices, you need to measure to find the threshold where multithreading becomes faster. That threshold depends on too many parameters to give a generic answer.
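Since the threshold has to be measured, a rough harness like the following may help. It is only a sketch under assumptions: square n x n matrices stored in a flat `std::vector<double>`, and hypothetical helpers `multiply_sequential`, `multiply_threaded`, `time_ms`, and `compare` that are not from any library. It times a single run of each variant with `std::chrono::steady_clock`; a serious benchmark would repeat runs, warm up first, and build with optimizations enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

using Mat = std::vector<double>;  // n x n matrix, row-major (hypothetical alias)

// Computes rows [begin, end) of c = a * b for n x n matrices.
void multiply_rows(const Mat& a, const Mat& b, Mat& c, std::size_t n,
                   std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        for (std::size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
}

Mat multiply_sequential(const Mat& a, const Mat& b, std::size_t n) {
    Mat c(n * n, 0.0);
    multiply_rows(a, b, c, n, 0, n);
    return c;
}

// One contiguous band of rows per thread, at most one thread per hardware thread.
Mat multiply_threaded(const Mat& a, const Mat& b, std::size_t n) {
    Mat c(n * n, 0.0);
    const std::size_t hw = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t n_threads = std::max<std::size_t>(1, std::min(hw, n));
    const std::size_t chunk = (n + n_threads - 1) / n_threads;
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        workers.emplace_back([&a, &b, &c, n, begin, end] {
            multiply_rows(a, b, c, n, begin, end);
        });
    }
    for (auto& w : workers) w.join();
    return c;
}

// Returns the wall-clock milliseconds taken by one call of f.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timing {
    std::size_t n;
    double sequential_ms;
    double threaded_ms;
};

// Times both variants once on the same n x n inputs.
Timing compare(std::size_t n) {
    const Mat a(n * n, 1.0), b(n * n, 2.0);
    Mat c_seq, c_thr;
    Timing t{n, 0.0, 0.0};
    t.sequential_ms = time_ms([&] { c_seq = multiply_sequential(a, b, n); });
    t.threaded_ms = time_ms([&] { c_thr = multiply_threaded(a, b, n); });
    assert(c_seq == c_thr);  // both paths must produce the same matrix
    return t;
}
```

Calling `compare(n)` for a range of sizes (say 4, 64, 512) and printing the two timings should show roughly where the threaded version starts to win on a given machine. The numbers will differ between CPUs, compilers, and optimization flags, which is why no figures are given here.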
Also, I don't understand what you mean by:
there's a thread for every multiplication
Do you want to create a separate thread for every multiplication of two scalars? That would create zillions of threads for large matrices, which would be terribly slow.
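To put a number on that, here is a small sketch. The function name `scalar_multiplications` is made up for illustration. It counts the scalar multiplications in the schoolbook algorithm for an (m x k) times (k x n) product, which is how many threads a literal thread-per-multiplication design would need.

```cpp
#include <cstdint>

// Each of the m * n output elements is a dot product of length k,
// so the schoolbook algorithm performs m * k * n scalar multiplications.
constexpr std::uint64_t scalar_multiplications(std::uint64_t m,
                                               std::uint64_t k,
                                               std::uint64_t n) {
    return m * k * n;
}

// The 3x2 by 2x3 case from the question: 18 multiplications.
static_assert(scalar_multiplications(3, 2, 3) == 18, "3x2 times 2x3");

// Two 10000x10000 matrices: 10^12 multiplications.
static_assert(scalar_multiplications(10000, 10000, 10000) == 1000000000000ULL,
              "10000x10000 times 10000x10000");
```

So even the tiny example would need 18 threads, and the large one would need a trillion, far more than any machine has hardware threads for.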