
How should I spawn threads for parallel computation?

Today, I got into multi-threading. Since it's a new concept, I thought I could begin to learn by translating a simple iteration to a parallelized one. But, I think I got stuck before I even began.

Initially, my loop looked something like this:

let stuff: Vec<u8> = items.into_iter().map(|item| {
    some_item_worker(&item)
}).collect();

I had put a reasonably large amount of stuff into items and it took about 0.05 seconds to finish the computation. So, I was really excited to see the time reduction once I successfully implemented multi-threading!

When I used threads, I got into trouble, probably due to my bad reasoning.

use std::thread;

let threads: Vec<_> = items.into_iter().map(|item| {
    thread::spawn(move || {
        some_item_worker(&item)
    })
}).collect(); // yeah, this is followed by another iter() that unwraps the values
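The follow-up step mentioned in that comment could look roughly like the sketch below. It assumes `some_item_worker` returns a `u8` and that the items are `u32` values; both are stand-ins, since the question does not show the real types. `run_threaded` is a hypothetical helper name.

```rust
use std::thread;

// Hypothetical stand-in for the question's `some_item_worker`;
// the real function and its item type are not shown in the question.
fn some_item_worker(item: &u32) -> u8 {
    (item % 256) as u8
}

// Spawns one thread per item, then joins the handles in order to collect the results.
fn run_threaded(items: Vec<u32>) -> Vec<u8> {
    // Collect the handles first so that every thread is spawned before any join.
    let handles: Vec<_> = items
        .into_iter()
        .map(|item| thread::spawn(move || some_item_worker(&item)))
        .collect();

    // `join` blocks until the thread finishes and returns its result.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    let stuff = run_threaded(vec![1, 2, 300]);
    println!("{stuff:?}");
}
```

The intermediate `collect()` matters: iterator adapters are lazy, so chaining the spawn and the join in a single pipeline would spawn and wait for one thread at a time.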

I have a quad-core CPU, which means that I can run only up to 4 threads concurrently. I guessed that it worked this way: once the iterator starts, threads are spawned. Whenever a thread ends, another thread begins, so that at any given time, 4 threads run concurrently.

The result was that it took (after some re-runs) ~0.2 seconds to finish the same computation. Clearly, there's no parallel computing going on here. I don't know why the time roughly quadrupled, but I'm sure that I've misunderstood something.

Since this isn't the right way, how should I go about modifying the program so that the threads execute concurrently?

EDIT:

I'm sorry, I was wrong about that ~0.2 seconds. I woke up and tried it again, when I noticed that the usual iteration ran for 2 seconds. It turned out that some process had been consuming the memory wildly. When I rebooted my system and tried the threaded iteration again, it ran for about 0.07 seconds. Here are some timings for each run.

Actual iteration (first one):

0.0553760528564 seconds
0.0539519786835 seconds
0.0564560890198 seconds

Threaded one:

0.0734670162201 seconds
0.0727820396423 seconds
0.0719120502472 seconds

I agree that the threads are indeed running concurrently, but the threaded version seems to take about 20 ms longer to finish the job. My actual goal was to use all of my processor's cores to run the threads in parallel and finish the job sooner. Is this gonna be complicated? What should I do to make those threads run in parallel, not just concurrently?
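The question does not say how these timings were taken. One way to measure both variants inside the program itself is sketched below, using `std::time::Instant`; the `timed` helper is a hypothetical name, and the summation in `main` is only a placeholder workload.

```rust
use std::time::{Duration, Instant};

// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

fn main() {
    // Placeholder workload; swap in the sequential or the threaded loop here.
    let (sum, elapsed) = timed(|| (0..1_000_000u64).sum::<u64>());
    println!("sum = {sum}, took {elapsed:?}");
}
```

Wrapping each variant in `timed` and running it several times in a release build gives numbers that are easier to compare than a single run.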


1 Answer

I have a quad-core CPU, which means that I can run only up to 4 threads concurrently.

Only 4 may be running concurrently, but you can certainly create more than 4...
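Rather than hard-coding 4, the standard library can report an estimate of how many threads can usefully run at once. A minimal sketch, where `worker_count` is a hypothetical helper and the fallback of 1 is an assumption for platforms where the query fails:

```rust
use std::thread;

// Returns the standard library's parallelism estimate, falling back to 1
// if the platform cannot report it.
fn worker_count() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

fn main() {
    println!("suggested worker threads: {}", worker_count());
}
```

The reported value is an estimate and may differ from the number of physical cores.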

whenever a thread ends, another thread begins, so that at any given time, 4 threads run concurrently (it was just a guess).

Whenever you have a guess, you should create an experiment to figure out if your guess is correct. Here's one:

use std::{thread, time::Duration};

fn main() {
    let threads: Vec<_> = (0..500)
        .map(|i| {
            thread::spawn(move || {
                println!("Thread #{i} started!");
                thread::sleep(Duration::from_millis(500));
                println!("Thread #{i} finished!");
            })
        })
        .collect();

    for handle in threads {
        handle.join().unwrap();
    }
}

If you run this, you will see that "Thread #N started!" is printed out 500 times, followed by 500 "Thread #N finished!" lines.

Clearly, there's no parallel computing going on here

Unfortunately, your question isn't fleshed out enough for us to tell why your time went up. In the example I've provided, it takes a little less than 600 ms, so it's clearly not happening in serial!
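If per-thread overhead turns out to matter for small work items, one common alternative to a thread per item is to split the input into about as many chunks as there are cores and give each chunk to one thread. The sketch below is one way to do that with `std::thread::scope`, not a diagnosis of the asker's program. `some_item_worker` is a stand-in (the real one is not shown), the `u32` item type is assumed, and `run_chunked` is a hypothetical name.

```rust
use std::thread;

// Hypothetical stand-in for the question's `some_item_worker`.
fn some_item_worker(item: &u32) -> u8 {
    (item % 256) as u8
}

// Processes `items` on roughly one thread per available core, preserving input order.
fn run_chunked(items: &[u32]) -> Vec<u8> {
    if items.is_empty() {
        return Vec::new(); // `chunks(0)` would panic, so handle empty input first.
    }
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Ceiling division, so at most `workers` chunks are produced.
    let chunk_size = (items.len() + workers - 1) / workers;

    // Scoped threads may borrow `items`; they are all joined before `scope` returns.
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(some_item_worker).collect::<Vec<u8>>())
            })
            .collect();

        // Join in spawn order so the output lines up with the input.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let stuff = run_chunked(&[1, 2, 300, 4, 5]);
    println!("{stuff:?}");
}
```

Whether this beats the plain sequential loop depends on how expensive each call to the worker is; for very cheap work, the cost of starting and joining threads can still dominate.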

Shepmaster answered Nov 07 '25 01:11
