
Multiprocessing or Multithreading?

I'm making a program for running simulations in Python, with a wxPython interface. In the program, you can create a simulation, and the program renders (i.e. calculates) it for you. Rendering can sometimes be very time-consuming.

When the user starts a simulation and defines an initial state, I want the program to render the simulation continuously in the background, while the user may be doing different things in the program. Sort of like a YouTube-style bar that fills up: you can play the simulation only up to the point that has been rendered.

Should I use multiple processes or multiple threads, or something else? People told me to use the multiprocessing package. I checked it out and it looks good, but I also heard that processes, unlike threads, can't share a lot of information (and I think my program will need to share a lot of information). Additionally, I've heard about Stackless Python: is that a separate option? I have no idea.

Please advise.


2 Answers

"I checked it out and it looks good, but I also heard that processes, unlike threads, can't share a lot of information..."

This is only partially true.

Threads are part of a process -- threads share memory trivially. That is as much a problem as a help: two threads with casual disregard for each other can overwrite each other's memory and create serious problems.
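
To make that risk concrete, here is a minimal sketch (plain threading, nothing specific to your program) where two threads update shared memory without any coordination:

    import threading

    counter = 0

    def bump():
        global counter
        for _ in range(100000):
            counter += 1      # read-modify-write -- not atomic

    threads = [threading.Thread(target=bump) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)            # may come out less than 200000 -- updates were lost

Wrapping the increment in a threading.Lock() makes it correct -- but you have to remember to do that everywhere the memory is shared.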

Processes, however, share information through a lot of mechanisms. A POSIX pipeline (a | b) means that process a and process b share information -- a writes it and b reads it. This works out really well for a lot of things.
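
The same idea is available inside Python itself. Here is a minimal sketch using multiprocessing.Pipe; the renderer function and its messages are just illustrative stand-ins:

    from multiprocessing import Process, Pipe

    def renderer(conn):
        for frame in range(3):
            conn.send(('frame', frame))   # "a writes it"
        conn.close()

    if __name__ == '__main__':
        parent_end, child_end = Pipe()
        p = Process(target=renderer, args=(child_end,))
        p.start()
        child_end.close()                 # parent keeps only its own end
        while True:
            try:
                print(parent_end.recv())  # "b reads it"
            except EOFError:              # raised once the child closes its end
                break
        p.join()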

The operating system will assign your processes to every available core as quickly as you create them. This works out really well for a lot of things.

Stackless Python is unrelated to this discussion -- it's faster and has different thread scheduling. But I don't think threads are the best route for this.

"I think my program will need to share a lot of information."

You should resolve this first. Then, determine how to structure processes around the flow of information. A "pipeline" is very easy and natural to do; any shell will create the pipeline trivially.
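
For your simulator, that pipeline might be as simple as one rendering process feeding a queue that the GUI drains. A hedged sketch, where simulate and the integer state are placeholders for your real rendering step:

    from multiprocessing import Process, Queue

    def simulate(frames):
        state = 0                 # placeholder for your real initial state
        for _ in range(5):
            state += 1            # placeholder for one expensive render step
            frames.put(state)
        frames.put(None)          # sentinel: rendering is finished

    if __name__ == '__main__':
        frames = Queue()
        Process(target=simulate, args=(frames,)).start()
        # The GUI process stays responsive; it only drains finished frames,
        # which is exactly the YouTube-style "rendered up to here" bar.
        for frame in iter(frames.get, None):
            print('frame ready:', frame)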

A "server" is another architecture where multiple client processes get and/or put information into a central server. This is a great way to share information. You can use the WSGI reference implementation as a way to build a simple, reliable server.

-- S.Lott


  • Stackless: uses one CPU. "Tasklets" must yield voluntarily. The preemption option doesn't work all the time.
  • Threaded: uses one CPU. Native threads share time somewhat randomly, after running 20-100 Python opcodes.
  • Multiprocessing: uses multiple CPUs (see the sketch below).
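
A small benchmark sketch of that CPU point -- the same pure-Python work run on four threads versus four processes (exact timings will vary; on a multi-core machine the process version should finish several times faster):

    import multiprocessing
    import threading
    import time

    def burn():
        # Pure-Python CPU work: it holds the GIL, so threads cannot overlap it.
        sum(i * i for i in range(10 ** 7))

    if __name__ == '__main__':
        for kind in (threading.Thread, multiprocessing.Process):
            start = time.time()
            workers = [kind(target=burn) for _ in range(4)]
            for w in workers:
                w.start()
            for w in workers:
                w.join()
            print(kind.__name__, round(time.time() - start, 2), 'seconds')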

Update

In-depth Analysis

Use threaded for an easy time. However, if you call C routines that take a long time before returning, threading may not be an option if your C routine does not release the GIL.
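
For the background-rendering use case, "threaded for an easy time" might look like this sketch -- render_loop and the sleep are stand-ins for your real rendering step, and the GUI thread would poll the queue (e.g. from a wx timer) instead of blocking:

    import queue
    import threading
    import time

    frames = queue.Queue()

    def render_loop():
        frame = 0
        while True:
            time.sleep(0.1)       # stand-in for one rendering step
            frame += 1
            frames.put(frame)

    threading.Thread(target=render_loop, daemon=True).start()

    # The GUI thread checks for finished frames without blocking:
    try:
        print('rendered up to frame', frames.get_nowait())
    except queue.Empty:
        pass                      # nothing new yet; try again on the next tick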

Use multiprocessing if your program is heavily limited by CPU power and you need maximum responsiveness.

Don't use Stackless: I have had it segfault before, and threads are pretty much equivalent unless you are using hundreds of them or more.

-- Unknown