Multiprocessing Basics
The multiprocessing module allows you to spawn processes; because each process gets its own memory space and its own interpreter, CPU-bound work is not serialized by the Global Interpreter Lock (GIL).
import multiprocessing
import time

# Simple process function
def worker(num):
    print(f"Worker {num} started")
    time.sleep(2)
    print(f"Worker {num} finished")

# Create and start processes
if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        processes.append(p)
        p.start()

    # Wait for all processes to complete
    for p in processes:
        p.join()
    print("All workers completed")
Each process runs in its own Python interpreter with its own GIL, allowing true parallel execution on multi-core systems.
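You can verify this yourself by printing process IDs from the workers; each one reports a PID distinct from the parent's. A minimal sketch (the `report` function is illustrative, not part of the example above):

```python
import os
from multiprocessing import Process

def report():
    # Each worker runs in its own process, so its PID differs from the parent's
    print(f"parent={os.getppid()} worker={os.getpid()}")

if __name__ == '__main__':
    procs = [Process(target=report) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```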
Process Communication
Processes can communicate using Queues and Pipes, which are specially designed for inter-process communication (IPC).
from multiprocessing import Process, Queue

# Function to square numbers and put results in a queue
def square_numbers(numbers, queue):
    for num in numbers:
        queue.put(num * num)

# Function to cube numbers and put results in a queue
def cube_numbers(numbers, queue):
    for num in numbers:
        queue.put(num * num * num)

# Main process
if __name__ == '__main__':
    numbers = range(1, 6)
    queue = Queue()

    # Create processes
    p1 = Process(target=square_numbers, args=(numbers, queue))
    p2 = Process(target=cube_numbers, args=(numbers, queue))

    # Start processes
    p1.start()
    p2.start()

    # Wait for processes to finish
    p1.join()
    p2.join()

    # Get results from queue (interleaving depends on process scheduling)
    results = []
    while not queue.empty():
        results.append(queue.get())
    print("Results:", results)  # e.g. [1, 4, 9, 16, 25, 1, 8, 27, 64, 125] -- order may vary
Queues are thread and process safe, making them ideal for passing messages between processes.
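Pipes, mentioned above, are the lighter-weight alternative when exactly two processes need to talk: `Pipe()` returns a pair of connected endpoints. A minimal sketch, using `None` as an end-of-stream sentinel (a convention, not part of the API):

```python
from multiprocessing import Process, Pipe

def sender(conn):
    # Send a few messages, then a sentinel, then close our end
    for msg in ["hello", "from", "child"]:
        conn.send(msg)
    conn.send(None)  # sentinel: no more data
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=sender, args=(child_conn,))
    p.start()
    # Receive until the sentinel arrives
    while (msg := parent_conn.recv()) is not None:
        print("received:", msg)
    p.join()
```

Unlike a Queue, a plain Pipe is not safe for multiple simultaneous writers to the same endpoint; use a Queue when more than two processes are involved.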
Shared Memory
Multiprocessing provides shared memory objects (Value and Array) for data that needs to be accessed by multiple processes.
from multiprocessing import Process, Value, Array

# Function to modify shared data
def worker(shared_num, shared_arr):
    # += is a read-modify-write, so unsynchronized updates can race
    # and lose increments; use the object's built-in lock
    with shared_num.get_lock():
        shared_num.value += 1
    with shared_arr.get_lock():
        for i in range(len(shared_arr)):
            shared_arr[i] *= 2

# Main process
if __name__ == '__main__':
    # Create shared objects
    num = Value('i', 0)                # 'i' for integer
    arr = Array('d', [1.0, 2.0, 3.0])  # 'd' for double

    # Create and start processes
    processes = []
    for _ in range(4):
        p = Process(target=worker, args=(num, arr))
        processes.append(p)
        p.start()

    # Wait for all processes to complete
    for p in processes:
        p.join()

    print("Shared number:", num.value)  # 4 (each process increments by 1)
    print("Shared array:", arr[:])      # [16.0, 32.0, 48.0] (each process doubles the values)
For more complex synchronization, use Locks to prevent race conditions when modifying shared state.
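As a sketch of why this matters, here is a shared counter guarded by an explicit Lock. The `increment` helper and the iteration count are illustrative; `lock=False` disables the Value's built-in lock so the explicit one does all the work:

```python
from multiprocessing import Process, Value, Lock

def increment(counter, lock, n):
    for _ in range(n):
        with lock:  # only one process may read-modify-write at a time
            counter.value += 1

if __name__ == '__main__':
    # Raw shared value with no built-in lock; we synchronize explicitly
    counter = Value('i', 0, lock=False)
    lock = Lock()
    procs = [Process(target=increment, args=(counter, lock, 1000))
             for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("Counter:", counter.value)  # 4000 -- with the lock, no increments are lost
```

Without the `with lock:` line, some of the 4000 increments would typically be lost to interleaved read-modify-write cycles.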
Process Pools
The Pool class provides a convenient way to parallelize execution of a function across multiple input values.
from multiprocessing import Pool
import time

# Function to be parallelized
def square(x):
    time.sleep(0.5)  # Simulate work
    return x * x

# Main process
if __name__ == '__main__':
    # Create pool with 4 worker processes
    with Pool(4) as pool:
        # Map the function to data (parallel execution)
        results = pool.map(square, range(10))
        print("Squares:", results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

        # Async version with callback
        result = pool.apply_async(square, (10,), callback=print)
        result.wait()  # Wait for the async task to complete
Pools manage worker processes for you and provide convenient methods like map(), apply(), and starmap().
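Of these, starmap() is the one not shown above: it is the multi-argument cousin of map(), unpacking each tuple in the iterable into positional arguments. A small sketch (the `power` function is illustrative):

```python
from multiprocessing import Pool

def power(base, exp):
    return base ** exp

if __name__ == '__main__':
    with Pool(2) as pool:
        # Each tuple is unpacked into power(base, exp)
        results = pool.starmap(power, [(2, 3), (3, 2), (5, 0)])
    print(results)  # [8, 9, 1]
```

With plain map() you would have to bundle the arguments into a single parameter yourself; starmap() does the unpacking for you.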