Multi-Threading vs Multi-Processing

Let's say you've got a big school project to do. You need to write a report, paint a picture, and build a model. Here's how you might tackle the project under different scenarios:

Multi-processing

This is like having multiple copies of yourself, each working independently on a different part of the project. One of you is writing the report, another one is painting the picture, and the third one is building the model. You can all work at the same time on different tasks. This makes things faster because instead of one person doing everything one after the other, you have multiple "yous" doing different tasks at the same time.

Multi-threading

Now, instead of having multiple copies of yourself, imagine you're a superhero who can move very quickly. You start writing the report, then zoom over to paint a bit of the picture, then zoom over to work on the model, then back to the report, and so forth. You're still one person, but you can switch tasks so quickly it almost seems like you're doing everything at once.

Technically,

In a multi-processing system, multiple processes run concurrently on different cores or CPUs. Each process has its own memory space and runs independently of the others. This is useful when tasks are CPU-intensive and can be run independently. The Operating System manages these processes, assigning them to different cores and handling the inter-process communication.

In a multi-threading system, multiple threads run concurrently within the same process. Each thread shares the same memory space and resources of the process. This is useful when tasks are I/O-intensive and can be run concurrently. The Operating System manages these threads, scheduling them to run on different cores and handling the inter-thread communication.

When to use Multi-threading vs Multi-processing?

Let's look into some examples where each technique might be best suited:

Multi-Processing

  1. Data Analysis and Scientific Computing: Multi-processing is often used in data analysis, machine learning, or scientific computing, where large amounts of data are processed. If the computations are independent and CPU-intensive, they can be distributed across multiple processes to utilize multiple cores and CPUs, which can lead to significant improvements in execution time. Threads are less beneficial here, at least in Python: CPython's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so threads cannot run these computations in parallel across cores.
  2. Concurrency in Microservices Architecture: In microservices architecture, where different services are decoupled and perform different tasks, multi-processing allows each service to run in a separate process, isolated from others. This approach enhances the reliability of the system, as a failure in one service (process) does not directly affect the others. Also, each microservice can be developed, deployed, and scaled independently.

Multi-Threading

  1. Web Servers: A web server needs to handle multiple incoming client requests at the same time. With multi-threading, each request can be handled by a separate thread. This is more efficient than multi-processing because threads share memory and can communicate with each other more easily than separate processes can. Creating a new thread for each request is also generally faster and uses fewer resources than creating a new process.
  2. Graphical User Interfaces (GUIs): In applications with a GUI, multi-threading is often used to keep the interface responsive. For example, one thread can handle user input, while another performs background tasks. Without this separation, the GUI could become unresponsive when performing a long-running task. Using multi-processing in this case would be overkill and would not provide any significant benefits.

Understanding whether a task is I/O-bound or CPU-bound

Understanding whether a task is I/O-bound or CPU-bound can help determine the best strategy to handle concurrency in your application. Here are some guidelines to identify each type of task:

I/O-Bound Tasks: These are tasks where the program spends most of its time waiting for Input/Output (I/O) operations to complete. This could include things like reading from or writing to disk files, making network requests (like API calls or database queries), or getting user input.

For example, if your program is downloading files from the internet, the time it takes to complete is largely dependent on your network speed and the server's response time, not how fast your CPU can process data. The CPU is just waiting most of the time, making this an I/O-bound task.
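You can see this waiting in action without any real network at all. The sketch below uses time.sleep as a stand-in for network latency (fake_download is a hypothetical helper, not a real downloader): three one-second "downloads" take about three seconds sequentially but about one second with threads, because the waits overlap:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(name: str) -> str:
    time.sleep(1)  # simulates waiting on the network; the CPU is idle here
    return name


start = time.perf_counter()
for n in ["a", "b", "c"]:
    fake_download(n)
sequential = time.perf_counter() - start  # roughly 3 s: each wait happens in turn

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    list(pool.map(fake_download, ["a", "b", "c"]))
threaded = time.perf_counter() - start  # roughly 1 s: the waits overlap

print(f"sequential: {sequential:.1f}s, threaded: {threaded:.1f}s")
```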

CPU-Bound Tasks: These are tasks where the program spends most of its time performing computations, and the speed at which they complete is largely dependent on the speed of the CPU.

For example, if your program is performing complex mathematical calculations, encoding/decoding video, or processing large amounts of data, the time it takes to complete these tasks is largely dependent on how fast your CPU can process these calculations. These tasks are keeping the CPU busy, making them CPU-bound tasks.

To optimize I/O-bound tasks, you might consider using multi-threading or asynchronous programming. This can allow your program to continue doing work while waiting for I/O operations to complete, rather than just waiting around.
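Asynchronous programming achieves the same overlap with a single thread. A minimal asyncio sketch, again using asyncio.sleep as a stand-in for a real network call (fetch is a hypothetical helper):

```python
import asyncio
import time


async def fetch(name: str) -> str:
    await asyncio.sleep(1)  # stands in for waiting on a real network call
    return name


async def main() -> list:
    # gather schedules all three coroutines at once, so the three one-second
    # waits overlap and the whole batch takes about one second, not three.
    return await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"in {elapsed:.1f}s")  # ['a', 'b', 'c'] in about 1 second
```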

For CPU-bound tasks, you might consider using multi-processing, as this can allow your program to take advantage of multiple CPUs or cores and perform computations in parallel.

Implementing Multi-threading in Python

In Python, threads can be implemented with the threading module or the higher-level concurrent.futures module. Now let's consider a function that downloads an image, which is clearly an I/O-bound task:

import requests


def download_img(img_url: str):
    """
    Download image from img_url into the current directory
    """
    res = requests.get(img_url, stream=True)
    filename = f"{img_url.split('/')[-1]}.jpg"

    with open(filename, 'wb') as f:
        for block in res.iter_content(1024):
            f.write(block)

Now, let’s try to download a few images from Unsplash using the code snippet below.

import requests


def download_img(img_url: str):
    res = requests.get(img_url, stream=True)
    filename = f"{img_url.split('/')[-1]}.jpg"

    with open(filename, 'wb') as f:
        for block in res.iter_content(1024):
            f.write(block)


if __name__ == '__main__':
    # a list of different image URLs
    images = [
        'https://images.unsplash.com/photo-1509718443690-d8e2fb3474b7',
        'https://images.unsplash.com/photo-1587620962725-abab7fe55159',
        'https://images.unsplash.com/photo-1493119508027-2b584f234d6c',
        'https://images.unsplash.com/photo-1482062364825-616fd23b8fc1',
        'https://images.unsplash.com/photo-1521185496955-15097b20c5fe',
        'https://images.unsplash.com/photo-1510915228340-29c85a43dcfe',
        'https://images.unsplash.com/photo-1509718443690-d8e2fb3474b7',
        'https://images.unsplash.com/photo-1587620962725-abab7fe55159',
        'https://images.unsplash.com/photo-1493119508027-2b584f234d6c',
        'https://images.unsplash.com/photo-1482062364825-616fd23b8fc1',
        'https://images.unsplash.com/photo-1521185496955-15097b20c5fe',
        'https://images.unsplash.com/photo-1510915228340-29c85a43dcfe',
    ]

    for img in images:
        download_img(img)

The above code downloads the images one by one, which can be quite slow, especially if there are a lot of images. We can use threading to download multiple images at once. In Python, the concurrent.futures module provides a high-level interface for asynchronously executing callables, which simplifies multi-threading.

import requests
import concurrent.futures


def download_img(img_url: str):
    res = requests.get(img_url, stream=True)
    filename = f"{img_url.split('/')[-1]}.jpg"

    with open(filename, 'wb') as f:
        for block in res.iter_content(1024):
            f.write(block)


if __name__ == '__main__':
    # a list of different image URLs
    images = [
        'https://images.unsplash.com/photo-1509718443690-d8e2fb3474b7',
        'https://images.unsplash.com/photo-1587620962725-abab7fe55159',
        'https://images.unsplash.com/photo-1493119508027-2b584f234d6c',
        'https://images.unsplash.com/photo-1482062364825-616fd23b8fc1',
        'https://images.unsplash.com/photo-1521185496955-15097b20c5fe',
        'https://images.unsplash.com/photo-1510915228340-29c85a43dcfe',
        'https://images.unsplash.com/photo-1509718443690-d8e2fb3474b7',
        'https://images.unsplash.com/photo-1587620962725-abab7fe55159',
        'https://images.unsplash.com/photo-1493119508027-2b584f234d6c',
        'https://images.unsplash.com/photo-1482062364825-616fd23b8fc1',
        'https://images.unsplash.com/photo-1521185496955-15097b20c5fe',
        'https://images.unsplash.com/photo-1510915228340-29c85a43dcfe',
    ]

    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(download_img, images)

In the revised code, we use concurrent.futures.ThreadPoolExecutor to create a pool of threads, and then we use the map method, which applies download_img to every URL in the list, distributing the calls across the pool's threads. The threads execute the downloads concurrently, which in this case means fetching multiple images at once. This should make the code noticeably faster when downloading a large number of images.
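One limitation of map is that a failing download raises when its result is consumed, and results arrive in input order. If you want each result as soon as it finishes, with per-task error handling, executor.submit together with as_completed is an alternative pattern. A small sketch with a stand-in function (work here is hypothetical, playing the role of download_img):

```python
import concurrent.futures


def work(x: int) -> int:
    return x * x  # a stand-in for download_img


items = [1, 2, 3, 4]
results = {}
with concurrent.futures.ThreadPoolExecutor() as executor:
    # submit returns a Future per task; as_completed yields each Future as it
    # finishes, so one slow task does not hold up the results of fast ones.
    futures = {executor.submit(work, x): x for x in items}
    for fut in concurrent.futures.as_completed(futures):
        item = futures[fut]
        try:
            results[item] = fut.result()
        except Exception as exc:
            print(f"{item} failed: {exc}")

print(results)  # {1: 1, 2: 4, 3: 9, 4: 16} (completion order may vary)
```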

Implementing Multi-processing in Python

Let's take the task of calculating the factorial of a large number. This is a CPU-bound task, as it involves lots of calculations. First, we will implement it in a normal (synchronous) way, and then we will optimize it using multi-processing.

import math


def calculate_factorial(n):
    return math.factorial(n)


if __name__ == "__main__":
    numbers = [10, 20, 30, 40, 50] * 100000  # a long list to keep the CPU busy

    for num in numbers:
        print(f"The factorial of {num} is {calculate_factorial(num)}")

Now, let's optimize the code using multi-processing.

import concurrent.futures
import math


def calculate_factorial(n):
    return n, math.factorial(n)


if __name__ == "__main__":
    numbers = [10, 20, 30, 40, 50] * 100000  # a long list to keep the CPU busy

    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, factorial in executor.map(calculate_factorial, numbers):
            print(f"The factorial of {number} is {factorial}")

Here, we used concurrent.futures.ProcessPoolExecutor() to create a pool of worker processes. The executor.map() function then applies the calculate_factorial function to every item in the numbers list, distributing the tasks among the worker processes. Each process will calculate the factorial of a number independently of the others, which allows for parallel execution and can provide a significant speedup for large lists and CPU-bound tasks.

Author: Sadman Kabir Soumik
