Multi-Threading vs Multi-Processing

Nov 24, 2019 · 5 min read · python programming ·

Let's first understand what is Threading:

Threading

Threading is a technique for writing computer programs that run concurrently, or at the same time, in order to maximize the use of the available processing power. In other words, threading allows you to run multiple tasks within a single process, with each task running on a separate thread.

Threading can be useful for a number of reasons. For example, it can help you make better use of your computer's resources by allowing multiple tasks to run concurrently. It can also make your program more responsive by allowing some tasks to run in the background while others continue to run in the foreground.

In most programming languages, including Python, implementing threading typically involves creating separate threads for each task that you want to run concurrently, and then using synchronization mechanisms such as locks or semaphores to ensure that the threads do not interfere with each other.

Multi-threading

Multi-threading can improve performance in Python by allowing you to execute multiple operations concurrently. This can be especially useful when working with I/O-bound or network-bound programs, where a large amount of time is spent waiting for external operations to complete.

Multi-processing

Multi-processing is a technique for improving the performance of a Python program by running multiple processes concurrently. This can be especially useful for CPU-bound or memory-bound programs, where a single process may not be able to fully utilize the available resources.

Multi-threading vs Multi-processing

The difference between multi-threading and multi-processing is that multi-threading allows you to run multiple threads in a single process, while multi-processing allows you to run multiple processes concurrently.

Multi-threading can be useful for running multiple tasks within a single process that can run concurrently. This can help improve the performance of your program, as multiple threads can run at the same time on a multi-core processor. However, multi-threading can also introduce some challenges, such as the need to manage shared resources and avoid race conditions.

On the other hand, multi-processing allows you to run multiple processes concurrently, each with its own separate memory space. This can be useful for running separate, independent tasks that don't need to share resources or communicate with each other. However, multi-processing can be more complex to set up and manage than multi-threading.

Implementing Threading in Python

Here is an example of how multi-threading can be used to improve the performance of a simple network program that retrieves data from a remote server:

 1import threading
 2import urllib.request
 3
 4# A list of URLs to fetch
 5urls = [
 6    "http://www.example.com/data1.json",
 7    "http://www.example.com/data2.json",
 8    "http://www.example.com/data3.json",
 9    # ...
10]
11
12# This function retrieves data from a URL and prints it to the console
13def fetch_url(url):
14    response = urllib.request.urlopen(url)
15    data = response.read()
16    print(data)
17
18# Create a new thread for each URL
19threads = [threading.Thread(target=fetch_url, args=(url,)) for url in urls]
20
21# Start all threads
22for thread in threads:
23    thread.start()
24
25# Wait for all threads to complete
26for thread in threads:
27    thread.join()

We have a list of URLs that we want to fetch data from. Instead of fetching the data from each URL one at a time, we use the threading module to create a new thread for each URL. We then start all of the threads simultaneously, which allows us to retrieve the data from all of the URLs concurrently. This can improve the performance of the program by allowing it to take advantage of idle time and potentially make better use of available network bandwidth.

Of course, the actual performance improvement will depend on the specific details of the program and the network environment. In some cases, multi-threading may not provide any benefit or may even decrease performance. It is always important to carefully evaluate the potential benefits and drawbacks of using multi-threading in your programs.

Implementing Multi-processing in Python

Here is an example of how multi-processing can be used to improve the performance of a simple program that calculates the sum of a large list of numbers:

 1import multiprocessing
 2
 3# A large list of numbers
 4numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
 5           11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
 6           # ...
 7]
 8
 9# This function calculates the sum of a portion of the numbers list
10def sum_numbers(start, end):
11    total = 0
12    for i in range(start, end):
13        total += numbers[i]
14    return total
15
16# Divide the numbers list into 10 equal-sized chunks
17chunk_size = len(numbers) // 10
18chunks = [(i * chunk_size, (i + 1) * chunk_size) for i in range(10)]
19
20# Create a new process for each chunk of numbers
21processes = [multiprocessing.Process(target=sum_numbers, args=chunk) for chunk in chunks]
22
23# Start all processes
24for process in processes:
25    process.start()
26
27# Wait for all processes to complete
28for process in processes:
29    process.join()
30
31# Calculate the final sum by adding together the results from each process
32total = sum(process.result for process in processes)
33print(total)

We have a large list of numbers that we want to sum. Instead of calculating the sum of the entire list in a single process, we use the multiprocessing module to create a new process for each chunk of the numbers list. We then start all of the processes simultaneously, which allows us to calculate the sum of each chunk concurrently. This can improve the performance of the program by allowing it to take advantage of multiple CPU cores and potentially reduce the time needed to calculate the final sum.

Author: Sadman Kabir Soumik