Multithreading in Python | What is Thread
In Python, or any other programming languages, multithreading refers to the program’s ability to execute multiple threads concurrently or simultaneously.
The process of executing multiple threads (or tasks) simultaneously is called multithreading in Python. In other words, the ability of a CPU (processor) to execute multiple threads concurrently is called multithreading.
Let’s understand it with the help of a real time example. A super busy mom does multiple tasks simultaneously, such as cooking food, taking care of the baby, and attending phone calls. She is doing all the tasks simultaneously, as shown in the below picture to understand clearly.
Red line in the picture shows a female is cooking food. Meanwhile, her baby is crying, so she is taking care of her as well, which is shown by the green color line. At the same time, she is also attending phone calls, which is shown by the blue color. All tasks are running in an active state in parallel.
In this scenario, the busy mom represents the CPU (processor), and the tasks she is performing concurrently (cooking food, taking care of the baby, and attending phone calls) represent individual threads within a single process. We can consider “mom is cooking” as one thread.
Similarly, “taking care of baby” is the second thread and “attending phone call” is the third thread. Each task (thread) runs in parallel, demonstrating the ability to handle multiple operations simultaneously. Thus, mom is doing multithreading (i.e. multitasking).
Likewise, multi-threading is a programming technique which allows a program (or CPU) to execute multiple threads of a single process at the same time. Each thread runs independently while sharing their resources (such as memory) with other threads within the same process.
Multi-threading is used in many application programs to improve the performance and scalability by running multiple tasks simultaneously. For example, a web server uses more than one thread to handle multiple request at the same time. It improves the performance of web server as it can process multiple requests at once.
We can use the multithreading approach to process multiple streams of data concurrently, run multiple tasks simultaneously, or execute multiple segments of code concurrently. We can achieve multithreading in Python using a built-in threading module in the standard library, which offers the basic functions to create and run threads.
What is a Thread in Python?
In Python, or any other programming languages, a thread simply refers to the basic unit of execution within a computer program. A computer program is a process that has at least one thread, known as main thread, to execute instructions of that process.
A thread is a lightweight process that has a separate flow of execution. This means that we can create several threads running in parallel within the same program.
Each thread can perform different tasks simultaneously and can share CPU and memory with other threads within the same program. Threads are a way to achieve multitasking within the same process and can significantly speed up our program execution.
[adinserter block=”5″]
Single-Threaded Program and Multi-threaded Program
When a program contains a single flow of execution, it is called single-threaded program. In a single thread program, instructions are executed one after the other sequentially. It means that a single-threaded program executes one task at a time.
We can also create more than one execution thread in a program to perform multiple tasks simultaneously. When a program creates more than one thread, with each thread having its own path of execution, the program is referred to as a multi-threaded program.
In such a program, threads run concurrently and perform multiple tasks at the same time, thus improving its performance and efficiency. Each thread within the program runs in parallel with other threads, meaning that all threads are working at the same time on different tasks.
In multi-threaded program, threads can be independent or they can share resources, such as the same memory address, data, and instructions in the program. Threads can also communicate with each other and share resources using synchronization techniques, such as monitors, mutexes, and semaphores.
One of the main advantage of threads in programming is that they can significantly improve the performance and efficiency of a program by allowing multiple tasks to be executed at the same time. This is especially beneficial in applications that need to perform several operations independently of each other, such as in the case of an image processing program. In such a program, threads can be used to execute different tasks, such as loading images, processing images, and generating previews, at the same time.
Limitations and Challenges
It is also important to note that while threads offer many advantages, they also introduce some limitations and challenges in the programming. One of the main challenges is the complexity in managing shared resources and synchronization to prevent issues, such as race conditions and deadlocks. In addition to it, the number of processing cores available in the CPU processor can limit threads.
[adinserter block=”2″]
What is a Process?
A process is a very common term in the computer programming. The execution of any program in your system constitutes a process. The operating system creates and handles it independently. When we double-clicks on a program or an application, such as a word document, browser, etc. in the system, the operating system creates a process.
A process can have multiple threads. Every individual process has its own separate memory address space, stack data, and other auxiliary data that allows it to track its own execution.
The operating system plays a critical role in managing the execution of processes on a computer system. It is responsible for scheduling processes, allocating resources (such as CPU time, memory, and I/O devices) so that multiple programs can run simultaneously without interfering with each other. Each process communicates through the operating system, files, and network. Look at the below figure to understand clearly.
As we can see in the above figure, threads live in the process and share the same memory address, which is not the case with processes. There can be multiple processes inside the operating system. Python also supports multiprocessing that allows us to run multiple unrelated processes simultaneously. These processes do not share their resources and communicate through inter-process communication (IPC) methods.
Key Difference between Process and Thread in Python
Here, we have summarized the basic key differences between process and thread in Python. They are as:
1) Independency:
- Processes are considered independent units of execution by the operating system. Each process runs in its own separate memory space, and the operating system manages these processes independently.
- Threads within the same process are considered dependent on each other due to their shared memory space and resources.
2) Memory sharing:
- Processes have different (separate) memory space.
- Threads within the same process can share the memory space.
3) Data and Code sharing:
- Processes have independent data and code segments in their memory space. Each process runs in its own separate memory space allocated by the operating system.
- On the other hand, threads within the same process share the same data and code segment, files, etc.
4) Overhead:
- Processes generally take longer to create, terminate, and manage compared to threads. This is because processes are heavier-weight entities in the operating system.
- Threads, on the other hand, take less time to create, terminate, and manage compared to processes. This is because threads are lighter-weight entities that exist within the same process.
Characteristics of Multithreading in Python
Here, we have listed some important characteristics of multithreading in Python that you should understand them.
(1) Multithreaded programs can run faster on the computer system, having multiple CPUs (or multiple cores within a single CPU), because these threads execute truly in parallel.
In multiple CPUs, each thread can potentially be assigned to a separate processor or core, allowing them to run simultaneously. This parallel execution can significantly reduce the overall execution time of the program compared to running on a single CPU or core.
(2) A program can remain responsive to input, whether it is running on a system with a single CPU or multiple CPUs.
(3) In a multithreaded process, threads can share the memory allocated for global variables, which means that any modification made to a global variable by one thread is visible and affects all other threads within the same process. A thread can also have local variables.
(4) In a multithreaded program, multiple threads within the same process share the same data space, including the heap and global variables, with the main thread. This facilitates easier and more efficient information sharing and communication among threads compared to separate processes.
(5) Threads are often termed as “lightweight processes” precisely because they do not require much memory overhead and are cheaper than processes.
Thread Control Block (TCB) in Python
All the related information of each thread is stored as a data structure inside the operating system kernel, and this data structure is called thread control block (TCB). The TCB contains the following main components that are as:
Program counter (PC): This component is used to track the execution flow of the program. The program counter stores the address of the instruction currently being executed by a thread.
System registers (REG): These registers are temporary storage areas that hold the current working variables and other critical data needed by the thread during execution.
Stack: The stack is a data structure that stores a record of the execution history or the call history of the thread.
The below figure shows the anatomy of a thread, which has three threads. Each thread has its program counter (PC), a stack, and REG, but can share code segments and other resources, such as data, files, with other threads.
In addition to these components, the TCB also includes a thread identifier, thread state, and a parent process pointer.
The thread Identifier is a unique identifier assigned to every new thread, allowing the operating system and the application to distinguish between different threads within the same process or across different processes.
The thread state signifies the current status of a thread within its lifecycle, such as running, ready, waiting, start, or stopped.
A parent process pointer points to the process control block (PCB) of the parent process in which the thread resides.
Advantages of Multithreading in Python
There are the following advantages of using multithreading in Python. They are as follows:
- Helps to simplify the code structure for applications that perform multiple tasks concurrently. With multithreading, we can assign specific tasks to different threads. This can make the code easier to read, develop, and maintain.
- Allows a program to better utilize system resources.
- Enables both parallel and concurrent programming.
- Increases the performance
- Reduces the response time
In this tutorial, we have elaborated the definition of multithreading in Python with the help of a realtime example. Hope that you will have understood the basic points of multithreading, and differences between process and thread.
Thanks for reading!!!