Programmer's Python Async - Threads
Written by Mike James   
Tuesday, 31 January 2023

If you have a set of CPU-bound Python threads then the GIL ensures that only one thread is running Python code at any given time and there is no potential speedup. If you want to speed up a program using multiple cores then you need to use processes rather than threads. However, all is not lost. A multi-threaded program can be faster than a single-threaded program if the threads are mostly I/O-bound. When a thread does any I/O it generally has to wait for the operation to complete and it releases the GIL, allowing another thread to start execution. This means that a set of I/O-bound threads will run faster even with the GIL.
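The effect on I/O-bound threads can be seen in a minimal sketch. It assumes that time.sleep is an acceptable stand-in for a blocking I/O call, and the names fake_io, DELAY and N are illustrative only:

```python
import threading
import time

def fake_io(delay):
    # time.sleep stands in for a blocking I/O call (an assumption of this
    # sketch); the GIL is released while the thread waits.
    time.sleep(delay)

DELAY = 0.2   # seconds each simulated I/O operation takes
N = 4         # number of operations

# Run the operations one after another in the main thread.
start = time.perf_counter()
for _ in range(N):
    fake_io(DELAY)
sequential_time = time.perf_counter() - start

# Run the same operations in N threads that wait concurrently.
threads = [threading.Thread(target=fake_io, args=(DELAY,)) for _ in range(N)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f} s, threaded: {threaded_time:.2f} s")
```

The threaded version should take roughly the time of a single wait rather than the sum of all of them, because the waits overlap.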

As the GIL is a lock on the Python interpreter it can also be freed if a thread calls C code to do something. That is, a thread usually releases the GIL if it isn’t running Python code. The “usually” is because the C code has to explicitly release the GIL and it can be difficult to work out which Python instructions actually free the GIL. For example, hashlib releases the GIL while it hashes a large block of data, so the hashing can make use of additional cores if available, whereas a short call such as math.sin(t) returns without ever releasing it.

A Python CPU-bound thread also gives up the GIL every so often to give other threads a chance to run. This aspect of the GIL was changed in Python 3.2 to make it work better. Originally a thread holding the GIL was allowed to execute a fixed number of Python byte codes. Now it runs for a maximum time before relinquishing the GIL and allowing another thread to run. The operating system decides which of the waiting threads gets to run.

You can find out what this time interval is using sys.getswitchinterval() and set it using sys.setswitchinterval(interval), where interval is a time in seconds.

Currently the default is 0.005 s, i.e. 5 ms. Given that switching threads is an expensive operation the value should be set high, but you can improve the response time of a program by lowering it. Notice that the system may not set the exact value you specify – it could be longer. Also the default of 5 ms is very long by comparison with the execution times of many threads and so it is often possible for a thread to run to completion without being interrupted by another thread.
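As a small sketch, the switch interval can be read, changed and restored as follows. The 1 ms value is an arbitrary choice for illustration, not a recommendation:

```python
import sys

# Read the current switch interval in seconds.
original = sys.getswitchinterval()
print("current switch interval:", original)

# Request a shorter interval of 1 ms (an arbitrary illustrative value).
sys.setswitchinterval(0.001)
changed = sys.getswitchinterval()
print("new switch interval:", changed)

# Put the previous value back so the rest of the program is unaffected.
sys.setswitchinterval(original)
restored = sys.getswitchinterval()
```

Restoring the original value matters because the setting applies to the whole interpreter, not just to the thread that changes it.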

To summarize:

  • Only one thread has the GIL and hence is running Python code at any given time, no matter how many cores the machine has.

  • A thread that starts to execute non-Python code, usually C code, should give up the GIL and allow another thread to run Python code.

  • A thread gives up the GIL and allows another thread to run if it starts an I/O or other operation that causes it to have to wait.

  • A thread also gives up the GIL after switchinterval seconds and allows another thread to acquire the GIL and run.
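The first point in the summary can be explored with a rough timing sketch. The function count_down and the loop size N are hypothetical, and the timings depend on the machine and the interpreter build, so no particular figures are claimed. On a standard build with the GIL you would not expect the threaded run to be noticeably faster:

```python
import threading
import time

def count_down(n):
    # A pure Python loop: the thread needs the GIL for every iteration.
    while n > 0:
        n -= 1

N = 2_000_000   # arbitrary amount of work for the sketch

# Do the work twice, one call after the other, in the main thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential_time = time.perf_counter() - start

# Do the same work in two threads that compete for the GIL.
t1 = threading.Thread(target=count_down, args=(N,))
t2 = threading.Thread(target=count_down, args=(N,))
start = time.perf_counter()
t1.start()
t2.start()
t1.join()
t2.join()
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f} s, threaded: {threaded_time:.2f} s")
```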

The GIL is a confusing factor when you are trying to reason about the behavior of a threaded program. Things don’t always work as you would expect from a consideration of the way the operating system handles threading. In this sense the GIL gets in the way of the OS scheduler and stops it from doing its job.

Threading Utilities

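The threading module provides some general purpose functions for finding out about threads:

  • threading.active_count() returns the number of active threads

  • threading.enumerate() returns a list of currently active threads

  • threading.current_thread() returns the Thread object of the current thread

  • threading.main_thread() returns a Thread object for the main thread

  • threading.get_ident() returns the ‘thread id’ of the current thread

  • threading.get_native_id() returns the native id of the current thread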
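A short sketch of these functions in use, called from inside a worker thread. The dictionary info and the thread name worker-1 are illustrative choices:

```python
import threading

info = {}   # filled in by the worker so the main thread can print it

def worker():
    me = threading.current_thread()
    info["name"] = me.name
    info["ident"] = threading.get_ident()          # Python-level thread id
    info["native_id"] = threading.get_native_id()  # id assigned by the OS
    info["active"] = threading.active_count()      # threads alive right now
    info["is_main"] = me is threading.main_thread()

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()

for key, value in info.items():
    print(key, "=", value)
print("threads alive now:", [th.name for th in threading.enumerate()])
```

Note that threading.get_native_id() needs Python 3.8 or later and a platform that supports it.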

Daemon Threads

Like processes, threads can be daemon or non-daemon. The default is daemon = False, i.e. a non-daemon thread. A non-daemon thread will keep the process alive until all non-daemon threads have ended. As remarked in the chapter on processes this is counter to the usual Linux/Unix daemon which runs in the background, independent of any other process or user interaction. Consider for example:

import threading
def myThread():
    while True:
        pass    # loop forever so that the thread never ends

If you run this you will see the native_id of each thread printed followed by ending, but the process will not end. If you check you will find that all three threads are still running. The idea of a daemon thread is subtle in that any non-daemon threads that are running will stop the main thread from exiting and hence it will not automatically stop any other threads from running, daemon or non-daemon. When all of the non-daemon threads end the main thread can end and this brings any daemon threads that are still running to an end.
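A fuller sketch of this kind of experiment follows. It assumes one daemon and one non-daemon thread alongside the main thread, and it is not the original listing. Unlike the endless loop described above, it uses a hypothetical stop flag so that it can terminate:

```python
import threading
import time

stop = threading.Event()   # hypothetical stop flag, not part of the original

def my_thread():
    # Stand-in for the endless loop: keep going until asked to stop.
    while not stop.is_set():
        time.sleep(0.01)

t0 = threading.main_thread()
t1 = threading.Thread(target=my_thread, daemon=True)
t2 = threading.Thread(target=my_thread, daemon=False)
t1.start()
t2.start()

for t in (t0, t1, t2):
    print(t.name, "daemon =", t.daemon, "native_id =", t.native_id)
print("ending")

alive_before = [t1.is_alive(), t2.is_alive()]

# Without this call the non-daemon thread t2 would keep the process alive
# after the main thread reaches the end of the program.
stop.set()
t1.join()
t2.join()
alive_after = [t1.is_alive(), t2.is_alive()]
```

If you remove the call to stop.set() and the joins, the program prints ending but the process keeps running because t2 is a non-daemon thread.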

Under Windows you need to download and install Process Explorer to view threads. Under Linux use ps with the -T option. Also you cannot always stop a thread using the keyboard break Ctrl-C. Use Process Explorer under Windows or the kill command under Linux. If you run this program under an IDE and debugger then you are likely to see a more complicated result than just three threads running as described.

Waiting for a Thread

All of the threads that a program creates run within a single process. You can arrange for one thread to wait for another to complete using the Thread object’s join(timeout=None) method.

This waits for the thread to finish or for the specified timeout. Notice that you have to use Thread.is_alive() to discover if a thread terminated or simply timed out. If you don’t specify a timeout the join waits until the thread terminates.
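A minimal sketch of join with and without a timeout. The function slow_task and the delays are illustrative:

```python
import threading
import time

def slow_task():
    time.sleep(0.5)   # pretend to do half a second of work

t = threading.Thread(target=slow_task)
t.start()

# Wait for at most 0.1 s. join gives no indication of why it returned.
t.join(timeout=0.1)
timed_out = t.is_alive()   # still alive, so the join must have timed out
print("still running after timeout:", timed_out)

# With no timeout, join blocks until the thread has finished.
t.join()
finished = not t.is_alive()
print("finished:", finished)
```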

Unlike a process there is no easy and consistent way that one thread can terminate another. That is, there is no thread equivalent of process.kill or process.terminate. The only easy way to stop a thread is via programmed cooperation. That is, the thread has to monitor a shared resource and stop itself when asked to do so by another thread. This is explained in the context of events in Chapter 6.
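As a sketch of what programmed cooperation can look like, the following uses a threading.Event as the shared resource. This is one possible arrangement and the names stop_requested and worker are hypothetical:

```python
import threading
import time

stop_requested = threading.Event()   # shared flag the worker checks
iterations = 0

def worker():
    global iterations
    # Do a small unit of work, then check whether a stop was requested.
    while not stop_requested.is_set():
        iterations += 1
        time.sleep(0.01)

t = threading.Thread(target=worker)
t.start()

time.sleep(0.1)         # let the worker run for a while
stop_requested.set()    # ask the worker to stop itself
t.join()                # wait until it has actually stopped
print("worker stopped after", iterations, "iterations")
```

The worker only stops at the points where it checks the flag, so a long operation between checks delays the response.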

If you want to wait for multiple threads to finish their tasks then you can simply use multiple joins:

print("all finished")

The print is only executed when all three threads are complete.
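A complete sketch of waiting for three threads with multiple joins. The function task, the names a, b and c and the delays are illustrative:

```python
import threading
import time

results = []

def task(name, delay):
    time.sleep(delay)       # simulate work of different lengths
    results.append(name)

t1 = threading.Thread(target=task, args=("a", 0.05))
t2 = threading.Thread(target=task, args=("b", 0.10))
t3 = threading.Thread(target=task, args=("c", 0.02))
t1.start()
t2.start()
t3.start()

# Each join returns once its own thread is done; after all three have
# returned, every thread has finished, whatever order they ended in.
t1.join()
t2.join()
t3.join()
print("all finished")
```

The order of the joins doesn't matter, since the total wait is governed by the slowest thread.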

There is no easy way to wait for the first of a set of threads to complete. This is a surprise for many programmers used to other languages. There are ways to do it, but there is no single method that will wait for the first thread of a set to complete. Notice that the simple-minded approach of repeatedly checking is_alive() for each thread isn’t a good method as it keeps the waiting thread busy, so tying up the CPU, while waiting for a thread to complete. This is particularly bad given that the GIL means that only one thread can be running at any given time.

So you should not use constructs like:

for t in threading.enumerate():
    if not t.is_alive():
        pass    # react to the finished thread here

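to repeatedly check for a completed thread unless you need to keep the waiting thread doing something. Note too that threading.enumerate() lists only threads that are still alive, so a polling loop of this kind would in practice have to iterate over your own collection of Thread objects. Possible solutions to this problem are given in Chapter 6 using a semaphore and in Chapter 11 using Futures.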
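One possible way to wait for the first thread without polling is sketched below using a semaphore. It is an illustration of the general idea and not necessarily the approach taken in Chapter 6. The names done, finished and task are hypothetical:

```python
import threading
import time

done = threading.Semaphore(0)   # starts at zero, so acquire() blocks
finished = []                   # names in the order the threads finish

def task(name, delay):
    time.sleep(delay)           # simulate work
    finished.append(name)
    done.release()              # signal that one more thread has finished

jobs = [("slow", 0.4), ("fast", 0.05), ("medium", 0.2)]
threads = [threading.Thread(target=task, args=job) for job in jobs]
for t in threads:
    t.start()

# Block, without using the CPU, until the first thread signals completion.
done.acquire()
first = finished[0]
print("first to finish:", first)

# Wait for the rest before the program ends.
for t in threads:
    t.join()
```

Each thread has to cooperate by releasing the semaphore when it ends, which is the price of not having a built-in wait-for-first method.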

Last Updated ( Tuesday, 31 January 2023 )