|Goodbye GIL - But Will It Make Python Faster?|
|Written by Mike James|
|Wednesday, 02 August 2023|
The obvious answer is - it all depends on what you mean by "fast". It will make some things better and inevitably make some things worse. But after agonizing for a long time, the fate of the GIL is sealed.
First a word about what the Global Interpreter Lock (GIL) is. It's a lock that prevents the Python interpreter being run by more than one thread at a time. This means that if you are writing a multithreaded program in Python only one of the threads can be running Python at any given time. In the days of single-core processors this wasn't a big deal but today, with multicore processors capable of running more than one thread at the same time - it probably still isn't a big deal...
This isn't a popular view, but the fact of the matter is that the vast majority of Python programs aren't going to be sped up by removing the GIL. You can argue that it's not the run of the mill program that is the target for this change, but the big Python extensions and applications like NumPy. If NumPy could run concurrently then a lot of number crunching programs would go faster and getting rid of the GIL would be worthwhile. The flaw in this argument is that most extensions that need to get rid of the GIL have already found ways round it. There isn't much stopping a C extension from releasing the GIL and going multithreaded and this is what NumPy has done.
If you are sticking to pure Python, then the GIL does have some impact on you - but only if you want to write a program using Python threads. If you write a program to compute Pi, for example, then using two threads to run it won't increase the speed at all. If, however, you split the program into two processes then, on a multicore machine, it can run close to twice as fast. Using Python processes means you can already circumvent the GIL at the cost of throwing a little more memory at the problem. Then there is asyncio to consider. The whole idea of asyncio is that for I/O you don't really need true concurrency and a single thread will do.
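To make the threads versus processes point concrete, here is a minimal sketch that computes Pi from the Leibniz series, split into two chunks and run first on a thread pool and then on a process pool. The function names leibniz_partial and compute_pi are made up for this illustration, and the timings you see will depend on your machine and on whether your CPython build has the GIL.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def leibniz_partial(start, stop):
    """Sum the Leibniz series terms 4 * (-1)^k / (2k + 1) for k in [start, stop)."""
    total = 0.0
    for k in range(start, stop):
        total += (-1.0 if k % 2 else 1.0) / (2 * k + 1)
    return 4.0 * total


def compute_pi(executor_cls, terms=2_000_000, workers=2):
    """Split the series into equal chunks and sum them using the given executor."""
    chunk = terms // workers
    starts = [i * chunk for i in range(workers)]
    stops = [s + chunk for s in starts]
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(leibniz_partial, starts, stops))


if __name__ == "__main__":
    # Same work, same chunking; only the kind of worker changes.
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        t0 = time.perf_counter()
        pi = compute_pi(cls)
        elapsed = time.perf_counter() - t0
        print(f"{cls.__name__}: pi ~ {pi:.6f} in {elapsed:.2f}s")
```

On a standard GIL build you would expect the thread pool to take about as long as a single-threaded loop, while the process pool should be noticeably quicker once the cost of starting the worker processes is paid.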
Putting all this together it is difficult to see who benefits from dropping the GIL, but I admit some people will, just not a lot of them. And before you argue, notice that there are no real stats on the issue.
The advantage of keeping the GIL is that it allows easy porting of C programs to Python as C extensions. This is often claimed to be one of the big reasons that Python became popular - although I'm not at all sure about this urban legend either. Over time C extensions have found their own ways of dealing with the GIL or have just accepted it.
But for a strange set of reasons, mainly because it fits in with the "Make Python Faster" bandwagon, Pythonistas seem fixated on removing the GIL from CPython - the reference implementation of Python. Notice that the GIL is not part of the Python language, only its implementation, and there are other implementations that don't use it.
After much deep consideration, a vote of core devs was taken and the results were conclusively for the GILectomy:
You might think that this is a small poll, but remember these are the people you might expect to have valid opinions on the subject. I'm not so sure - core devs are enthusiastic about the implementation but not so much on its use. So it's a bit like asking a hardware guy if the machine needs an upgrade - the answer is usually yes even if the software is working just fine.
After the result the Steering Council issued the following:
Thank you, everyone, for responding to the poll on the no-GIL proposal. It’s clear that the overall sentiment is positive, both for the general idea and for PEP 703 specifically. The Steering Council is also largely positive on both. We intend to accept PEP 703, although we’re still working on the acceptance details.
They estimate that the work will take five plus years and so there is no need to panic - yet. And they state that they are committed to a single version of Python with backward compatibility, so allaying the accusation that this is going to be Python 4 with all the terrible upheaval that might generate.
Overall caution seems to be the name of the game:
Throughout the process we (the core devs, not just the SC) will need to re-evaluate the progress and the suggested timelines. We don’t want this to turn into another ten year backward compatibility struggle, and we want to be able to call off PEP 703 and find another solution if it looks to become problematic, and so we need to regularly check that the continued work is worth it.
Out of interest how will they go about the job?
There are four problems for removing the GIL: reference counting, garbage collection, memory allocation and container thread-safety.
The general approach seems to be to find either lock-free or reduced locking solutions. For example, the simplest way of making reference counting thread safe is to get every thread to lock access to the reference count of an object. To avoid having to lock the reference count on every access, the idea is to distribute it. Each thread will now keep its own count. Other counts and the garbage collector will sum them up to find the grand reference count. However, the garbage collector will need to pause the entire system while it checks counts and a new GC state for each thread is needed.
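The idea of a distributed count can be sketched in pure Python. This is only a toy model of the principle described above, not how CPython implements it: the class DistributedCount and its methods are invented for illustration. Each thread increments and decrements only its own slot, so the hot path needs no lock, and the grand total is only trustworthy once the other threads are paused or finished.

```python
import threading


class DistributedCount:
    """Toy per-thread counter: each thread updates only its own slot."""

    def __init__(self):
        self._slots = {}                   # thread id -> one-element list holding that thread's count
        self._register = threading.Lock()  # taken only when a thread first appears, or when summing

    def _slot(self):
        tid = threading.get_ident()
        slot = self._slots.get(tid)
        if slot is None:
            with self._register:
                slot = self._slots.setdefault(tid, [0])
        return slot

    def incref(self):
        self._slot()[0] += 1  # no lock: only the owning thread writes this slot

    def decref(self):
        self._slot()[0] -= 1

    def total(self):
        """Sum all slots; only meaningful while the other threads are stopped."""
        with self._register:
            return sum(slot[0] for slot in self._slots.values())


def worker(count, n_inc, n_dec):
    for _ in range(n_inc):
        count.incref()
    for _ in range(n_dec):
        count.decref()


count = DistributedCount()
threads = [threading.Thread(target=worker, args=(count, 1000, 400)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()               # stands in for "pause the entire system" before summing
print(count.total())       # 4 threads * (1000 - 400) = 2400
```

The join before total() plays the role of the pause mentioned above: summing while threads are still updating their slots could give a stale answer.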
Currently memory allocation isn't thread safe and the idea is to replace pymalloc with mimalloc, which is a well-known allocator. This part sounds easy, but I bet it isn't.
Container thread-safety is likely to be the most messy. Multiple updates to "extended" objects, like lists or dicts, are not thread safe. The GIL currently makes them thread safe by effectively making them atomic. The only reasonable thing to do here is to introduce per-object locks for non-atomic operations.
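A per-object lock is easy to picture at the Python level. The sketch below is an analogy only, assuming a hypothetical LockedList wrapper; the real locks would live inside the interpreter's C implementation of list and dict. Each object carries its own lock, which is held only for the duration of a multi-step operation.

```python
import threading


class LockedList:
    """Toy list wrapper with one lock per object guarding compound operations."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()  # one lock per object, not one global lock

    def append_if_absent(self, value):
        """Check-then-append is two steps, so both run under the object's lock."""
        with self._lock:
            if value in self._items:
                return False
            self._items.append(value)
            return True

    def snapshot(self):
        """Return a copy taken while no update is in progress."""
        with self._lock:
            return list(self._items)


def fill(shared, n):
    for i in range(n):
        shared.append_if_absent(i)


shared = LockedList()
threads = [threading.Thread(target=fill, args=(shared, 100)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared.snapshot()))  # each value 0..99 appears exactly once, so 100
```

Threads working on different objects never contend with each other, which is the gain over a single global lock; the price is a lock acquisition on every compound operation.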
Avoiding locking when an operation is provably atomic is a good way of keeping the cost of locking down - the GIL may be gone, but there are still going to be locks. Using lock-free methods would introduce too much indeterminacy into the Python runtime. Lock-free methods are usually very fast, but open-ended on worst-case performance.
I've only touched on a tiny fraction of the considerations involved in such a major change, but at the end of the day, given the complexity of the system, it is going to be very difficult not to make GIL-free Python slower overall and very, very difficult not to create backward incompatibilities. Fortunately, there is a new slot in extensions, Py_mod_gil, which if absent causes the interpreter to pause all threads, enable the GIL and continue loading.
I predict that we are about to enter a whole new age of interesting bugs in Python programs.
|Last Updated ( Wednesday, 02 August 2023 )|