Programmer's Python: Async - Locks
Written by Mike James
Wednesday, 17 September 2025
The important point is that you cannot predict what this very simple program will produce when it is run. On a system that is organized so that a thread cannot interrupt a running CPU-bound thread you will get the “correct” answer of 200000. On a machine that allows threads to interrupt each other with less restraint you will get a lower value. The actual behavior of the program depends on its timing and on the way that the GIL interacts with the operating system’s scheduling method. The point is that this code is non-deterministic in the sense that you cannot predict what it does just by reading it. You might object that the function used to demonstrate this is contrived and would never be written in practice, but it is a simplified model of what most functions do when they access a shared resource – read the resource, do some computation and finally save the new result back to the resource. In practice the reason for a race condition is usually much harder to see.

Hardware Problem or Heisenbug?

The example of a race condition just given is optimized to increase the probability that the condition will occur. Real-world programs generally have a lower probability of creating a race condition and the result might well be what you expect even when you run the program many times. Eventually, however, the conditions will be right and the program will give the wrong result. This means that the program will most likely pass testing and only show an error very occasionally, usually when it can do the most damage. Such bugs are usually referred to as “non-deterministic” because you can run the program under the same conditions and get different results. Often the first response is to test, or even replace, the hardware and this increases the time it takes even to realize that there is a software bug waiting to be found. Such bugs are very difficult to locate because they are very difficult to reproduce.
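As a sketch of the kind of unlocked counter being described here – the exact listing isn’t in this extract, so the detail is reconstructed from the description and from the locked version given later:

```python
import threading
from math import sqrt

myCounter = 0

def count():
    # Read-compute-write on a shared global with no lock -
    # the read and the write can be interleaved between threads.
    global myCounter
    for i in range(100000):
        temp = myCounter + 1   # read the shared resource
        x = sqrt(2)            # some computation, widening the race window
        myCounter = temp       # write back - may overwrite another thread's update

t1 = threading.Thread(target=count)
t2 = threading.Thread(target=count)
t1.start(); t2.start()
t1.join(); t2.join()
print(myCounter)  # may be less than 200000 due to lost updates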
They are often labeled Heisenbugs because any attempt to find them tends to make them disappear. Running a program with a race condition under a debugger, for example, can make the probability of it occurring go to zero. Similarly, adding debugging statements can modify the timings so as to make the problem vanish – until they are removed and the program is put back into general use. The only secure and reasonable solution to the problem is to use locking.

Locks

A lock is a co-operative mechanism for restricting access to a resource. The important word here is “co-operative”. It needs to be clear right from the start that a locking mechanism only works if you implement it correctly in all of the code that makes use of the shared resource. There is nothing stopping code that does not use the lock from accessing the resource. This is a general feature of locking in most operating systems and isn’t specific to Python.

The simplest type of lock has just two states – locked and unlocked. Any code that wants access to a resource without being interrupted by another thread has to acquire the lock by changing it to the locked state. If a thread tries to acquire a lock that is already locked then it has to wait for the lock to be unlocked.

The Lock class behaves exactly as described. It is a wrapper for a lock that is implemented by the operating system. In other words, the Python Lock is an operating system construct. It corresponds to the most basic type of lock, usually called a mutex, short for “mutual exclusion”. It has an acquire method:

Lock.acquire(blocking=True, timeout=-1)

and a release method:

Lock.release()

The blocking parameter determines what happens if the lock cannot be acquired. If it is True, the default, then the thread simply waits until the lock is available. The acquire returns True when the lock is acquired and you can set a timeout for the wait. Its default is -1, which means “wait forever”.
If the acquire returns because of the timeout then it returns False. Alternatively you can set blocking to False and then acquire returns immediately with True if the lock has been acquired or False if it has not. In this case you cannot specify a timeout. If a thread has the lock then it has to release it when it has finished modifying the resource, using the release method. Any thread, not just the thread that has the lock, can release it and this can be a problem. If you try to release a lock that isn’t locked then you generate a RuntimeError. When a lock is released the thread that released it carries on running until it gives up the GIL and another thread gets a chance to run. If there are multiple threads waiting to acquire the lock then the operating system picks just one of them to run and the others have to again wait until it releases the lock that it has just acquired.
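These acquire and release behaviors can be sketched as follows; the variable name lock is arbitrary and a second thread isn’t needed because a Lock is not reentrant, so a non-blocking acquire on a lock the thread already holds simply returns False:

```python
import threading

lock = threading.Lock()

# Blocking acquire with a timeout - returns False if the lock
# is not obtained within 0.5 seconds. Here the lock is free,
# so it returns True immediately.
ok = lock.acquire(blocking=True, timeout=0.5)
print(ok)  # True

# Non-blocking acquire - returns immediately with False because
# the lock is already held. A timeout cannot be combined with
# blocking=False.
ok2 = lock.acquire(blocking=False)
print(ok2)  # False

lock.release()  # release the lock we hold

try:
    lock.release()  # releasing an unlocked lock...
except RuntimeError as e:
    print("RuntimeError:", e)  # ...raises RuntimeError
```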
Notice that which thread gets to run when a lock becomes available depends on the operating system and you cannot rely on any particular order of execution. That is, if threads A, B and C attempt to acquire the lock in that order they don’t necessarily run in that order when the lock is released.

If we add a lock to the function in the previous example then it always returns the correct result no matter how many threads are used to execute it:

myCounter = 0
countlock = threading.Lock()

def count():
    global myCounter
    for i in range(100000):
        countlock.acquire()
        temp = myCounter + 1
        x = sqrt(2)
        myCounter = temp
        countlock.release()

In this example we acquire the lock before accessing the global variable myCounter and release it after it has been completely updated. As long as all threads use the same locking, only one thread at a time can access the resource and the program is fully deterministic. It never misses an update due to overlapped access.

This works, but it slows things down. The unlocked, but incorrect, version runs two threads in about 70 ms whereas the locked version takes 150 ms. The overhead isn’t due to any loss of parallelism as, with the GIL in place, there isn’t any. The overhead is entirely due to the cost of locking and unlocking. In principle, you should always arrange for a thread to keep a lock for the shortest possible time to allow other threads to work. However, this doesn’t take the GIL into account. If you change the program so that it keeps the lock for the duration of the loop, i.e. until it has very nearly finished, then it is still deterministic, but it only takes about 70 ms with two threads:

myCounter = 0

def count():
    global myCounter
    countlock.acquire()
    for i in range(100000):
        temp = myCounter + 1
        x = sqrt(2)
        myCounter = temp
    countlock.release()

In other words, as the GIL only allows one thread to run at a time and as all of the threads are CPU-bound, there is no time advantage in releasing the lock early.
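A Lock also works as a context manager, which guarantees the release even if the locked code raises an exception. The article doesn’t use this form, but as a sketch, the whole-loop version with the same variable names could be written:

```python
import threading
from math import sqrt

myCounter = 0
countlock = threading.Lock()

def count():
    global myCounter
    with countlock:             # acquire on entry, release on exit
        for i in range(100000):
            temp = myCounter + 1
            x = sqrt(2)
            myCounter = temp

threads = [threading.Thread(target=count) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(myCounter)  # 200000 - deterministic with the lock in place
```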
The story would be different if some of the threads were I/O-bound because then releasing the lock might give them time to move on to another I/O operation and so reduce the overall runtime.

In the chapter, but not in this extract:

Summary