Running native code from a Python program is one way to speed things up. Find out how it all works in this extract from my new book Programmer's Python: Everything is Data.
Python is a great language, but there are times when you have to use it to connect to code in another language, usually C. The reasons are varied but the two most common are to add a feature that Python doesn’t have or to make something work faster. Clearly, if you are trying to use Python with “foreign” code either the Python has to make special provision to work in different ways or the foreign code does. In this chapter we look at how to write Python that works with foreign code that knows nothing about Python.
The approach we’ll use is to access functions stored in shared libraries via the ctypes module. The shared libraries could be created using almost any language as they are a standard part of the operating system, but for the sake of concreteness we will use C to create them, which is what usually happens. The basic idea is that there are functions in the shared library that you want to call from Python and you need to write Python code that makes this work.
An alternative approach involves writing code that “knows” about Python and can pretend to be an extension of the language. In this case, from the Python programmer’s point of view there is nothing new – the module in the foreign language, usually C, is just used as if it was standard Python. Of course, the cost of making the module compatible now lies with the foreign language programmer. As the standard implementation of Python is based on C, this isn’t as difficult as you might expect. However, it would take us deep into using C and this is beyond the scope of this book.
A common approach to augmenting a Python program with foreign code is to first implement it as a shared library that needs special treatment from Python. Once it has been proved to work, it is easier to convert it into a module that can be used from Python with no special considerations. This can be done by converting the C code into an extension or by simply wrapping the code in a Python module that provides Python functions that make use of it.
The unifying factor in using foreign code of any sort is the need to convert between data representations. In general, calling the foreign functions is easy. It is passing them data that makes sense to them, and making sense of the data sent back, that is difficult. The same problem occurs when using modules, such as os, which interface at a low level with the operating system. This makes data conversion a very useful skill.
It is assumed that you know how to program in C and that you have a C programming language environment set up. If not then see Appendix II which details how to get the GCC compiler working with Visual Studio Code.
Using Shared Libraries
Most operating systems have a shared library feature. Under Linux shared libraries are .so files and under Windows they are DLLs, Dynamic Link Libraries, usually with the extension .dll. The two are not interchangeable and this means that you have to implement them separately for each operating system you want to support. However, the differences between them are very small and the same code can be used for both with the help of a few simple macros.
We need a simple shared library to try out and something that adds two integers and returns the result is sufficient.
For a Windows DLL the code is:
#include <stdio.h>
__declspec(dllexport) int sum(int,int);
int sum(int a, int b){
return a + b;
}
and for a Linux .so library:
#include <stdio.h>
int sum(int a, int b){
return a+b;
}
You can see that the only difference is that the Windows version needs the function to be explicitly marked for export, i.e. external use. In both cases save as mylib.c.
The only complication is that, as a simple executable file isn’t usable as a shared library, you have to compile with the target set to a shared library. When the compiler has finished you should find libmyLib.so or libmyLib.dll in the build folder or wherever the compiler you are using stores its results.
To make use of these libraries we have to use the ctypes cdll or windll loader to load the library file into memory. The functions in the library then become available as attributes of the returned object that can be called; ctypes looks each one up by name the first time you access it. This is an example of dynamic attributes as discussed in the previous chapter. For example, under Linux you would write:
import ctypes
lib=ctypes.cdll.LoadLibrary("build/libmyLib.so")
print(lib.sum(1,2))
and under Windows:
import ctypes
lib=ctypes.windll.LoadLibrary("build/libmyLib.dll")
print(lib.sum(1,2))
Notice that the only changes are the use of windll in place of cdll and the change in the extension of the library files. Also notice that it is a convention for the compiler to add “lib” to the start of the file name. For full details of how to create the library files see Appendix II.
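As the two versions differ only in the file name, the choice can be made at run time. The following is a minimal sketch rather than a definitive implementation: the helper names library_filename and load_mylib are hypothetical, the build folder and the lib prefix are taken from the example above, and it assumes that sum is compiled with the default C calling convention so that CDLL can be used on both platforms. It also tells ctypes what the parameter and return types of sum are, instead of relying on its default of treating everything as a C int.

```python
import ctypes
import os
import sys


def library_filename(base="myLib", platform=sys.platform):
    """Return the file name the compiler is assumed to produce.

    Only Windows and Linux naming, as used in the text, is handled.
    """
    extension = ".dll" if platform.startswith("win") else ".so"
    return "lib" + base + extension


def load_mylib(folder="build"):
    """Load the example library and describe sum's signature to ctypes."""
    path = os.path.abspath(os.path.join(folder, library_filename()))
    # Assumption: sum uses the standard C calling convention, which is
    # what CDLL expects on both Windows and Linux.
    lib = ctypes.CDLL(path)
    lib.sum.argtypes = [ctypes.c_int, ctypes.c_int]
    lib.sum.restype = ctypes.c_int
    return lib


# Only try the call if the library has actually been built.
if os.path.exists(os.path.join("build", library_filename())):
    print(load_mylib().sum(1, 2))
```

Setting argtypes means that ctypes raises an exception if you pass something that cannot be converted to a C int, rather than passing the function data it cannot make sense of.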
Loading and Finding Libraries
If you know where a library is stored you can simply provide the path to it in the LoadLibrary method. If you are not sure where the library is stored on a given system, but you are sure that the system can find it automatically, i.e. it is included in the system path, you can use:
ctypes.util.find_library(name)
The name should be used without the lib prefix and without any .so or .dll suffix. If the library cannot be found, the function returns None. This is unlikely to work for custom libraries that you haven’t stored in any of the standard library folders.
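A short sketch of how find_library behaves may help. It looks for the C runtime library, named "c", which most Linux systems can locate, and for our custom library, which is assumed not to be installed in any standard folder. The exact string returned for the C library depends on the platform, and on some systems it may be None.

```python
import ctypes.util

# "c" is the C runtime library. On Linux this typically resolves to a
# name such as "libc.so.6", but the result is platform dependent.
libc_name = ctypes.util.find_library("c")
print("C library:", libc_name)

# A custom library that is not in a standard folder is not found:
# find_library returns None rather than raising an exception.
custom_name = ctypes.util.find_library("myLib")
print("Custom library:", custom_name)
```

Notice that ctypes.util has to be imported explicitly as it is a submodule of ctypes.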
Once you have found the location of the library you need to load it into memory so that it can be used. The most direct way of doing this is to use the LoadLibrary method as described in the previous example. Notice that this returns a new library object every time, even if the library is already loaded, but it is an easy way of doing the job.
For more sophisticated ways of doing the same job there are four library classes:
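- CDLL loads a library whose functions use the standard C calling convention, for example a Linux .so library
- OleDLL loads a Windows OLE DLL, whose functions use the stdcall calling convention and return an HRESULT
- WinDLL loads a Windows DLL whose functions use the stdcall calling convention
- PyDLL works like CDLL but holds on to the GIL during calls, which makes it suitable for calling the Python C API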
Each of these accepts the pathname of the library file to be loaded as the first parameter and then a set of parameters to control how to load the file. These constructors also load the library so there is no need to make a call to LoadLibrary.
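As an illustration of one of these extra parameters, the following sketch loads the C runtime library with use_errno=True, which asks ctypes to keep a private copy of the C errno variable for calls made through the library. The helper name missing_file_errno and the path used are hypothetical, and the example assumes a POSIX system where find_library can locate the C library and where it provides the standard access function.

```python
import ctypes
import ctypes.util
import errno
import os


def missing_file_errno(libc_name):
    """Call access() on a path that does not exist and return the
    result together with the errno value that ctypes captured."""
    # use_errno=True makes ctypes save errno after each call
    libc = ctypes.CDLL(libc_name, use_errno=True)
    libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
    libc.access.restype = ctypes.c_int
    ctypes.set_errno(0)
    result = libc.access(b"/no/such/file/for/ctypes/demo", os.F_OK)
    return result, ctypes.get_errno()


libc_name = ctypes.util.find_library("c")
if libc_name is not None and os.name == "posix":
    result, err = missing_file_errno(libc_name)
    print(result, errno.errorcode.get(err))
```

Without use_errno=True the value returned by ctypes.get_errno isn't updated by the call, which is why this has to be decided when the library object is created.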
To make this job easier there are also four pre-configured loader objects that create instances of these classes when you ask them to load a library:
- cdll creates CDLL instances
- oledll creates OleDLL instances
- windll creates WinDLL instances
- pydll creates PyDLL instances
There is also a ready-made object, pythonapi, which is an instance of PyDLL that provides the functions of the Python C API as attributes.
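To see pythonapi in action, the following sketch calls Py_GetVersion, a Python C API function that returns the interpreter's version as a C string. It assumes that you are running the standard CPython implementation. As the function returns a pointer to a string rather than an int, its return type has to be set before it is called.

```python
import ctypes
import sys

# Py_GetVersion returns a C string (char *), so the return type
# has to be declared before the call is made.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p

version = get_version().decode("utf-8")
print(version)
print(sys.version)
```

The two strings printed should agree, as sys.version is derived from the same information.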
Notice that we used the preconfigured windll/cdll to load the example library rather than a fully configured call to WinDLL/CDLL, but we could have used:
lib=ctypes.CDLL("build/libmyLib.so")
or:
lib=ctypes.CDLL("build/libmyLib.dll")
The different ways of loading a library can be confusing. You can even load the library by simply referring to it as an attribute, for example:
lib=ctypes.windll.libmyLib
However, this doesn’t work for our custom library because the system cannot find it automatically; you need to specify a full pathname. Libraries loaded by attribute access are cached by the loader, so the library isn’t loaded again if you refer to it a second time.
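The difference between the loading methods can be checked with a small experiment. This sketch uses the C runtime library, assuming find_library can locate it, and the helper name loader_behaviour is hypothetical. It compares the objects returned by two calls to LoadLibrary with the objects returned by two index accesses on the cdll loader, which is equivalent to attribute access but works for names that aren't valid Python identifiers.

```python
import ctypes
import ctypes.util


def loader_behaviour(name):
    """Return two flags: whether LoadLibrary returns the same object
    twice, and whether index access on cdll returns the same object twice."""
    first = ctypes.cdll.LoadLibrary(name)
    second = ctypes.cdll.LoadLibrary(name)
    cached_first = ctypes.cdll[name]
    cached_second = ctypes.cdll[name]
    return first is second, cached_first is cached_second


libc_name = ctypes.util.find_library("c")
if libc_name is not None:
    print(loader_behaviour(libc_name))
```

On a system where the library is found, the first flag is False and the second is True, i.e. LoadLibrary always creates a new library object while attribute or index access reuses the one it created earlier.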
There is a particular problem with Windows DLLs that contain references to other DLLs, dependencies, which have to be loaded at the same time. If the dependencies cannot be found you will see an error message which contains no clue as to which dependencies are missing. This is not an easy error to fix. As loading shared libraries can be a problem, my preferred method is to explicitly load the library using a fixed path for custom libraries.
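One way to make explicit loading more helpful is to check that the file exists before handing it to ctypes, so that a missing library produces a clear message. The following is a sketch under stated assumptions: the helper name load_from_folder is hypothetical, the library is assumed to use the C calling convention, and any dependencies are assumed to be stored in the same folder as the library itself. Under Windows it uses os.add_dll_directory, available from Python 3.8, to add that folder to the DLL search path.

```python
import ctypes
import os


def load_from_folder(folder, filename):
    """Load a shared library from an explicit folder."""
    path = os.path.abspath(os.path.join(folder, filename))
    if not os.path.isfile(path):
        raise FileNotFoundError("shared library not found: " + path)
    if hasattr(os, "add_dll_directory"):
        # Windows only: also search the library's own folder for
        # any DLLs that it depends on.
        os.add_dll_directory(os.path.dirname(path))
    return ctypes.CDLL(path)


try:
    lib = load_from_folder("build", "libmyLib.so")
except FileNotFoundError as error:
    print(error)
```

This doesn't tell you which dependency is missing if the load still fails, but it does rule out the simplest cause, i.e. that the library file isn't where you think it is.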