As a c++ and python developer, have you ever wondered on how you can use both languages in a single project ?
Using C and python, it is extremely easy to do so as python even has a built-in module called cython. But i kind of struggled to find a decent and easy to use c++ library which can port c++ code to python.
In the past, an excellent library named Boost.python has been made, but here is the catch: Boost is an enormously large and complex suite of utility libraries that works with almost every C++ compiler in existence. This compatibility has its cost: arcane template tricks and workarounds are necessary to support the oldest and buggiest of compiler specimens. Now that C++11-compatible compilers are widely available, this heavy machinery has become an excessively large and unnecessary dependency.
Installing
Be sure to have gcc, make and cmake installed to be able to follow these steps.
1
2
3
4
5
6
$ git clone https://github.com/pybind/pybind11 # clone the project's github repository
$ cd pybind11
$ mkdir build
$ cd build
$ cmake ..
$ make check -j 4 # compile and run the tests
Example : py_fast_int_stack
Ok, now that we’ve installed all the required modules, let’s build a simple project with the library.
We are going to build a very simple stack module using c++ then we will use it in our python code, then we will use a benchmarking library to check which code is faster and if it is, by how much it is.
First of all I assume the reader has a basic understanding of python and a solid understanding of some c++ principles.
Let’s start by building the project’s folder.
1
2
3
$ mkdir ffr-py
$ cd ffr-py
$ touch module.cpp
Then, using your favorite text editor, open the module.cpp file.
Ensure that these 2 lines are always at the start of your code every time you want to make a project with pybind11.
1
2
3
#include <pybind11/pybind11.h>
namespace py = pybind11;
Let me first showcase the code, then i’ll explain it later.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <pybind11/pybind11.h>
namespace py = pybind11;
class element{
public:
int value;
};
class stack{
public:
stack(){
size_ = 0;
}
void push(int x){
els[size_].value = x;
size_ ++;
}
void pop(){
els[size_ - 1].value = NULL;
size_ --;
}
int size(){
return size_;
}
int top(){
return els[size_ -1].value;
}
private:
element els[2000];
int size_;
};
PYBIND11_MODULE(module, m) {
m.doc() = "pybind11 example plugin"; // optional module docstring
py::class_<stack>(m, "Stack")
.def(py::init<>())
.def("push",&stack::push)
.def("pop",&stack::pop)
.def("size", &stack::size)
.def("top", &stack::top);
}
I shall not explain the code from line 5 to line 33 as it is pretty obvious for someone who has a basic understanding of c++.
PYBIND11_MODULE
Now let me explain this seemingly complicated code.
1
2
3
4
5
6
7
8
9
PYBIND11_MODULE(module, m) {
m.doc() = "pybind11 example plugin"; // optional module docstring
py::class_<stack>(m, "Stack")
.def(py::init<>()) // the init method
.def("push",&stack::push) // push function
.def("pop",&stack::pop) // pop function
.def("size", &stack::size) // size function
.def("top", &stack::top); // top function
}
This code simply transforms our c++ class and it’s method into a python class. If you want to know more on using pybind11 on object oriented code, checkout the docs .
Let us compile the code :
1
$ c++ -O3 -Wall -shared -std=c++11 -fPIC $(pkg-config --cflags --libs python3) `python3 -m pybind11 --includes` module.cpp -o module `python3-config --extension-suffix
A file called module.cpython-38-x86_64-linux-gnu.so
should be generated.
Be sure to be in the same folder as that file when you launch the python REPL.
1
2
3
4
5
6
7
8
9
10
$ python
>>> from module import Stack
>>> stack = Stack()
>>> stack.push(12)
>>> stack.push(2000)
>>> stack.top()
2000
>>> stack.pop()
>>> stack.top()
12
Benchmarking
Here is the code that we will use to run our benchmark.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
from line_profiler import *
import module
@profile
def normal_stack():
stack = []
stack.append(80)
stack.append(90)
stack.append(100)
stack.remove(stack[0])
@profile
def cpp_stack():
stack = module.Stack()
stack.push(80)
stack.push(90)
stack.push(100)
stack.pop()
spp_stack()
normal_stack()
Let’s use the line_profiler python plugin(pip install line_profiler
) to benchmark our 2 functions:
First Case : test cases < 10
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
$ kernprof -l -v test-speed.py
Wrote profile results to test-speed.py.lprof
Timer unit: 1e-06 s
Total time: 5e-06 s
File: test-speed.py
Function: normal_stack at line 4
Line # Hits Time Per Hit % Time Line Contents
==============================================================
4 @profile
5 def normal_stack():
6 1 1.0 1.0 20.0 stack = []
7 1 1.0 1.0 20.0 stack.append(80)
8 1 1.0 1.0 20.0 stack.append(90)
9 1 1.0 1.0 20.0 stack.append(100)
10 1 1.0 1.0 20.0 stack.remove(stack[0])
Total time: 2.4e-05 s
File: test-speed.py
Function: cpp_stack at line 12
Line # Hits Time Per Hit % Time Line Contents
==============================================================
12 @profile
13 def cpp_stack():
14 1 16.0 16.0 66.7 stack = module.Stack()
15 1 4.0 4.0 16.7 stack.push(80)
16 1 2.0 2.0 8.3 stack.push(90)
17 1 1.0 1.0 4.2 stack.push(100)
18 1 1.0 1.0 4.2 stack.pop()
As we can see, the stack we implemented ourselves is slightly faster than the normal stack that we implemented in python.
Second Case: test cases >= 1000
Let’s test our code on a more realistic test case where the number of iterations is equal to 1000.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
from line_profiler import *
import module
@profile
def normal_stack():
stack = []
for i in range(1,1000):
stack.append(i)
stack.remove(stack[0])
@profile
def cpp_stack():
stack = module.Stack()
for i in range(1,1000):
stack.push(i)
stack.pop()
cpp_stack()
normal_stack()
When we compile with kernprof -l -v test-speed.py
, we get:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
$ kernprof -l -v test-speedpy
Wrote profile results to test-speed.py.lprof
Timer unit: 1e-06 s
Total time: 0.001273 s
File: test-speed.py
Function: normal_stack at line 4
Line # Hits Time Per Hit % Time Line Contents
==============================================================
4 @profile
5 def normal_stack():
6 1 1.0 1.0 0.1 stack = []
7 1000 532.0 0.5 41.8 for i in range(1,1000):
8 999 737.0 0.7 57.9 stack.append(i)
9 1 3.0 3.0 0.2 stack.remove(stack[0])
Total time: 0.00206 s
File: test-speed.py
Function: cpp_stack at line 11
Line # Hits Time Per Hit % Time Line Contents
==============================================================
11 @profile
12 def cpp_stack():
13 1 22.0 22.0 1.1 stack = module.Stack()
14 1000 617.0 0.6 30.0 for i in range(1,1000):
15 999 1419.0 1.4 68.9 stack.push(i)
16 1 2.0 2.0 0.1 stack.pop()
Wraping Up
Unfortunately, our code is slightly slower than the normal python stack because of the messy implementation of the stack data structure that I made in C++, I am sure that the code would be way faster if I had used a proper stack instead of the…uh…thing I made instead.
The time for computation is exponential per iteration as it increases by every iteration because of the implementation I made of the Push method. We can see easily in the benchmark that the stack.pop() method takes up 68% of the computation time.
Here is a better implementation of the stack:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
class stack
{
int *arr;
int top;
int capacity;
public:
stack(int size = SIZE); // constructor
~stack(); // destructor
void push(int);
int pop();
int peek();
int size();
bool isEmpty();
bool isFull();
};
// Constructor to initialize the stack
stack::stack(int size)
{
arr = new int[size];
capacity = size;
top = -1;
}
// Destructor to free memory allocated to the stack
stack::~stack() {
delete[] arr;
}
// Utility function to add an element `x` to the stack
void stack::push(int x)
{
if (isFull())
{
cout << "Overflow\nProgram Terminated\n";
exit(EXIT_FAILURE);
}
cout << "Inserting " << x << endl;
arr[++top] = x;
}
// Utility function to pop a top element from the stack
int stack::pop()
{
// check for stack underflow
if (isEmpty())
{
cout << "Underflow\nProgram Terminated\n";
exit(EXIT_FAILURE);
}
cout << "Removing " << peek() << endl;
// decrease stack size by 1 and (optionally) return the popped element
return arr[top--];
}
// Utility function to return the top element of the stack
int stack::peek()
{
if (!isEmpty()) {
return arr[top];
}
else {
exit(EXIT_FAILURE);
}
}
// Utility function to return the size of the stack
int stack::size() {
return top + 1;
}
// Utility function to check if the stack is empty or not
bool stack::isEmpty() {
return top == -1; // or return size() == 0;
}
// Utility function to check if the stack is full or not
bool stack::isFull() {
return top == capacity - 1; // or return size() == capacity;
}
That's it, if you found any typos or anything that could be improved, pleased make a PR to this github repo by going to the _posts folder and adding your change to the cplusplus-ffi-python.md file