I am trying to use boost::python::numpy::ndarray to create a multi-dimensional array in C++ and pass it to python. The problem is how to do this without having to manage the memory associated with the ndarray in C++ myself.
I am trying to use the boost::python::numpy::from_data function to create a NumPy array in C++. My basic understanding is that without the appropriate owner argument to the function, the responsibility for managing the memory associated with the array falls on me.
My original assumption was that the owner argument needn't be passed based on the boost page which says this about owner: "the owner of the data, in case it is not the ndarray itself."
However, I have read posts which seem to say otherwise. E.g., link says, "If you pass object() as owner argument the array should definitely own its data (and thus report OWNDATA=True) ... " and link says that the object has to be associated with an explicit destructor.
I was wondering what the correct approach is. Or is this not the intended use case for boost::python::numpy?
Yep, the documentation suggests that if you pass object() as the owner, the array will free the data when it is destroyed, but this doesn't work as advertised.
Here are some excerpts from my post on GitHub, https://github.com/boostorg/python/issues/97#issuecomment-519679003, which is in the same issue that the OP links to, but the answer wasn't there before. The answer doesn't come from me; I only provide the demonstration of the answer and of why the object() way doesn't work.
The solution is to create an object (a capsule) that owns the raw pointer and pass that as the owner argument to the boost::python::numpy::from_data() function. A capsule is a Python object that manages a pointer:
PyObject* PyCapsule_New(void *pointer, const char *name, PyCapsule_Destructor destructor)
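To see the capsule mechanism in isolation, here is a minimal sketch that calls PyCapsule_New from Python through ctypes.pythonapi. It is only an illustration under assumptions: it requires CPython, the names DestructorType, release, payload and released are hypothetical, and the destructor records the pointer instead of freeing anything.

```python
import ctypes

# C signature: void destructor(PyObject *capsule). The argument is typed as a
# raw address so ctypes does not touch the refcount of a dying object.
DestructorType = ctypes.CFUNCTYPE(None, ctypes.c_void_p)

capsule_new = ctypes.pythonapi.PyCapsule_New
capsule_new.restype = ctypes.py_object
capsule_new.argtypes = [ctypes.c_void_p, ctypes.c_char_p, DestructorType]

capsule_get_pointer = ctypes.pythonapi.PyCapsule_GetPointer
capsule_get_pointer.restype = ctypes.c_void_p
capsule_get_pointer.argtypes = [ctypes.c_void_p, ctypes.c_char_p]

released = []  # addresses the destructor was asked to release


@DestructorType
def release(capsule_address):
    # A C++ destructor would call delete[] here; this sketch only logs the pointer.
    released.append(capsule_get_pointer(capsule_address, None))


# Stand-in for the heap buffer allocated in C++ (kept alive by this variable).
payload = (ctypes.c_long * 4)(1, 2, 3, 4)

# The capsule stores the pointer, an optional name (None here) and the destructor.
capsule = capsule_new(ctypes.addressof(payload), None, release)

# Dropping the last reference makes CPython run the destructor.
del capsule
print(released == [ctypes.addressof(payload)])
```

The point is only that the destructor runs when the last reference to the capsule goes away, which is what lets an ndarray that holds the capsule as its owner release the buffer.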
Here is an example where I create a pretty large array. I'm going to call this function repeatedly in a Python while True: loop. With this function, you can let the loop go on all day. That's because in the loop, I assign the return value to the same variable, so at each iteration, the previously returned ndarray's refcount goes to zero and the memory gets freed.
#include <boost/python.hpp>
#include <boost/python/numpy.hpp>
#include <cstddef>
#include <iostream>

typedef long int my_data_type;

// Capsule destructor: runs when the capsule's refcount drops to zero.
inline void destroyManagerCObject(PyObject* self) {
    auto * b = reinterpret_cast<my_data_type*>( PyCapsule_GetPointer(self, NULL) );
    std::cout << "C++ : " << __PRETTY_FUNCTION__ << " delete [] " << b << std::endl;
    delete [] b;
}

boost::python::numpy::ndarray get_array_that_owns_through_capsule()
{
    // Change this to see how the addresses change.
    unsigned int last_dim = 6000;
    boost::python::object shape = boost::python::make_tuple(4, 5, last_dim);
    boost::python::numpy::dtype dt = boost::python::numpy::dtype::get_builtin<my_data_type>();
    auto * const data_ptr = new my_data_type[4*5*last_dim];
    const std::size_t s = sizeof(my_data_type);
    // Row-major strides, in bytes.
    boost::python::object strides = boost::python::make_tuple(5*last_dim*s, last_dim*s, s);
    for(unsigned int i = 1; i <= 4*5*last_dim; ++i){ data_ptr[i-1] = i; }
    // This sets up a python object whose destruction will free data_ptr
    PyObject *capsule = ::PyCapsule_New((void *)data_ptr, NULL, (PyCapsule_Destructor)&destroyManagerCObject);
    boost::python::handle<> h_capsule{capsule};
    boost::python::object owner_capsule{h_capsule};
    std::cout << "C++ : " << __PRETTY_FUNCTION__ << "data_ptr = " << data_ptr << std::endl;
    return boost::python::numpy::from_data( data_ptr, dt, shape, strides, owner_capsule);
}
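BOOST_PYTHON_MODULE(interface) {
    boost::python::numpy::initialize();  // must run before any boost::python::numpy call
    boost::python::def("get_array_that_owns_through_capsule", get_array_that_owns_through_capsule);
}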
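The strides tuple passed to from_data in the C++ function is the usual row-major (C-order) layout: the last axis advances by one element, and each earlier axis advances by the size of everything after it. A small pure-Python sketch of that arithmetic, assuming sizeof(long int) is 8 bytes (typical on 64-bit Linux, not guaranteed elsewhere); the helper name c_contiguous_strides is hypothetical:

```python
def c_contiguous_strides(shape, itemsize):
    """Return the byte strides of a row-major (C-order) array."""
    strides = []
    step = itemsize
    # Walk the axes from last to first; each axis steps over all later axes.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


shape = (4, 5, 6000)   # same shape as in the C++ function
itemsize = 8           # assumption: sizeof(long int) == 8

strides = c_contiguous_strides(shape, itemsize)
print(strides)         # (240000, 48000, 8)

# Size of one array's buffer: extent of the first axis times its stride.
bytes_per_array = shape[0] * strides[0]
print(bytes_per_array)  # 960000
```

Under that assumption each call allocates a little under 1 MB, so a buffer that is never freed adds up quickly when the function is called in a loop.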
Then, in a Python while loop, I can call get_array_that_owns_through_capsule() all day long without the process memory growing:
import interface
import psutil
import os

def get_process_memory_usage():
    # Resident set size of the current process, in bytes.
    process = psutil.Process(os.getpid())
    return process.memory_info().rss

one_mb = 1000000
MEMORY_MAX = 100 * one_mb

i = 0
while True:
    print("PYTHON : ---------------- While iteration ------------------- ({})".format(i))
    print("PYTHON : BEFORE calling test_capsule_way()")
    # Rebinding arr drops the last reference to the previous array.
    arr = interface.get_array_that_owns_through_capsule()
    print("PYTHON : AFTER calling test_capsule_way()")
    i += 1
    if i % 1000 == 0:
        print("PYTHON : Nb arrays created (and pretty sure not destroyed) : {}".format(i))
    mem = get_process_memory_usage()
    if mem > MEMORY_MAX:
        print("PYTHON : Bro chill with the memory, you're using {}MB over here!".format(mem/one_mb))
        quit()
    print("PYTHON : ----------- End while iteration\n")
print("PYTHON : SCRIPT END")
The output of the first few iterations is:
PYTHON : ---------------- While iteration ------------------- (0)
PYTHON : BEFORE calling test_capsule_way()
C++ : boost::python::numpy::ndarray get_array_that_owns_through_capsule()data_ptr = 0x7fb7c9831010
PYTHON : AFTER calling test_capsule_way()
PYTHON : ----------- End while iteration
PYTHON : ---------------- While iteration ------------------- (1)
PYTHON : BEFORE calling test_capsule_way()
C++ : boost::python::numpy::ndarray get_array_that_owns_through_capsule()data_ptr = 0x7fb7c9746010
C++ : void destroyManagerCObject(PyObject*) delete [] 0x7fb7c9831010
PYTHON : AFTER calling test_capsule_way()
PYTHON : ----------- End while iteration
PYTHON : ---------------- While iteration ------------------- (2)
PYTHON : BEFORE calling test_capsule_way()
C++ : boost::python::numpy::ndarray get_array_that_owns_through_capsule()data_ptr = 0x14c9f20
C++ : void destroyManagerCObject(PyObject*) delete [] 0x7fb7c9746010
PYTHON : AFTER calling test_capsule_way()
PYTHON : ----------- End while iteration
PYTHON : ---------------- While iteration ------------------- (3)
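Note the order in the log: from the second iteration on, the new array is allocated before the previous one is deleted. That is consistent with how assignment works in CPython: the right-hand side is evaluated first, and only when the name arr is rebound does the old array's refcount drop to zero. A minimal pure-Python sketch of the same pattern, using a hypothetical Buffer class as a stand-in for the ndarray and its capsule:

```python
class Buffer:
    """Hypothetical stand-in for an ndarray whose owner frees memory on destruction."""

    def __init__(self, tag, log):
        self.tag = tag
        self.log = log
        log.append("alloc {}".format(tag))

    def __del__(self):
        # Plays the role of the capsule destructor in the C++ code.
        self.log.append("free {}".format(self.tag))


log = []
arr = Buffer(0, log)
arr = Buffer(1, log)  # Buffer 1 is created first, then Buffer 0 is released
arr = Buffer(2, log)  # Buffer 2 is created first, then Buffer 1 is released

print(log)  # ['alloc 0', 'alloc 1', 'free 0', 'alloc 2', 'free 1']
```

This relies on CPython's reference counting; other Python implementations may delay the call to __del__.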
In the issue, I also demonstrate that if you pass boost::python::object() as the owner parameter instead of the capsule, the memory is not freed, and the Python loop stops because of the check on the process memory: https://github.com/boostorg/python/issues/97#issuecomment-520555403