Fast data conversion from Maya to numpy

Sometimes we need to get data out of a DCC app, in this case Maya, and then process it in a different format. If we do it the standard Python way, with a list comprehension or a generator, the conversion can get slow on large data sets. It is much easier in C++, where we can pass memory addresses directly and cast the data in place.

Let's say we need to get all the points from a mesh. Maya's old API actually makes this easy, as it provides the getRawPoints function, which returns a SWIG object that is basically a memory address for the data. So not only is this very fast, it also gives us a pointer, which is exactly what we need.

from maya import OpenMaya # old maya API
from ctypes import c_float
import numpy

# mfn_mesh is an OpenMaya.MFnMesh already attached to the mesh
raw_points = mfn_mesh.getRawPoints()
point_count = mfn_mesh.numVertices()

# the data returned from maya is a flat array of floats: x,y,z,x,y,z,...
# so we describe it as point_count groups of 3 floats
c_float_array = ((c_float * 3) * point_count).from_address(int(raw_points))
# numpy wraps the c array without copying it, then we transpose
# [(x,y,z),(x,y,z)] -> [(x,x),(y,y),(z,z)]
numpy_points = numpy.ctypeslib.as_array(c_float_array).T

For a cube we end up with an array like this:

array([[-0.5,  0.5, -0.5,  0.5, -0.5,  0.5, -0.5,  0.5],
       [-0.5, -0.5,  0.5,  0.5,  0.5,  0.5, -0.5, -0.5],
       [ 0.5,  0.5,  0.5,  0.5, -0.5, -0.5, -0.5, -0.5]], dtype=float32)
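
The pointer cast can also be tried outside Maya. The sketch below is only an illustration: a plain ctypes buffer stands in for the memory behind getRawPoints, and the names flat, source and address are made up for the example. It shows that (c_float * 3) * point_count describes one (x,y,z) group per point, and that the numpy array is a view on the original memory rather than a copy.

```python
import ctypes
import numpy

# Hypothetical stand-in for Maya's memory: 4 points stored as x,y,z,x,y,z,...
flat = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 2.0, 2.1, 2.2, 3.0, 3.1, 3.2]
point_count = len(flat) // 3
source = (ctypes.c_float * len(flat))(*flat)
address = ctypes.addressof(source)  # plays the role of int(raw_points)

# (c_float * 3) * point_count -> point_count groups of 3 floats
c_points = ((ctypes.c_float * 3) * point_count).from_address(address)
points = numpy.ctypeslib.as_array(c_points)  # shape (4, 3), nothing is copied

# Transposing gives rows of all X, all Y, all Z (still a view)
xyz_rows = points.T  # shape (3, 4)

# Writing to the source buffer is visible through the numpy array
source[0] = 9.0
shares_memory = bool(points[0, 0] == 9.0)

# Take a copy when the data has to outlive the memory it points at
owned = points.copy()
```

Because nothing is copied, the array is only valid for as long as the memory behind it is. Inside Maya that presumably means taking a copy before the mesh is edited or deleted.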

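Once we have the data in numpy we can operate on it very efficiently, and we can also convert it to other formats such as USD data. Vt.Vec3fArray takes one (x,y,z) point per row, so we transpose back first:

from pxr import Vt

usd_points = Vt.Vec3fArray.FromNumpy(numpy.ascontiguousarray(numpy_points.T))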

That was not too hard, and it is very fast to execute even on millions of points.
The problem is that this is a one-off solution, as Maya doesn't provide raw access to most other data structures.

Let's see if we can do something similar using Maya API v2.

from maya.api import OpenMaya # new maya api v2
import numpy

# mfn_mesh_v2 is a maya.api.OpenMaya.MFnMesh attached to the same mesh
points = mfn_mesh_v2.getPoints()
numpy_points = numpy.array(points).T #[(x,y,z,w),(x,y,z,w)] -> [(x,x),(y,y),(z,z),(w,w)]
_, length = numpy_points.shape
numpy_points = numpy_points.copy() # copy so that we own the data
numpy_points.resize(3, length) #[(x,x),(y,y),(z,z),(w,w)] -> [(x,x),(y,y),(z,z)]

The code is definitely more pythonic and it looks good, but unfortunately it is 500x slower. It seems that under the hood numpy.array(points) iterates through all the points, and this takes time. I would love to find a way to cast the memory address directly here. The extra steps are only necessary if you want to end up with (x,y,z) like in the rawPoints case. API v2 returns each point as (x,y,z,w); after the transpose all the w values sit in the last row, so the resize operation simply drops them. We transposed the array to match the layout of the rawPoints result.

# direct conversion of a cube to a numpy array, before it was transposed and resized.
# one point per row makes more sense than the earlier array, which was split into
# all X, all Y, all Z vectors.

array([[-0.5, -0.5,  0.5,  1. ],
       [ 0.5, -0.5,  0.5,  1. ],
       [-0.5,  0.5,  0.5,  1. ],
       [ 0.5,  0.5,  0.5,  1. ],
       [-0.5,  0.5, -0.5,  1. ],
       [ 0.5,  0.5, -0.5,  1. ],
       [-0.5, -0.5, -0.5,  1. ],
       [ 0.5, -0.5, -0.5,  1. ]])
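
The cost of walking through the points one at a time can be felt outside Maya too. The following is a rough sketch, not a benchmark of the Maya API: a zero-filled ctypes buffer stands in for the point data, element_by_element mimics a conversion that visits every point from Python, and direct_cast wraps the same memory in place. Both function names and the point count are made up for the example, and the timings will differ from machine to machine.

```python
import ctypes
import timeit
import numpy

point_count = 50000
# Hypothetical point data: point_count rows of (x, y, z, w), all zeros
buffer = ((ctypes.c_float * 4) * point_count)()

def element_by_element():
    # Visits every point from Python before numpy sees the data
    return numpy.array([tuple(p) for p in buffer], dtype=numpy.float32)

def direct_cast():
    # Wraps the existing memory, no per-point work
    return numpy.ctypeslib.as_array(buffer)

slow = timeit.timeit(element_by_element, number=1)
fast = timeit.timeit(direct_cast, number=1)
print("element by element: %.4fs, direct cast: %.6fs" % (slow, fast))
```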

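What looks a bit weird here is that we need to copy the numpy array before we can resize it. The reason is that .T returns a view on the array created by numpy.array(points), and a view does not own its data, so resize refuses to change it in place. numpy.array(points) itself does build a brand new array from the points, which fits with the creation being so slow.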
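
The ownership rule is easy to check with plain numpy. This is a minimal sketch using two hand-typed cube points instead of Maya data; the slicing line at the end is an alternative I am suggesting for comparison, not something taken from the code above.

```python
import numpy

# Two (x, y, z, w) points, one per row
points = numpy.array([[-0.5, -0.5, 0.5, 1.0],
                      [ 0.5, -0.5, 0.5, 1.0]])

transposed = points.T  # a view: shares memory with points
view_owns_data = transposed.flags.owndata

# Resizing a view in place is refused by numpy
try:
    transposed.resize(3, 2)
    resize_failed = False
except ValueError:
    resize_failed = True

# A copy owns its data, so it can be resized; this keeps the x, y, z rows
owned = transposed.copy()
owned.resize(3, 2)

# Slicing off the w column before transposing gives the same values
sliced = points[:, :3].T
```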

Back to Maya API v1, which comes with utilities to create SWIG objects.

When Maya doesn't provide a convenient method to get the memory address of the data, we need to construct it ourselves using OpenMaya.MScriptUtil.

from maya import OpenMaya # API v1
from ctypes import c_float
import numpy

points = OpenMaya.MPointArray()
mfn_mesh.getPoints(points, OpenMaya.MSpace.kObject)

num_points = points.length()
array_size = num_points * 4 # 4 floats per point (x,y,z,w)
# util owns the buffer, so keep it alive for as long as ptr is in use
util = OpenMaya.MScriptUtil()
util.createFromList([0.0] * array_size, array_size)
ptr = util.asFloat4Ptr()
points.get(ptr) # copy points to ptr

c_float_array = ((c_float * 4) * num_points).from_address(int(ptr)) # x,y,z,w per point
np_array = numpy.ctypeslib.as_array(c_float_array).T #[(x,y,z,w),(x,y,z,w)] -> [(x,x),(y,y),(z,z),(w,w)]
np_array = np_array.copy() # own the data so we can resize and no longer depend on util
np_array.resize(3, num_points) #[(x,x),(y,y),(z,z),(w,w)] -> [(x,x),(y,y),(z,z)]
print(np_array.shape) # (3, num_points)

This is now much more involved and not very pythonic, but the end result is the same and it is much faster than API v2. The approach is also more general: the same pattern should work for other Maya arrays that can copy their contents into a pointer.
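
To keep the less pythonic part in one place, the cast could be wrapped in a small helper. This is a sketch under assumptions: rows_from_address and xyz_rows are hypothetical names, the data is assumed to be 32-bit floats, and the usage at the bottom fakes the pointer with a ctypes buffer so it runs without Maya. Inside Maya the address would come from int(ptr) as in the code above.

```python
import ctypes
import numpy

def rows_from_address(address, count, components):
    """View count groups of `components` floats at address as a (count, components) array."""
    c_array = ((ctypes.c_float * components) * count).from_address(address)
    return numpy.ctypeslib.as_array(c_array)

def xyz_rows(address, count):
    """Copy (x, y, z, w) points into an owned (3, count) array: all X, all Y, all Z."""
    points = rows_from_address(address, count, 4)
    # Drop the w column, transpose, and copy so the result owns its memory
    return points[:, :3].T.copy()

# Usage with a stand-in buffer holding two (x, y, z, w) points
data = (ctypes.c_float * 8)(1, 2, 3, 1, 4, 5, 6, 1)
result = xyz_rows(ctypes.addressof(data), 2)
```

Slicing off w before the transpose avoids the resize step, and the final copy means the result no longer depends on the buffer it was read from.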