Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
181 views
in Technique[技术] by (71.8m points)

python - calling dot products and linear algebra operations in Cython?

I'm trying to use dot products, matrix inversion and other basic linear algebra operations that are available in numpy from Cython. Functions like numpy.linalg.inv (inversion), numpy.dot (dot product), X.t (transpose of matrix/array). There's a large overhead to calling numpy.* from Cython functions and the rest of the function is written in Cython, so I'd like to avoid this.

If I assume users have numpy installed, is there a way to do something like:

#include "numpy/npy_math.h"

as an extern, and call these functions? Or alternatively call BLAS directly (or whatever it is that numpy calls for these core operations)?

To give an example, imagine you have a function in Cython that does many things and in the end needs to make a computation involving dot products and matrix inverses:

cdef myfunc(...):
  # ... do many things faster than Python could
  # ...
  # compute one value using dot products and inv
  # without using 
  #   import numpy as np 
  #   np.*
  val = gammaln(sum(v)) - sum(gammaln(v)) + dot((v - 1).T, log(x).T)

how can this be done? If there's a library that implements these in Cython already, I can also use that, but have not found anything. Even if those procedures are less optimized than BLAS directly, not having the overhead of calling numpy Python module from Cython will still make things overall faster.

Example functions I'd like to call:

  • dot product (np.dot)
  • matrix inversion (np.linalg.inv)
  • matrix multiplication
  • taking transpose (equivalent of x.T in numpy)
  • gammaln function (like scipy.gammaln equivalent, which should be available in C)

I realize as it says on numpy mailing list (https://groups.google.com/forum/?fromgroups=#!topic/cython-users/XZjMVSIQnTE) that if you call these functions on large matrices, there is no point in doing it from Cython, since calling it from numpy will just result in the majority of the time spent in the optimized C code that numpy calls. However, in my case, I have many calls to these linear algebra operations on small matrices -- in that case, the overhead introduced by repeatedly going from Cython back to numpy and back to Cython will far outweigh the time spent actually computing the operation from BLAS. Therefore, I'd like to keep everything at the C/Cython level for these simple operations and not go through python.

I'd prefer not to go through GSL, since that adds another dependency and since it's unclear if GSL is actively maintained. Since I'm assuming users of the code already have scipy/numpy installed, I can safely assume that they have all the associated C code that goes along with these libraries, so I just want to be able to tap into that code and call it from Cython.

edit: I found a library that wraps BLAS in Cython (https://github.com/tokyo/tokyo) which is close but not what I'm looking for. I'd like to call the numpy/scipy C functions directly (I'm assuming the user has these installed.)

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Calling BLAS bundled with Scipy is "fairly" straightforward, here's one example for calling DGEMM to compute matrix multiplication: https://gist.github.com/pv/5437087 Note that BLAS and LAPACK expect all arrays to be Fortran-contiguous (modulo the lda/b/c parameters), hence order="F" and double[::1,:] which are required for correct functioning.

Computing inverses can be similarly done by applying the LAPACK function dgesv on the identity matrix. For the signature, see here. All this requires dropping down to rather low-level coding, you need to allocate temporary work arrays yourself etc etc. --- however these can be encapsulated into your own convenience functions, or just reuse the code from tokyo by replacing the lib_* functions with function pointers obtained from Scipy in the above way.

If you use Cython's memoryview syntax (double[::1,:]) you transpose is the same x.T as usual. Alternatively, you can compute the transpose by writing a function of your own that swaps elements of the array across the diagonal. Numpy doesn't actually contain this operation, x.T only changes the strides of the array and doesn't move the data around.

It would probably be possible to rewrite the tokyo module to use the BLAS/LAPACK exported by Scipy and bundle it in scipy.linalg, so that you could just do from scipy.linalg.blas cimport dgemm. Pull requests are accepted if someone wants to get down to it.


As you can see, it all boils down to passing function pointers around. As alluded to above, Cython does in fact provide its own protocol for exchanging function pointers. For an example, consider from scipy.spatial import qhull; print(qhull.__pyx_capi__) --- those functions could be accessed via from scipy.spatial.qhull cimport XXXX in Cython (they're private though, so don't do that).

However, at the present, scipy.special does not offer this C-API. It would however in fact be quite simple to provide it, given that the interface module in scipy.special is written in Cython.

I don't think there is at the moment any sane and portable way to access the function doing the heavy lifting for gamln, (although you could snoop around the UFunc object, but that's not a sane solution :), so at the moment it's probably best to just grab the relevant part of source code from scipy.special and bundle it with your project, or use e.g. GSL.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...