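python: Speeding up matrix operations
Inspired by here and here, I tried out numba.
The bottom line: it is extremely fast. If you do heavy numerical computation, you should definitely use it.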
Installation
On Ubuntu 14.04, without using Anaconda.
llvm3.5 / llvmlite
sudo apt-get install -y "llvm-3.5*"
sudo apt-get install -y libedit-dev enum
sudo ln -s /usr/bin/llvm-config-3.5 /usr/bin/llvm-config
sudo ldconfig
git clone https://github.com/numba/llvmlite
cd llvmlite
sudo python setup.py install
numba
git clone https://github.com/numba/numba.git
cd numba
sudo pip install -r requirements.txt
python setup.py build_ext --inplace
sudo pip install funcsigs
sudo python setup.py install
error
File "/usr/local/lib/python2.7/dist-packages/numba/compiler.py", line 199, in run 'Module' object has no attribute 'get_global'
If you hit an error like the one above, run pip uninstall for both llvmlite and numba, then install them again.
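After reinstalling, it can help to confirm that both packages import and to see which versions were picked up. A minimal sketch, assuming both packages expose the usual `__version__` attribute; `installed_version` is a hypothetical helper name, not part of either library:

```python
import importlib


def installed_version(name):
    """Return the module's __version__, or None if it cannot be imported
    or does not define __version__."""
    try:
        module = importlib.import_module(name)
    except (ImportError, OSError):
        # OSError covers a shared library that fails to load.
        return None
    return getattr(module, "__version__", None)


for name in ("llvmlite", "numba"):
    print("%s: %s" % (name, installed_version(name)))
```

A `None` for either package suggests the reinstall did not take effect in the Python environment being used.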
Installation (ubuntu14.04 with llvm3.7)
Get signature
wget -O - http://apt.llvm.org/llvm-snapshot.gpg.key|sudo apt-key add -
Add source list
# 3.7
deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-3.7 main
deb-src http://apt.llvm.org/trusty/ llvm-toolchain-trusty-3.7 main
Update and install llvm
sudo apt-get update
sudo apt-get install "llvm-3.7*"
llvmlite installation
git clone https://github.com/numba/llvmlite.git
sudo su
cd llvmlite
export LLVM_CONFIG=/usr/bin/llvm-config-3.7
python setup.py install
numba installation
git clone https://github.com/numba/numba.git
sudo su
cd numba
python setup.py build_ext --inplace
python setup.py install
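To check that the build works end to end, one option is to compile and call a trivial function. This is only a sketch: `numba_smoke_test` and `add` are illustrative names, and the function returns `None` when numba cannot be imported at all.

```python
def numba_smoke_test():
    """Compile and call a trivial function; None means numba is not importable."""
    try:
        from numba import jit
    except ImportError:
        return None

    @jit(nopython=True)
    def add(a, b):
        return a + b

    # The first call triggers compilation for these argument types.
    return add(1.5, 2.5) == 4.0


print("numba smoke test: %s" % numba_smoke_test())
```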
Affine speed comparison
Environment
- Ubuntu14.04
- CPU: Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz
affine.py
#!/usr/bin/env python
import numpy as np
import time
from numba import jit

# dataset
n = 10000
d = 1000
X_ = np.random.rand(n, d)
y_ = np.random.rand(d)
b_ = np.random.rand(n)
gamma_ = 0.01
max_itr = 1000

# numpy
st0 = time.time()
for i in xrange(0, max_itr):
    X_.dot(y_) * gamma_ + b_
et0 = time.time()
print "elapsed time (numpy): %f [s]" % (et0 - st0)

# numba (types inferred on the first call)
@jit
def affine(X, y, gamma, b):
    """Element-wise loop over X, repeated max_itr times (results are discarded)."""
    n = X.shape[0]
    d = X.shape[1]
    # The dot() version below performed badly, so stick to direct element access
    #for i in xrange(0, max_itr):
    #    X_.dot(y_) * gamma_ + b_
    for itr in xrange(0, max_itr):
        for i in range(n):
            for j in range(d):
                X[i, j] * y[j] * gamma + b[i]

st0 = time.time()
affine(X_, y_, gamma_, b_)
et0 = time.time()
print "elapsed time (numba): %f [s]" % (et0 - st0)

# numba (type-specified: the signature is given up front)
@jit("(f8[:, :], f8[:], f8, f8[:])")
def affine_type_specified(X, y, gamma, b):
    """Same loop as affine(), with an explicit type signature."""
    n = X.shape[0]
    d = X.shape[1]
    for itr in xrange(0, max_itr):
        for i in range(n):
            for j in range(d):
                X[i, j] * y[j] * gamma + b[i]

st0 = time.time()
affine_type_specified(X_, y_, gamma_, b_)
et0 = time.time()
print "elapsed time (numba type-specified): %f [s]" % (et0 - st0)
Results
elapsed time (numpy): 15.048479 [s]
elapsed time (numba): 0.414315 [s]
elapsed time (numba type-specified): 0.000007 [s]
So: keep using numpy arrays, but write the computation as plain loops and decorate it with jit.
(When I reduced the matrix dimensions (n, d), the order was numba-typed > numpy > numba.)
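The loop-and-jit pattern described above can be sketched as follows. Two things differ from affine.py, and both are assumptions about what a comparable measurement needs rather than anything stated in the article. First, the result is written to an output array, so the compiler has no reason to drop the loop as unused work. Second, a warm-up call keeps compilation time out of the timed call. The name `affine_loops` and the small sizes are illustrative only, and the `jit` fallback exists just so the sketch also runs without numba.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Fallback so the sketch still runs (slowly) without numba installed.
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True)
def affine_loops(X, y, gamma, b, out):
    """Write X.dot(y) * gamma + b into out using explicit loops."""
    n, d = X.shape
    for i in range(n):
        acc = 0.0
        for j in range(d):
            acc += X[i, j] * y[j]
        out[i] = acc * gamma + b[i]
    return out


# Deliberately small, illustrative sizes (not the ones used in affine.py).
n, d = 200, 50
X = np.random.rand(n, d)
y = np.random.rand(d)
b = np.random.rand(n)
gamma = 0.01
out = np.empty(n)

affine_loops(X, y, gamma, b, out)  # warm-up call: triggers compilation
matches = np.allclose(out, X.dot(y) * gamma + b)

start = time.time()
affine_loops(X, y, gamma, b, out)
elapsed = time.time() - start
print("matches numpy: %s, elapsed: %f [s]" % (matches, elapsed))
```

Checking the output against the numpy expression first makes the timing easier to trust, since both versions are then known to compute the same thing.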