KZKY memo

Notes to self.

python: speeding up matrix operations

Inspired by this post and this one, I tried numba.
Bottom line: it is extremely fast. If you do heavy numerical computation, you should definitely use it.

Installation

On Ubuntu 14.04, without Anaconda.

llvm3.5 / llvmlite
sudo apt-get install -y "llvm-3.5*"
sudo apt-get install -y libedit-dev enum

cd /usr/bin/
sudo ln -s llvm-config-3.5 llvm-config
sudo ldconfig

git clone https://github.com/numba/llvmlite
cd llvmlite
sudo python setup.py install
numba
git clone https://github.com/numba/numba.git
cd numba
sudo pip install -r requirements.txt
python setup.py build_ext --inplace
sudo pip install funcsigs
sudo python setup.py install
error
File "/usr/local/lib/python2.7/dist-packages/numba/compiler.py", line 199, in run
'Module' object has no attribute 'get_global'

If an error like the one above occurs, uninstall both llvmlite and numba with

pip uninstall 

and then install them again.
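After reinstalling, a quick check like the one below can show whether both packages import and which versions are picked up. This is only a minimal sketch: the helper name `report_versions` is made up for this memo, and it assumes both packages expose `__version__`.

```python
import importlib


def report_versions(names=("llvmlite", "numba")):
    """Return {package name: version string, or None if the import fails}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A broken install can fail with more than ImportError,
            # so treat any exception as "not usable".
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in sorted(report_versions().items()):
        print("%s: %s" % (name, version or "not importable"))
```

If either line reports "not importable", the uninstall and reinstall step probably did not take effect for that package.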

Installation (ubuntu14.04 with llvm3.7)

Get signature
wget -O - http://apt.llvm.org/llvm-snapshot.gpg.key|sudo apt-key add -
Add source list
# 3.7 
deb http://apt.llvm.org/trusty/ llvm-toolchain-trusty-3.7 main
deb-src http://apt.llvm.org/trusty/ llvm-toolchain-trusty-3.7 main
Update and install llvm
sudo apt-get update
sudo apt-get install "llvm-3.7*"
llvmlite installation
git clone https://github.com/numba/llvmlite.git
sudo su
cd llvmlite
export LLVM_CONFIG=/usr/bin/llvm-config-3.7
python setup.py install
numba installation
git clone https://github.com/numba/numba.git
sudo su
cd numba
python setup.py build_ext --inplace
python setup.py install
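To confirm the build actually works, compiling and calling one tiny function is usually enough. A minimal sketch, not part of the original steps: `sum_to` and `smoke_test` are hypothetical names, and the check returns `None` when numba cannot be imported at all.

```python
def sum_to(n):
    """Sum of 0, 1, ..., n - 1, written as a plain loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def smoke_test():
    """Return True if the jitted function agrees with plain Python,
    or None if numba is not importable."""
    try:
        from numba import jit
    except ImportError:
        return None
    compiled = jit(sum_to)  # same as decorating sum_to with @jit
    return compiled(10) == sum_to(10)


if __name__ == "__main__":
    print("numba smoke test: %s" % smoke_test())
```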

Getting Started

See here and here.

Affine speed comparison

Environment

  • Ubuntu14.04
  • CPU: Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz

affine.py

#!/usr/bin/env python

import numpy as np
import time

from numba import jit

# dataset
n = 10000
d = 1000
X_ = np.random.rand(n, d)
y_ = np.random.rand(d)
b_ = np.random.rand(n)
gamma_ = 0.01
max_itr = 1000

# numpy
st0 = time.time()
for i in xrange(0, max_itr):
    X_.dot(y_) * gamma_ + b_
    pass
et0 = time.time()
print "elapsed time (numpy): %f [s]" % (et0 - st0)


# numba
@jit
def affine(X, y, gamma, b):
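    """Loop-based stand-in for X.dot(y) * gamma + b.

    Each term is evaluated and discarded; nothing is stored or returned.
    """
    n = X.shape[0]
    d = X.shape[1]

    # Calling dot inside the jitted function performed too badly,
    # so stick to direct element-wise manipulation instead of:
    #for i in xrange(0, max_itr):
    #    X_.dot(y_) * gamma_ + b_
    #    pass

    # Use the arguments (not the module-level gamma_, b_) and give the
    # repetition loop its own variable so it does not shadow the row index.
    for itr in xrange(0, max_itr):
        for i in range(n):
            for j in range(d):
                X[i, j] * y[j] * gamma + b[i]
                pass
            pass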
            
st0 = time.time()
affine(X_, y_, gamma_, b_)
et0 = time.time()
print "elapsed time (numba): %f [s]" % (et0 - st0)

# numba (type-specified)
@jit("(f8[:, :], f8[:], f8, f8[:])")
def affine_type_specified(X, y, gamma, b):
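    """Same loops as affine(), compiled for an explicit type signature.

    Each term is evaluated and discarded; nothing is stored or returned.
    """
    n = X.shape[0]
    d = X.shape[1]

    # Calling dot inside the jitted function performed too badly,
    # so stick to direct element-wise manipulation instead of:
    #for i in xrange(0, max_itr):
    #    X_.dot(y_) * gamma_ + b_
    #    pass

    # Use the arguments (not the module-level gamma_, b_) and give the
    # repetition loop its own variable so it does not shadow the row index.
    for itr in xrange(0, max_itr):
        for i in range(n):
            for j in range(d):
                X[i, j] * y[j] * gamma + b[i]
                pass
            pass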
            
st0 = time.time()
affine_type_specified(X_, y_, gamma_, b_)
et0 = time.time()
print "elapsed time (numba type-specified): %f [s]" % (et0 - st0)

Results

elapsed time (numpy): 15.048479 [s]
elapsed time (numba): 0.414315 [s]
elapsed time (numba type-specified): 0.000007 [s]
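One caveat on these numbers: the loops in affine.py evaluate each term and throw it away, so an optimizing compiler may drop the work entirely, and the first call to a lazily compiled function also pays for compilation. Below is a minimal sketch of a variant that stores its result, does one warm-up call, and checks the output against numpy. The names `affine_loops` and `timed` are made up here, the sizes are deliberately small, and when numba is not importable the decorator falls back to a no-op so the script still runs.

```python
import time

import numpy as np

try:
    from numba import jit
except ImportError:
    # Assumption: without numba, run the same loops as plain Python.
    def jit(func):
        return func


@jit
def affine_loops(X, y, gamma, b, out):
    """Write X.dot(y) * gamma + b into out using explicit loops."""
    n = X.shape[0]
    d = X.shape[1]
    for i in range(n):
        acc = 0.0
        for j in range(d):
            acc += X[i, j] * y[j]
        out[i] = acc * gamma + b[i]


def timed(func, *args):
    """Return the wall-clock seconds taken by one call of func(*args)."""
    start = time.time()
    func(*args)
    return time.time() - start


n, d = 200, 50
rng = np.random.RandomState(0)
X = rng.rand(n, d)
y = rng.rand(d)
b = rng.rand(n)
gamma = 0.01
out = np.empty(n)

first = timed(affine_loops, X, y, gamma, b, out)   # warm-up, may include compilation
second = timed(affine_loops, X, y, gamma, b, out)  # steady-state call
expected = X.dot(y) * gamma + b

print("first call: %f [s], second call: %f [s]" % (first, second))
print("matches numpy: %s" % np.allclose(out, expected))
```

Because the result lands in `out`, the compiler cannot skip the inner loops, and comparing against `expected` confirms that the loop version computes the same affine map as the numpy expression.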

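Keep using numpy arrays, but write the computation as plain loops and decorate the function with jit.
(When I reduced the matrix dimensions (n, d), the speed ranking was numba-typed > numpy > numba.)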
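One possible explanation for that ranking (my assumption, not something the measurements above establish): with a type signature, jit compiles at decoration time, while plain @jit compiles on the first call, so for small inputs the untyped version's measured time is mostly compilation. A minimal sketch for checking this, with the hypothetical names `scale_sum` and `time_call`; the numba part is skipped if numba is not importable.

```python
import time

import numpy as np

try:
    from numba import jit
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False


def scale_sum(x, gamma):
    """Sum of x[i] * gamma over a 1-D array, as a plain loop."""
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * gamma
    return total


def time_call(func, *args):
    """Return (result, wall-clock seconds) for one call of func(*args)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start


x = np.arange(5, dtype=np.float64)
timings = {}
if HAVE_NUMBA:
    lazy = jit(scale_sum)                    # compiled on the first call
    eager = jit("f8(f8[:], f8)")(scale_sum)  # compiled here, at decoration time
    _, timings["lazy, first call"] = time_call(lazy, x, 2.0)
    _, timings["lazy, second call"] = time_call(lazy, x, 2.0)
    _, timings["eager, first call"] = time_call(eager, x, 2.0)
reference, timings["plain python"] = time_call(scale_sum, x, 2.0)

for label in sorted(timings):
    print("%s: %f [s]" % (label, timings[label]))
```

If the assumption holds, "lazy, first call" should be far larger than the other entries, while the eager version already paid its compilation cost before the timer started.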