KZKY memo

Notes to myself.

python: multiprocessing 2

Last time I unzipped a zipfile with Python multithreading and confirmed that it did not get any faster, so this time I checked what happens with

  • image 2 ndarray
  • pickle 2 ndarray

The pickle files were created with ndarray.dump.

Experimental setup and environment

  • OS
    • Ubuntu14.04
  • CPU
    • Intel(R) Core(TM) i5-2540M CPU @ 2.60GHz
  • Disk
    • Hitachi HTS54756
  • FS
  • Library
    • scipy: 0.13.3
    • opencv: 2.4.8
    • pil : 1.1.7
  • Data
    • cifar10 (obtained from Kaggle, so the images are jpeg)
    • 50000 images
    • total size: 197M
    • average size: 3.94 K

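Experiments

Experiment 1
  • Compare imread in scipy, PIL, and opencv2
  • Measure the image2ndarray time for each

Experiment 2

  • Pick the library that was fastest in Experiment 1
  • Measure the speed when multithreaded
    • Compare the speed of the two cases, image2ndarray and pickle2ndarray

Experiment 1

Code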

  • comp_pil_imread.py
#!/usr/bin/env python

"""
To check whether there is a difference in speed.
"""

from PIL import Image
from scipy.misc import imread
import cv2
import numpy as np
import glob
import time

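# `profile` is injected by kernprof -l (line_profiler); run via comp_imread.sh
@profile
def read_with_pil():
    """Read every training image into an ndarray with PIL."""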

    base_dirpath = "/home/kzk/datasets/cifar10/train"
    filepaths = glob.glob("{}/*".format(base_dirpath))
    for i, filepath in enumerate(filepaths):
        if i % 1000 == 0:
            print i
        I = np.asarray(Image.open(filepath))
        
def main():
    read_with_pil()
    pass

if __name__ == '__main__':
    main()
  • comp_cv2_imread.py
#!/usr/bin/env python

"""
To check whether there is a difference in speed.
"""

from PIL import Image
from scipy.misc import imread
import cv2
import numpy as np
import glob
import time

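# `profile` is injected by kernprof -l (line_profiler); run via comp_imread.sh
@profile
def read_with_cv2():
    """Read every training image into an ndarray with cv2.imread."""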

    base_dirpath = "/home/kzk/datasets/cifar10/train"
    filepaths = glob.glob("{}/*".format(base_dirpath))
    
    for i, filepath in enumerate(filepaths):
        if i % 1000 == 0:
            print i
        I = cv2.imread(filepath)
        
def main():
    read_with_cv2()
    pass

if __name__ == '__main__':
    main()
  • comp_scipy_imread.py
#!/usr/bin/env python

"""
To check whether there is a difference in speed.
"""

from PIL import Image
from scipy.misc import imread
import cv2
import numpy as np
import glob
import time

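# `profile` is injected by kernprof -l (line_profiler); run via comp_imread.sh
@profile
def read_with_scipy():
    """Read every training image into an ndarray with scipy's imread."""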

    base_dirpath = "/home/kzk/datasets/cifar10/train"
    filepaths = glob.glob("{}/*".format(base_dirpath))
    for i, filepath in enumerate(filepaths):
        if i % 1000 == 0:
            print i
        I = imread(filepath)
        
def main():
    read_with_scipy()
    pass

if __name__ == '__main__':
    main()
  • comp_imread.sh
    • shell script that runs the experiments
#!/bin/sh

# scipy (457.126190)
echo 1 > /proc/sys/vm/drop_caches
kernprof -l comp_scipy_imread.py

# pil (457.224693)
echo 1 > /proc/sys/vm/drop_caches
kernprof -l comp_pil_imread.py

# opencv (441.201759)
echo 1 > /proc/sys/vm/drop_caches
kernprof -l comp_cv2_imread.py

The disk cache is dropped before every run.

Results

Of the line_profiler output, only the relevant line is recorded here.

method    sec       sec/file
scipy     457.13    0.0091426
pil       457.22    0.0091444
opencv    441.20    0.008824
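
The sec/file column is simply the total time divided by the 50000 files. Without line_profiler, roughly the same measurement could be taken with a small stdlib timer. This is only a sketch: `time_reader` and `read_bytes` are hypothetical names that do not appear in the scripts above, and it is written for Python 3, unlike the Python 2 listings in this memo.

```python
import time


def time_reader(read_fn, filepaths):
    """Call read_fn on every path; return (total_sec, sec_per_file)."""
    start = time.perf_counter()
    for filepath in filepaths:
        read_fn(filepath)
    total = time.perf_counter() - start
    per_file = total / len(filepaths) if filepaths else 0.0
    return total, per_file


def read_bytes(filepath):
    """Stand-in reader (assumption): swap in cv2.imread or similar."""
    with open(filepath, "rb") as f:
        return f.read()


# Per-file times recomputed from the totals reported in the table.
NUM_FILES = 50000
totals = {"scipy": 457.13, "pil": 457.22, "opencv": 441.20}
per_file = {name: sec / NUM_FILES for name, sec in totals.items()}
```

Passing the real reader (for example `cv2.imread`) as `read_fn` would time the whole loop rather than a single line, so the numbers would not be directly comparable with the line_profiler figures.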

There is no notable difference in speed. Is the core implementation the same in all of them?

Looking at the following file,

  • /usr/local/lib/python2.7/dist-packages/scipy-0.13.3-py2.7-linux-x86_64.egg/scipy/ndimage/io.py
...
def imread(fname, flatten=False, mode=None):
    """
    Load an image from file.

    Parameters
...

we can see that it uses PIL.Image.

As for opencv,

  • /usr/lib/python2.7/dist-packages/cv.py
from cv2.cv import *

Since cv2 is cv.so, it presumably uses the Python interface to the imread of OpenCV itself (written in C/C++).

For now, I pick opencv, which was the fastest.

Experiment 2

  • Run with 3 threads.

Code

  • thread_sample_imread.py
#!/usr/bin/python
# -*- coding: utf-8 -*-

# url
## http://ja.pymotw.com/2/threading/
## http://docs.python.jp/2.7/library/threading.html

import threading
import glob
import time
import Queue
import cv2
import pickle

# http://docs.python.jp/2/library/queue.html#module-Queue

# read jpeg to ndarray
def read_jpeg_2_ndarray(filepath):
    return cv2.imread(filepath)

# queue
queue = Queue.Queue()

# worker
def worker():
    """Take paths off the queue forever and read each one."""
    while True:
        path = queue.get()
        read_jpeg_2_ndarray(path)
        queue.task_done()
        pass
    pass

# put task
base_dirpath = "/home/kzk/datasets/cifar10/train"
filepaths = glob.glob("{}/*".format(base_dirpath))
for filepath in filepaths:
    queue.put(filepath)
    pass

# create/start  thread
st = time.time()
num_worker_threads = 3
for i in range(num_worker_threads):
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
    pass

queue.join()  # wait
et = time.time()
print "total execution time with threading: ", (et - st), "[s]"
  • thread_sample_pklread.py
#!/usr/bin/python
# -*- coding: utf-8 -*-

# url
## http://ja.pymotw.com/2/threading/
## http://docs.python.jp/2.7/library/threading.html

import threading
import glob
import time
import Queue
import cv2
import pickle

# http://docs.python.jp/2/library/queue.html#module-Queue

# read pickle to ndarray
def read_pkl_2_ndarray(filepath):
    # ndarray.dump writes a pickle; open in binary mode and close the file
    with open(filepath, "rb") as f:
        return pickle.load(f)

# queue
queue = Queue.Queue()

# worker
def worker():
    """Take paths off the queue forever and read each one."""
    while True:
        path = queue.get()
        read_pkl_2_ndarray(path)
        queue.task_done()
        pass
    pass

# put task
base_dirpath = "/home/kzk/datasets/cifar10/train_pkl"
filepaths = glob.glob("{}/*".format(base_dirpath))
for filepath in filepaths:
    queue.put(filepath)
    pass

# create/start  thread
st = time.time()
num_worker_threads = 3
for i in range(num_worker_threads):
    t = threading.Thread(target=worker)
    t.daemon = True
    t.start()
    pass

queue.join()  # wait
et = time.time()
print "total execution time with threading: ", (et - st), "[s]"
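
For reference, the same worker pattern can be sketched in Python 3 with `concurrent.futures` instead of a hand-rolled queue. This is not the code that produced the numbers below: `read_all_threaded` and `read_bytes` are hypothetical names, and the stand-in reader only reads raw bytes so that the sketch runs without OpenCV.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def read_bytes(filepath):
    """Stand-in reader (assumption): swap in cv2.imread or pickle.load."""
    with open(filepath, "rb") as f:
        return f.read()


def read_all_threaded(read_fn, filepaths, num_workers=3):
    """Apply read_fn to every path using a pool of worker threads.

    Returns (results, elapsed_sec); results keep the order of filepaths.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(read_fn, filepaths))
    return results, time.perf_counter() - start
```

Unlike the scripts above, which discard each array, this version keeps all results in memory, so for a real dataset it may be preferable to return nothing from `read_fn`.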

Results

condition                   sec
cv2.imread                  332.17
pickle                      358.31
cv2.imread (disk cached)    3.43
pickle (disk cached)        21.45

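In terms of read speed (where > means faster),
image (jpeg) 2 ndarray > pickle 2 ndarray
but the gap looks like it is within the margin of error (I won't pursue it further).

With the disk cache warm, the result was again
image (jpeg) 2 ndarray > pickle 2 ndarray
If you fetch cifar10 from Kaggle and use the train set, you should get similar results.

As for the speedup from multithreading:

441.201759/332.17 =~ 1.328240837522955

About 1.3x, I guess...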
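
The arithmetic can be reproduced from the reported times. The second ratio, cold cache versus disk cached, is not computed in the memo itself; it is added here only as one possible reading of the same numbers, and the variable names are made up for this sketch.

```python
# Times in seconds, copied from the results above.
single_thread_cv2 = 441.201759  # experiment 1, opencv, cold cache
threaded_cv2 = 332.17           # experiment 2, 3 threads, cold cache
threaded_cv2_cached = 3.43      # experiment 2, 3 threads, disk cached

# Speedup from going from 1 thread to 3 threads (cold cache).
speedup = single_thread_cv2 / threaded_cv2

# How much slower the cold-cache run is than the cached run.
cold_vs_cached = threaded_cv2 / threaded_cv2_cached

print("speedup with 3 threads: %.2fx" % speedup)
print("cold cache / disk cached: %.1fx" % cold_vs_cached)
```

The cold-cache run takes close to 100 times as long as the cached one, which would be consistent with disk reads dominating the cold-cache timings, although that was not measured separately here.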