TensorFlow: How To

At the time of writing, 0.6.0 is the latest release, so these notes are based on that version.

Variables: Creation, Initializing, Saving, and Restoring

A Variable is an in-memory buffer, so once training is finished you will want to persist it for evaluation and the like.

This section explains the following classes:

The tf.Variable class
The tf.train.Saver class

Creation

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")

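That is the basic pattern.

Note that the shape is passed as a list, not a tuple.
tf.Variable adds the following ops to the graph:

A variable op
An initializer op, which actually sets the initial value. It is really a tf.assign op.
The ops for the initial value; in the example above, random_normal and zeros.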

Initialization

Variables must be initialized before any op that uses them is run.
The simplest way is tf.initialize_all_variables().
Variable values can also be restored from a checkpoint.

For example:

import tensorflow as tf 

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Later, when launching the model
with tf.Session() as sess:
  # Run the init operation.
  sess.run(init_op)
  ...
  # Use the model
  ...

tf.initialize_all_variables() returns init_op, so call sess.run(init_op) at the very start of the session.

Initialization from another Variable

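Note that tf.initialize_all_variables() initializes all variables in parallel, so take care when one variable's initial value depends on another.

A variable can also be initialized from the initial value of another Variable, using initialized_value():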

import tensorflow as tf 

sess = tf.InteractiveSession()

# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")

Check that the initial values are the same.

# Initializing w2 also initializes its dependency, weights.
w2.initializer.run()
w2.eval()
weights.eval()

Custom Initialization

tf.initialize_all_variables() initializes every variable, but it is also possible to initialize only a subset.

Saving and Restoring

tf.train.Saver adds two ops to the graph:

save
restore

They can be applied to a subset of the variables in the graph rather than to all of them.

Checkpoint Files

Variables are saved in a binary file that is, roughly speaking, a map from Variable names to Tensor values.

Saving Variables

The following saves v1 and v2 to /tmp/model.ckpt.

saver.py

import tensorflow as tf
import numpy as np

# Create some variables.

v1 = tf.Variable(np.random.rand(10, 5), name="v1")
v2 = tf.Variable(np.random.rand(5, 3), name="v2")
prod = tf.matmul(v1, v2)

# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)

    # Run prod
    result = sess.run(prod)
    print(result)

    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print("Model saved in file: %s" % save_path)

Restoring Variables

When restoring, the variables do not need to be initialized beforehand.

For example:

restore.py

import tensorflow as tf
import numpy as np

v1 = tf.Variable(np.random.rand(10, 5), name="v1")
v2 = tf.Variable(np.random.rand(5, 3), name="v2")
prod = tf.matmul(v1, v2)

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")

    result = sess.run(prod)
    print(result)  # Same result as before saving.

Is it right to understand that you cannot restore unless you know the original variables and operations?

Choosing which Variables to Save and Restore

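Passing the Saver a dict of the form

key: value = "variable name": variable

lets you:

Save only a subset of the variables in the graph
Choose the names used in the checkpoint file when saving and restoring

If no dict is given, the Saver handles all the variables in the graph.

Multiple Savers can be used, and when the same Variable is listed in several Savers, its value changes only when a restore is run.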
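
The idea that a checkpoint is roughly a map from names to values, and that a name-to-variable dict selects what is saved and under which name, can be illustrated without TensorFlow. The sketch below is a toy stand-in and not how tf.train.Saver is implemented: ToyVariable, ToySaver, and the JSON file format are all hypothetical.

```python
import json
import os
import tempfile


class ToyVariable(object):
    """Hypothetical stand-in for a variable: a name plus a mutable value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class ToySaver(object):
    """Saves and restores only the variables listed in a name -> variable dict."""

    def __init__(self, var_dict):
        self.var_dict = dict(var_dict)

    def save(self, path):
        # The "checkpoint" is just a map from checkpoint name to value.
        with open(path, "w") as f:
            json.dump({k: v.value for k, v in self.var_dict.items()}, f)
        return path

    def restore(self, path):
        with open(path) as f:
            stored = json.load(f)
        for key, var in self.var_dict.items():
            var.value = stored[key]


v1 = ToyVariable("v1", [1.0, 2.0])
v2 = ToyVariable("v2", [3.0])

# Save only v2, under the checkpoint name "my_v2".
saver = ToySaver({"my_v2": v2})
ckpt = saver.save(os.path.join(tempfile.mkdtemp(), "toy.ckpt"))

# Overwrite both values, then restore: only v2 comes back.
v1.value, v2.value = [0.0, 0.0], [0.0]
saver.restore(ckpt)
```

Because v1 was never handed to the saver, it keeps whatever value it had before the restore.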

TensorFlow Mechanics 101

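A hello-world-style MLP on MNIST.

It is quickest to simply fetch the repository with

git clone https://github.com/tensorflow/tensorflow.git

The sample code is in

./tensorflow/tensorflow/examples/tutorials/mnist

Note that no convolution is used here; it is a plain MLP.

Honestly, stepping through the sample code in a debugger and reading the API reference is the faster way to understand it.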

Prepare the Data

Download

The download is handled inside the code, so there is nothing to think about.
The training, validation, and test datasets break down as follows:

Dataset	# Samples
training	55000
validation	5000
test	10000

Inputs and Placeholders

The inputs use feeds (placeholders) rather than Variables.

images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size,))

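As shown, the image tensor has shape (batch_size, height*width).
This is because each image from Yann LeCun's MNIST data is handled as a flattened 1-D tensor; in the MNIST sample data itself, height and width are kept as separate dimensions.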

Build the Graph

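The graph is built in three steps:

inference(): builds the graph and ops for forward propagation
loss(): adds the ops that compute the loss to the inference graph
training(): adds the optimization ops to the loss graph

This is the convention for classification problems.

Both the loss and the evaluation are built on the output of inference.

The functions in mnist.py are probably the typical way to structure a classification problem.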

Inference

This explains what is in mnist.py.

with tf.name_scope('hidden1') as scope:

Variables created inside this block get the prefix "hidden1", for example "hidden1/weights".

Inside the name scope, the code looks like this:

weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')

hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)

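Weights are apparently initialized with tf.truncated_normal as a rule.

This pattern is repeated for three layers; the last one (linear softmax) computes the logits without a nonlinearity.

logits = tf.matmul(hidden2, weights) + biases

The softmax itself is applied later, inside the loss function.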

Loss

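Because this is a classification problem, the labels are converted to a one-hot representation. For example,
with label=3 and 10 classes, the label becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].

The conversion looks like this: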

batch_size = tf.size(labels)
labels = tf.expand_dims(labels, 1)
indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
concated = tf.concat(1, [indices, labels])
onehot_labels = tf.sparse_to_dense(
    concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)

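NUM_CLASSES appears here, so the number of classes must be known in advance.

This is hard to read because it is done entirely with TensorFlow ops, but you can just as well do the conversion yourself. All that matters is that you end up with a one-hot array of shape [batch, num_classes], wrapped in a Tensor object, that can be combined with the logits to compute the cross-entropy.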
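
As a minimal sketch of "doing it yourself", the conversion can be written in plain Python before the data is handed to TensorFlow. The function name one_hot is hypothetical, and the result is a nested list rather than a Tensor.

```python
def one_hot(labels, num_classes):
    """Convert a list of class indices into [batch, num_classes] one-hot rows."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label %d is outside [0, %d)" % (label, num_classes))
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# Two labels, ten classes: the first row has its 1.0 at index 3.
onehot_labels = one_hot([3, 0], num_classes=10)
```

As in the TensorFlow version, the number of classes has to be supplied up front.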

Finally, softmax is applied, the cross-entropy is computed, and the result is averaged over the batch.

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                        onehot_labels,
                                                        name='xentropy')

loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
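
To make the arithmetic behind these two lines concrete, here is a plain-Python sketch of softmax cross-entropy followed by a batch mean. It only illustrates the math; the function names are hypothetical, and the max-subtraction trick is a standard stability measure that this sketch assumes rather than something taken from the TensorFlow source.

```python
import math


def softmax_cross_entropy(logits, onehot):
    """Cross-entropy between softmax(logits) and a one-hot row."""
    # Subtract the max logit before exponentiating, for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    # -sum_i onehot_i * log(softmax_i), where log(softmax_i) = logit_i - log_sum.
    return -sum(t * (x - log_sum) for x, t in zip(logits, onehot))


def mean_loss(batch_logits, batch_onehot):
    """Average the per-example cross-entropy over the batch."""
    losses = [softmax_cross_entropy(l, t)
              for l, t in zip(batch_logits, batch_onehot)]
    return sum(losses) / len(losses)


# Uniform logits over 10 classes give a loss of log(10) whatever the label is.
loss = mean_loss([[0.0] * 10, [0.0] * 10],
                 [[1.0] + [0.0] * 9, [0.0] * 9 + [1.0]])
```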

Training

training() takes the loss, creates an optimizer, and uses it to minimize the loss.
This is worth remembering as a convention.

optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)

Train the Model

This explains what is in fully_connected_feed.py.

The Graph

with tf.Graph().as_default():

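This places all the ops built inside the block into a single graph so that they can be executed as a group. A graph is a collection of ops. One graph is usually enough, but multiple graphs can also be used.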

The Session

Once all the required ops have been built, create a session and run them.
As a rule, initialize the variables first.

sess = tf.Session()
init_op = tf.initialize_all_variables()
sess.run(init_op)

When tf.Session is called with no arguments, the code attaches to the default local session (creating it if necessary).

Train Loop

Run the training. Anything that controls training goes in this loop.

for step in range(max_steps):
    sess.run(train_op)

*** Feed the Graph

This sample supplies data through feeds.
It resembles how theano.function() is used, with the updates argument turned into an op.

*** Check the Status

for step in range(FLAGS.max_steps):
    feed_dict = fill_feed_dict(data_sets.train,
                               images_placeholder,
                               labels_placeholder)
    _, loss_value = sess.run([train_op, loss],
                             feed_dict=feed_dict)

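train_op is an Operation and returns no value, so its result is discarded; loss returns a value, which is captured. When the fetches argument is a list, the results come back as a list with one entry per fetch (NumPy arrays for tensors).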

*** Visualize the Status

To visualize with TensorBoard, create:

summary_op
SummaryWriter

Note that SummaryWriter is separate from the Saver that writes checkpoints; it exists only for visualization.

While building the graph, merge all summaries into a single summary_op:

summary_op = tf.merge_all_summaries()

Once the session exists, create a SummaryWriter:

summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                        graph_def=sess.graph_def)

Then run summary_op and add the result to the writer:

summary_str = sess.run(summary_op, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)

*** Save a Checkpoint

Checkpoints are written so that the model can be restored later.

saver.save(sess, FLAGS.train_dir, global_step=step)

Evaluate the Model

Build the Eval Graph

Eval Output

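Little explanation is needed. The only notable point is the op

eval_correct = tf.nn.in_top_k(logits, labels, 1)

which marks an example as correct when the true label is among the top k predictions (k=1 here). labels is a Tensor object holding batch_size class indices.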
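
The top-k check itself is simple enough to sketch in plain Python. This is an illustration of the idea, not the TensorFlow kernel; in particular, the way ties are handled below is an assumption of this sketch.

```python
def in_top_k(predictions, targets, k):
    """For each row, report whether the target class is among the k best scores.

    Tie handling is an assumption of this sketch: the target counts as a hit
    when fewer than k classes score strictly higher than it does.
    """
    results = []
    for scores, target in zip(predictions, targets):
        higher = sum(1 for s in scores if s > scores[target])
        results.append(higher < k)
    return results


logits = [[0.1, 2.0, 0.3],   # best class: 1
          [1.5, 0.2, 0.9]]   # best class: 0
labels = [1, 2]

correct = in_top_k(logits, labels, 1)
precision = sum(correct) / float(len(correct))
```

Summing the boolean results and dividing by the number of examples gives the precision at k, which is how the evaluation uses the op's output.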

TensorBoard: Visualizing Learning

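TensorBoard is a tool for visualizing things such as how parameters evolve during training and how the objective function changes over time. The data to visualize is serialized as Summary protobuf objects and written to disk through a SummaryWriter. TensorBoard itself is a Python HTTP server that reads those files. It does not use a web application framework; it simply extends BaseHTTPServer.HTTPServer.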

See

/usr/local/lib/python2.7/dist-packages/tensorflow/tensorboard/tensorboard.py

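Basic usage

Attach a summary op to each node you want to visualize
- For example, to watch the value of the loss function:
- tf.scalar_summary(loss.op.name, loss)
- Summary ops are add-ons to the graph
- None of the ops you normally run depend on the summary nodes, so the summary ops must be run separately
Merge them all into one op with tf.merge_all_summaries
- The summary nodes have to be run separately, and running them one by one is tedious, hence the merge
Create a summary_writer
- To visualize the graph as well, pass the GraphDef as an argument
Pass the result of running the summary op (summary_str) to summary_writer.add_summary(summary_str, step)
- Running it every n steps is better than running it at every step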

Official sample code

It does exactly what the basic usage above describes.

TensorBoard: Graph Visualization

Once the summaries have been written to disk, launch tensorboard with that directory as an argument.

python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory

A nice-looking graph appears.

Name scoping and nodes

Keep this in mind:

The better your name scopes, the better your visualization

If you write your scopes like this,

import tensorflow as tf

with tf.name_scope('hidden') as scope:
  a = tf.constant(5, name='alpha')
  W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights')
  b = tf.Variable(tf.zeros([1]), name='biases')

In the visualization this appears as a single node labeled hidden, which you can click to see the details.

Interaction

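It is quicker to run TensorBoard and look at a real graph.

The icons are explained in the official documentation.

As for node colors, it is worth remembering at least that there are two color schemes, Structure view and Device view.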

Reading Data

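There are three ways to get data into a program:

Feeding: Python code supplies the data at every step, using placeholders.
Reading from files: an input pipeline at the start of the graph reads the data from files.
Preloaded data: a Constant or Variable holds all the data. This is for small datasets; since computation is presumably done on the GPU, that means data that fits in GPU memory.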

Feeding

The quickest way to see how data is fed through placeholders is to read the fill_feed_dict function in fully_connected_feed.py.

Reading from files

The basic pipeline:

The list of filenames
Shuffling of the list (optional)
An epoch limit (optional)
Filename queue
A Reader for the file format
A Decoder for the records in the file
Preprocessing (optional)
Example queue

Filenames, shuffling, and epoch limits

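This stage starts from a list of data files and ends with a FIFO queue from which they are read.

The file list is given either as

a constant string Tensor built from a list such as ["file0", "file1"] or [("file%d" % i) for i in range(2)], or
the result of tf.train.match_filenames_once, which takes a glob pattern.

It is then passed to tf.train.string_input_producer.

tf.train.string_input_producer takes shuffle and num_epochs arguments. With shuffle=True, the file list is shuffled every epoch. The queue runner adds the whole file list to the queue exactly once per epoch. Because this is a shuffle, the sampling is uniform (no under- or over-sampling). The queue runner works in a thread separate from the reader (which pulls filenames from the queue), so it does not block the reader.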
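
The per-epoch behavior described above (every file exactly once per epoch, reshuffled each epoch, with an optional epoch limit) can be sketched as a plain-Python generator. This models the ordering only, not the queue or the threading; filename_producer is a hypothetical name, and where the real op signals the end of the last epoch with an OutOfRangeError, this generator simply stops.

```python
import random


def filename_producer(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yield filenames epoch by epoch; every file appears once per epoch."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(filenames)
        if shuffle:
            rng.shuffle(order)  # New order each epoch; nothing dropped or repeated.
        for name in order:
            yield name
        epoch += 1


files = ["file0", "file1", "file2"]
produced = list(filename_producer(files, num_epochs=2, seed=0))
```

With num_epochs=None the generator never ends, which mirrors an input pipeline without an epoch limit.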

File formats

Pick the Reader that matches the input file format and pass the filename queue to its read method. read returns a key identifying the file and record, and the record's string value.

*** CSV files

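To read CSV files, use the tf.TextLineReader class together with tf.decode_csv.
TextLineReader.read reads one line and returns a key and a value; pass the value and record_defaults to the decode_csv op, which returns a list of Tensors. record_defaults determines the type of each Tensor in that list and the default used when a column in value is empty.

Judging from the sample, this is meant for the very common format in which each line holds one (sample, label) pair and the sample is a vector.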
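
The role of record_defaults (one entry per column, fixing both the type and the fallback value) can be sketched with the standard csv module. This is a plain-Python illustration of the idea rather than the decode_csv op; decode_csv_line is a hypothetical name.

```python
import csv


def decode_csv_line(line, record_defaults):
    """Parse one CSV line; defaults fix each column's type and fill blanks."""
    fields = next(csv.reader([line]))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d columns, got %d"
                         % (len(record_defaults), len(fields)))
    columns = []
    for text, (default,) in zip(fields, record_defaults):
        if text == "":
            columns.append(default)              # Empty column: use the default.
        else:
            columns.append(type(default)(text))  # Otherwise cast to its type.
    return columns


# Four float features and a string label; the third feature is missing.
record_defaults = [[1.0], [1.0], [1.0], [1.0], [""]]
row = decode_csv_line("5.1,3.5,,0.2,Iris-setosa", record_defaults)
```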

As a quick test, fetch iris.data and run the sample on it.

Make three copies of the iris file (iris0.csv, iris1.csv, iris2.csv), then run the sample.

Sample

import tensorflow as tf

filename_queue = tf.train.string_input_producer(["iris0.csv", "iris1.csv", "iris2.csv"])
                                                
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1.0], [1.0], [1.0], [1.0], [""]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)

#features = tf.concat(0, [col1, col2, col3, col4])
features = tf.pack([col1, col2, col3, col4])

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for i in range(1200):
        # Retrieve a single instance:
        example, label = sess.run([features, col5])
        print("Example: {}".format(example))
        print("Label: {}".format(label))

    coord.request_stop()
    coord.join(threads)

With tf.concat, each col has shape 'TensorShape([])', that is, it is a scalar, so it apparently cannot be concatenated. That is why tf.pack is used instead.

*** Fixed length records

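For binary files in which every record has the same fixed length, use tf.FixedLengthRecordReader together with tf.decode_raw. decode_raw converts the string value into a Tensor.

The CIFAR-10 sample uses this (see cifar10_input).
Each value is 3073 bytes: the first byte is the label, and the remaining 3072 bytes are the image, laid out as (depth, height, width) = (3, 32, 32).

The sample extracts the two parts by slicing.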
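
The slicing can be sketched in plain Python on a raw byte string. The record layout follows the description above; the function name parse_record and the synthetic test record are hypothetical, and the depth-major ordering of the pixel bytes is assumed.

```python
LABEL_BYTES = 1
DEPTH, HEIGHT, WIDTH = 3, 32, 32
IMAGE_BYTES = DEPTH * HEIGHT * WIDTH          # 3072
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES      # 3073


def parse_record(record):
    """Split one fixed-length record into (label, image[depth][row][col])."""
    if len(record) != RECORD_BYTES:
        raise ValueError("expected %d bytes, got %d" % (RECORD_BYTES, len(record)))
    data = bytearray(record)                  # Integer access on Python 2 and 3.
    label = data[0]                           # First byte: the label.
    pixels = data[LABEL_BYTES:]               # Remaining bytes: the image.
    image = [[list(pixels[(d * HEIGHT + r) * WIDTH:(d * HEIGHT + r + 1) * WIDTH])
              for r in range(HEIGHT)]
             for d in range(DEPTH)]
    return label, image


# A synthetic record: label 7 followed by 3072 bytes cycling through 0..255.
record = bytes(bytearray([7] + [i % 256 for i in range(IMAGE_BYTES)]))
label, image = parse_record(record)
```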

*** Standard TensorFlow format

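This is the format recommended for TensorFlow, based on Protocol Buffers. Note that it uses proto3 syntax.

It uses the Example protocol buffer, which wraps Features. Looking at the definition of Example, the following four appear to be the main messages:

Feature: one of BytesList (repeated bytes), FloatList (repeated float), Int64List (repeated int64)
Features: map of feature name to feature
FeatureList: repeated Feature
FeatureLists: map of feature name to feature list

FeatureList is for sequence input. An Example represents one actual labeled sample.

To write, fill in an Example message, serialize it to a string, and write it to a TFRecords file with TFRecordWriter. The official sample makes this easy to follow.

To read, use tf.TFRecordReader together with tf.parse_single_example. The official how-to links to the TFRecord format and to a sample that reads it.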

If you use Protocol Buffers from the start, porting to Java later should be much easier.

Preprocessing

This is the preprocessing step.

The distorted_inputs function in the sample does a variety of things here.

Batching

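Batching is done at the end of the input pipeline. It appears to use a queue separate from the filename queue.
Use tf.train.shuffle_batch, which randomizes the order of the examples. The basic input pipeline therefore works like this:

Shuffle the files
Shuffle the examples read from each file
Take out the input data one batch at a time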
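
The example-level shuffling and batching can be sketched with a shuffle pool in plain Python. This only approximates the idea behind tf.train.shuffle_batch: the real op works with a queue and threads, while this sketch is a single-threaded generator. The parameter name min_after_dequeue is borrowed from the real function, but the draining behavior at the end of the input is an assumption of the sketch.

```python
import random


def shuffle_batch(examples, batch_size, min_after_dequeue, seed=None):
    """Yield shuffled batches drawn from a pool that is kept partly full."""
    rng = random.Random(seed)
    pool, batch = [], []
    for example in examples:
        pool.append(example)
        # Only draw while enough examples remain in the pool to mix with.
        if len(pool) > min_after_dequeue:
            batch.append(pool.pop(rng.randrange(len(pool))))
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Input exhausted: drain whatever is left in the pool.
    rng.shuffle(pool)
    for example in pool:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # Final, possibly smaller batch.


batches = list(shuffle_batch(range(10), batch_size=4, min_after_dequeue=3, seed=0))
```

A larger min_after_dequeue mixes examples more thoroughly at the cost of more memory and a slower start.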

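For more parallelism, or to shuffle across files as well, use tf.train.shuffle_batch_join. Alternatively, increasing num_threads in tf.train.shuffle_batch also raises the parallelism; in that case multiple threads read examples from a single file.

The choice seems to be a trade-off between things like disk I/O and whether the same example can end up in a batch more than once.

The tf.train.shuffle_batch* functions add a summary to the graph, so it can be viewed in TensorBoard; if the example queue stays above zero at all times, you are using enough threads.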

Creating threads to prefetch using QueueRunner objects

The following pattern is the template. Doing it inside a session's with statement context should work as well.

# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')

finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()

The figure in the official how-to makes it easy to see what is going on.

Filtering records or producing multiple examples per record

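If examples have shape [x, y, z], a batch of them has shape [batch, x, y, z]; producing a batch of size 0 for a record filters that record out.
To produce multiple examples from one record, pass enqueue_many=True when calling shuffle_batch.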

Sparse input data

For SparseTensors:

Do not call tf.parse_single_example before batching.
Call tf.parse_example after batching instead.

Preloaded data

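This way of supplying data is for small datasets. There are two approaches:

Store the data in a Constant
Store the data in a Variable

Using a Constant is simple, but the data apparently gets duplicated at times, so it seems better avoided.

Using a Variable looks like this: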

training_data = ...
training_labels = ...
with tf.Session() as sess:
  data_initializer = tf.placeholder(dtype=training_data.dtype,
                                    shape=training_data.shape)
  label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                     shape=training_labels.shape)
  input_data = tf.Variable(data_initializer, trainable=False, collections=[])
  input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
  ...
  sess.run(input_data.initializer,
           feed_dict={data_initializer: training_data})
  sess.run(input_labels.initializer,
           feed_dict={label_initializer: training_labels})

The key points are:

trainable=False
- Keeps the variable out of the GraphKeys.TRAINABLE_VARIABLES collection, so training does not update it.
collections=[]
- Keeps the variable out of the GraphKeys.VARIABLES collection, so it is ignored when saving and restoring checkpoints.

This is probably meant for batch training. It is probably rarely used, so it can be ignored for now.

Multiple input pipelines

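This is about how to evaluate while training:

The training process writes checkpoints.
The evaluation process reads the checkpoints and runs inference.

Training and evaluation can also share one graph in one process, sharing Variables between them.

The CIFAR-10 code is an example.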

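Impressions

I had thought the Coordinator/QueueRunner input pipeline was rather a hassle, but it may be the better choice. Even with feeding, you will probably end up writing code that prefetches input data on multiple threads anyway. And if the model goes to production, handling model parameters as Protocol Buffers is more convenient, so choosing the TensorFlow format from the start may be the way to go.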

Threading and Queues

In TensorFlow, a queue is a stateful node, like a Variable. Other nodes can enqueue to and dequeue from it.

Queue Use Overview

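Two queue types are implemented:

FIFOQueue
RandomShuffleQueue

The Session object is multithreaded, so multiple threads can use the same Session and run ops in parallel.

TensorFlow provides two classes to help with this:

tf.train.Coordinator
- Stops multiple threads together
- Reports exceptions to the program that is waiting for the threads to stop
tf.train.QueueRunner
- Creates threads
- The threads cooperate to enqueue tensors into the same queue

The two are normally used together.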

Coordinator

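There are three main methods:

should_stop(): returns True if the threads should stop
request_stop(): requests that the threads stop
join(): waits until the specified threads have stopped

The typical usage template looks like this: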

import threading

import tensorflow as tf

# Thread body: loop until the coordinator indicates a stop was requested.
# If some condition becomes true, ask the coordinator to stop.
def MyLoop(coord):
  while not coord.should_stop():
    ...do something...
    if ...some condition...:
      coord.request_stop()

# Main code: create a coordinator.
coord = tf.train.Coordinator()

# Create 10 threads that run 'MyLoop()'
threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in range(10)]

# Start the threads and wait for all of them to stop.
for t in threads: t.start()
coord.join(threads)
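
To see the three methods at work without TensorFlow, the coordinator's stop protocol can be modeled with a threading.Event. ToyCoordinator is a hypothetical stand-in, not the tf.train.Coordinator implementation; it leaves out exception reporting entirely.

```python
import threading


class ToyCoordinator(object):
    """Minimal stand-in offering should_stop / request_stop / join."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()  # Calling this more than once is harmless.

    def join(self, threads):
        for t in threads:
            t.join()


counter = {"steps": 0}
lock = threading.Lock()


def my_loop(coord, limit):
    # Thread body: work until any thread asks everyone to stop.
    while not coord.should_stop():
        with lock:
            if coord.should_stop():
                break  # Another thread requested the stop while we waited.
            counter["steps"] += 1
            if counter["steps"] >= limit:
                coord.request_stop()


coord = ToyCoordinator()
threads = [threading.Thread(target=my_loop, args=(coord, 100)) for _ in range(4)]
for t in threads:
    t.start()
coord.join(threads)
```

Whichever thread reaches the limit first stops all four, which is the "stop together" behavior the template relies on.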

QueueRunner

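A QueueRunner creates the threads that run the enqueue op. It also creates a thread that closes the queue when an exception is reported to the coordinator.

Here is a lower-level sample than the tf.train.start_queue_runners template.

First create enqueue_op and train_op: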

example = ...ops to create one example...

# Create a queue, and an op that enqueues examples one at a time in the queue.
queue = tf.RandomShuffleQueue(...)
enqueue_op = queue.enqueue(example)

# Create a training graph that starts by dequeuing a batch of examples.
inputs = queue.dequeue_many(batch_size)
train_op = ...use 'inputs' to build the training part of the graph...

Then create the QueueRunner and Coordinator and start the threads:

# Create a queue runner that will run 4 threads in parallel to enqueue examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()

# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)

# Run the training loop, controlling termination with the coordinator.

try:
    for step in range(1000000):
        if coord.should_stop():
            break
        sess.run(train_op)
except Exception as e:
    # Report the exception to the coordinator.
    coord.request_stop(e)

# Terminate as usual.  It is innocuous to request stop twice.
coord.request_stop()

# And wait for them to actually do it.
coord.join(enqueue_threads)
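
The same producer/consumer shape can be tried out with the standard library alone. The sketch below is an analogy, not TensorFlow code: a bounded queue.Queue stands in for the TensorFlow queue, a threading.Event for the coordinator, and four enqueue threads for the QueueRunner. All names are hypothetical.

```python
import queue
import threading

stop_event = threading.Event()           # Plays the coordinator's role.
example_queue = queue.Queue(maxsize=8)   # Bounded, like a queue with a capacity.


def enqueue_loop(worker_id):
    # Each runner thread keeps enqueuing examples until asked to stop.
    n = 0
    while not stop_event.is_set():
        try:
            example_queue.put((worker_id, n), timeout=0.05)
            n += 1
        except queue.Full:
            pass  # Queue is full; re-check the stop flag and retry.


def dequeue_many(batch_size):
    return [example_queue.get(timeout=5) for _ in range(batch_size)]


enqueue_threads = [threading.Thread(target=enqueue_loop, args=(i,))
                   for i in range(4)]
for t in enqueue_threads:
    t.start()

try:
    batches = [dequeue_many(5) for step in range(20)]  # The "training loop".
finally:
    stop_event.set()          # Ask the enqueue threads to stop...
    for t in enqueue_threads:
        t.join()              # ...and wait until they actually do.
```

The put timeout matters: without it, a producer blocked on a full queue would never notice the stop request.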

Adding a New Op

To be covered separately.

Custom Data Readers

To be covered separately.

Using GPUs

Supported devices

In the device placement example (see Manual device placement), a and b run on the CPU, and c runs on the GPU if one is available.

Logging Device placement

To find out which devices your ops and tensors are assigned to, set log_device_placement=True.

For example:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Manual device placement

Use a device context (with tf.device).

import tensorflow as tf

# Creates a graph.
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))

When this is run, a and b are placed on the CPU and c on the GPU.

Using a single GPU on a multi-GPU system

By default the GPU with the lowest index is used, but you can also specify one:

# Creates a graph.
with tf.device('/gpu:2'):
    ...

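If the specified device does not exist, an InvalidArgumentError is raised. With allow_soft_placement=True, TensorFlow picks an existing device automatically (probably the GPU with the lowest index rather than an idle one; unverified).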

# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
     allow_soft_placement=True, log_device_placement=True))
     ...

Using multiple GPUs

Build the model in a multi-tower fashion, with each tower assigned to a different GPU.

The CIFAR-10 sample is a good example.

Sharing Variables

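When building complex models you often need to share large sets of Variables, and you may want to initialize all of them in one place. The tools for this are:

tf.variable_scope()
tf.get_variable()

This section explains them.

Suppose we have a model like this, a filter made of two (Conv + ReLU) layers: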

def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)

    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)

Suppose we want to apply this model (function) to two images, image1 and image2, calling it as follows:

# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)

This creates two separate sets of variables.

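The usual approach would be to create the variables outside the function, for example in a dictionary mapping names to variables, but then:

The code that builds the graph must document the name, type, and shape of every variable it needs
When that code changes, callers may have to create more, fewer, or different Variables

One way to deal with this is to wrap the model in a class that manages its Variables carefully. TensorFlow, however, provides a simpler mechanism: Variable Scope.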

Variable Scope Example

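Variable Scope relies on two main functions, with the following roles:

tf.get_variable(<name>, <shape>, <initializer>): creates a variable with the given name, or returns the existing one
tf.variable_scope(<scope_name>): manages the namespace for names passed to tf.get_variable()

Several initializers are provided, for example:

tf.constant_initializer(value)
tf.random_uniform_initializer(a, b)
tf.random_normal_initializer(mean, stddev)

They do what their names say.

Rewriting the earlier code with tf.get_variable and tf.variable_scope gives: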

conv_relu

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer(0, 1))
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)

conv_relu needs to be called twice, so give each call its own namespace with variable_scope.

my_image_filter

def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])

Next, calling my_image_filter twice fails:

result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)

This raises a ValueError. To share the variables, wrap the calls in another variable_scope and call scope.reuse_variables() inside it:

with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)

variable_scope.py

How Does Variable Scope Work?

Understanding tf.get_variable()

How tf.get_variable() behaves:

1. reuse == False (default)

Checks whether {scope name}/{variable name} already exists, and raises ValueError if it does
Otherwise creates a new Variable named {scope name}/{variable name}

2. reuse == True

Looks up {scope name}/{variable name}
Raises ValueError if it does not exist
Returns the existing Variable if it does
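
The two cases can be modeled with a small dictionary-backed store. This is a hypothetical toy that mirrors the rules listed above; it is not TensorFlow's implementation, and ToyVariableStore and its method signature are invented for illustration.

```python
class ToyVariableStore(object):
    """Hypothetical model of the create-or-reuse lookup rules."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope_name, var_name, reuse=False):
        full_name = scope_name + "/" + var_name if scope_name else var_name
        exists = full_name in self._vars
        if reuse:
            # reuse == True: the variable must already exist.
            if not exists:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        # reuse == False: the variable must not exist yet.
        if exists:
            raise ValueError("Variable %s already exists" % full_name)
        self._vars[full_name] = {"name": full_name}  # Placeholder "variable".
        return self._vars[full_name]


store = ToyVariableStore()
w = store.get_variable("conv1", "weights")                    # Created.
w_again = store.get_variable("conv1", "weights", reuse=True)  # Same object.

try:
    store.get_variable("conv1", "weights")  # reuse=False on an existing name.
    second_create_failed = False
except ValueError:
    second_create_failed = True
```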

Basics of tf.variable_scope()

variable_scope can be nested
- Names then look like "{scope name1}/{scope name2}/{variable name}"
The current variable scope can be retrieved
- tf.get_variable_scope()
- tf.get_variable_scope().reuse_variables() sets reuse=True
- Note that reuse cannot be set back to False; True takes precedence
- The reason: if a caller sets reuse=True and the author of a function then sets reuse=False inside it, the function would behave in a way the caller did not expect
reuse is inherited
- A variable scope opened inside a scope with reuse=True also has reuse=True
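
The nesting and inheritance rules can be sketched with a stack of scopes and a context manager. Again this is a hypothetical plain-Python model of the rules listed above, not TensorFlow's code; toy_variable_scope, current_scope, and reuse_variables are invented names.

```python
import contextlib

_scope_stack = [{"name": "", "reuse": False}]  # Root scope.


def current_scope():
    return _scope_stack[-1]


@contextlib.contextmanager
def toy_variable_scope(name, reuse=None):
    parent = current_scope()
    full_name = parent["name"] + "/" + name if parent["name"] else name
    # reuse can be switched on but never off: True in a parent always wins.
    scope = {"name": full_name, "reuse": parent["reuse"] or bool(reuse)}
    _scope_stack.append(scope)
    try:
        yield scope
    finally:
        _scope_stack.pop()


def reuse_variables():
    current_scope()["reuse"] = True


with toy_variable_scope("foo") as foo:
    with toy_variable_scope("bar") as bar:
        nested_name = bar["name"]        # Prefixed by the enclosing scope.
        reuse_before = bar["reuse"]      # Nothing has enabled reuse yet.
    reuse_variables()                    # Enable reuse on "foo".
    with toy_variable_scope("baz", reuse=False) as baz:
        inherited_reuse = baz["reuse"]   # Still True: inherited from "foo".
```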

Capturing variable scope

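A VariableScope object can be passed to tf.variable_scope() in place of a name.
When you nest scopes and pass in a VariableScope object, the name prefixes of the enclosing scopes are not applied; you jump out of the current scope.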

# Jump out of the current variable scope when passing VariableScope
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"

with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.

Initializers in variable scope

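To set the initializer for a group of variables at once, use variable_scope(initializer=...). Variables created with tf.get_variable() inside the scope then use this initializer by default, and passing an initializer explicitly to tf.get_variable() overrides it.

Sample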

import tensorflow as tf


with tf.Session() as sess:


    with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
        v = tf.get_variable("v", [1])
        v.initializer.run()
        assert v.eval() == 0.4  # Default initializer as set above.

        w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
        w.initializer.run()
        assert w.eval() == 0.3  # Specific initializer overrides the default.
     
        with tf.variable_scope("bar"):
            v = tf.get_variable("v", [1])
            v.initializer.run()
            assert v.eval() == 0.4  # Inherited default initializer.
     
        with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
            v = tf.get_variable("v", [1])
            v.initializer.run()
            assert v.eval() == 0.2  # Changed default initializer.

Names of ops in tf.variable_scope()

So far the discussion has been about variable names; what about op names? Ops share the name given to variable_scope(name).

with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"

A name_scope can be opened together with a variable_scope. In that case the name_scope affects only op names.

# op name only affected by using name_scope
with tf.variable_scope("hoo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x1 = 1.0 + v
assert v.name == "hoo/v:0"
assert x1.op.name == "hoo/bar/add"

The same holds when the nesting order of variable_scope and name_scope is reversed.

Examples of Use

Examples that use variable scope:

models/image/cifar10.py
models/rnn/rnn_cell.py
models/rnn/seq2seq.py

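RNNs share parameters heavily, so they make frequent use of variable_scope with reuse=True.

Overall impressions

Compared with Keras or Chainer, I think TensorFlow takes more effort before you can use it. That probably comes with having so many features, but the return on the effort is low, which is why things like skflow are appearing. People who have already mastered another library may well not use it, although TensorFlow can be selected as a Keras backend. In any case, since it comes from Google, everyone will surely keep watching it. At the time of writing it has been open source for only two months, so perhaps its direction will be clear in half a year... Quite a few wrappers that make it easier to use, like skflow, may appear. Still, the computation part feels similar to Theano, and I felt that anyone who reads the how-tos properly can use it without trouble.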