KZKY memo

Personal notes.

Articles from the 1-day period starting 2014-11-22

Spark RDD (en)

RDD (Resilient Distributed Dataset): I investigated RDD, the core technology of Spark, and eventually found that the RDD papers are the most useful source for understanding it. Matei Zaharia et al. "Resilient Distributed Datasets: A F…

Hadoop Cluster Provisioning (en)

HDD: Use JBOD (Just a Bunch of Disks) as the architecture for multiple hard drives. Do not use RAID. For a master node, RAID 1+0 can be used for durability. It is better for the number of HDDs to be greater than or equal to th…