httpfs: We need to install it on only one machine (two for redundancy). Let's use the namenodes.

Move the second jobtracker to a dedicated machine.

hbase thrift: let's have two of them, on the nodes that run the hbase masters.

Impala: needs to be installed on all the datanodes. After that, hue-impala can be installed on the hue server.

NB: /etc/zookeeper/conf/zoo.cfg needs to be distributed to all
datanodes.
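
One way to distribute the file is sketched below. It is only an illustration: the function name, the hosts file (one datanode hostname per line) and the DRY_RUN switch are assumptions, not something these notes prescribe. By default it only prints the scp commands it would run.

```shell
#!/bin/sh
# Sketch: copy zoo.cfg to every host listed in a file.
# Assumptions (hypothetical): the hosts file format and the DRY_RUN switch.
ZOO_CFG=/etc/zookeeper/conf/zoo.cfg
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really copy

distribute_zoo_cfg() {
    hosts_file="$1"       # one datanode hostname per line
    while IFS= read -r host; do
        # Skip empty lines and comment lines
        case "$host" in ''|'#'*) continue ;; esac
        if [ "$DRY_RUN" = "1" ]; then
            echo "scp -p $ZOO_CFG $host:$ZOO_CFG"
        else
            # </dev/null keeps scp from eating the rest of the hosts file
            scp -p "$ZOO_CFG" "$host:$ZOO_CFG" < /dev/null
        fi
    done < "$hosts_file"
}
```

Usage: "distribute_zoo_cfg datanodes.txt" to review the commands, then run it again with DRY_RUN=0 to copy for real.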

Create the new disks: lvcreate -l 238465 -n node11.t.hadoop.research-infrastructures.eu-data-hdfs dlibsan6 /dev/md3
# Move the data:
rsync -qaxvH --delete --numeric-ids /mnt/disk/ dlibsan7:/mnt/disk/

----------
dfs.socket.timeout, for read timeout
dfs.datanode.socket.write.timeout, for write timeout

In fact, the read timeout value is used for various connections in
DFSClient, so if you only increase dfs.datanode.socket.write.timeout,
the timeouts can continue to happen.

I tried to generate 1TB of data with teragen across more than 40 data
nodes, and increasing the write timeout alone did not fix the problem.
When I increased both values above 600000 (milliseconds), it disappeared.
----------
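
A small sketch of raising both timeouts together on the client side, following the note above. The helper name and the exact value of 600000 ms (10 minutes) are illustrative assumptions. The datanodes read their own configuration, so the same two properties would typically also have to be set in hdfs-site.xml on the cluster.

```shell
#!/bin/sh
# Both properties are expressed in milliseconds; 600000 ms = 10 minutes.
# The values are illustrative, taken from the report quoted above.
READ_TIMEOUT_MS=600000
WRITE_TIMEOUT_MS=600000

# Print the -D generic options that raise the read and write timeouts together.
# hdfs_timeout_opts is a hypothetical helper, not a Hadoop command.
hdfs_timeout_opts() {
    printf '%s %s\n' \
        "-Ddfs.socket.timeout=${READ_TIMEOUT_MS}" \
        "-Ddfs.datanode.socket.write.timeout=${WRITE_TIMEOUT_MS}"
}

# Example (not executed here): pass the options to a client command, e.g.
#   hadoop fs $(hdfs_timeout_opts) -put bigfile /some/hdfs/path
```

The generic -D options only override the configuration of the client process that receives them, which is why this is a test aid rather than a permanent fix.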

To configure yarn:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.1/bk_installing_manually_book/content/rpm-chap1-11.html