# Hadoop cluster based on the CDH 4 packages.
This is the playbook that I used to install and configure the Hadoop cluster @CNR, based on the deb packages found in the Cloudera repositories.
No Cloudera Manager was installed or used.
## The cluster.
The cluster structure is the following:
- jobtracker.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - mapreduce HA jobtracker
  - zookeeper quorum
  - HA HDFS journal
- quorum4.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - mapreduce HA jobtracker
  - zookeeper quorum
  - HA HDFS journal
- nn1.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hdfs HA namenode
  - zookeeper quorum
  - HA HDFS journal
- nn2.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hdfs HA namenode
  - zookeeper quorum
  - HA HDFS journal
- hbase-master.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hbase primary master
  - hbase thrift
  - zookeeper quorum
  - HA HDFS journal
- hbase-master2.t.hadoop.research-infrastructures.eu (2GB RAM, 2 CPUs):
  - hbase secondary master
  - hbase thrift
- node{2..13}.t.hadoop.research-infrastructures.eu (9GB RAM, 8 CPUs, 1000GB external storage for HDFS each):
  - mapreduce tasktracker
  - hdfs datanode
  - hbase regionserver
  - solr (sharded)
- hive.t.hadoop.research-infrastructures.eu:
  - hue
  - hive
  - oozie
  - sqoop
- db.t.hadoop.research-infrastructures.eu:
  - postgresql instance for hue and hive
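A hypothetical excerpt of the matching ansible inventory (the jt_masters and hdfs_masters group names come from the run.sh examples further down; the real inventory/hosts.production may differ):

[jt_masters]
jobtracker.t.hadoop.research-infrastructures.eu
quorum4.t.hadoop.research-infrastructures.eu

[hdfs_masters]
nn1.t.hadoop.research-infrastructures.eu
nn2.t.hadoop.research-infrastructures.eu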
The scripts that manage all the services are installed on jobtracker.t.hadoop.research-infrastructures.eu. It is possible to stop/start the single services, or the whole cluster, respecting the correct order.
They all have the "service-" prefix, and each script name gives an idea of the operations it performs:
service-global-hadoop-cluster
service-global-hbase
service-global-hdfs
service-global-mapred
service-global-zookeeper
service-hbase-master
service-hbase-regionserver
service-hbase-rest
service-hdfs-datanode
service-hdfs-httpfs
service-hdfs-journalnode
service-hdfs-namenode
service-hdfs-secondarynamenode
service-mapreduce-jobtracker
service-mapreduce-tasktracker
service-zookeeper-server
They all take one of "start", "stop", "status", "restart" as a parameter.
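For example, restarting the HDFS datanodes and then checking their status (a minimal sketch, assuming the scripts are in root's PATH on jobtracker):
# service-hdfs-datanode restart
# service-hdfs-datanode status
Stopping the whole cluster in the correct order is then a single command:
# service-global-hadoop-cluster stop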
- jobtracker URL:
http://jobtracker.t.hadoop.research-infrastructures.eu:50030/jobtracker.jsp
- HDFS URL:
http://namenode.t.hadoop.research-infrastructures.eu:50070/dfshealth.jsp
- HBASE master URL:
http://hbase-master.t.hadoop.research-infrastructures.eu:60010/master-status
- HUE Web Interface:
http://quorum2.t.hadoop.research-infrastructures.eu:8888
- Ganglia URL, for the cluster metrics:
http://monitoring.research-infrastructures.eu/ganglia/?r=hour&cs=&ce=&s=by+name&c=Openaire%252B%2520Hadoop%2520TEST&tab=m&vn=
- Nagios URL, for the status of the services (still to be enabled):
http://monitoring.research-infrastructures.eu/nagios3/
------------------------------------------------------------------------------------------------
dom0/nodes/san map data

| dom0    | node   | LUN   | SAN      |
|---------|--------|-------|----------|
| dlib18x | node8  | e90.6 | dlibsan9 |
| dlib19x | node9  | e90.7 | dlibsan9 |
| dlib20x | node10 | e90.8 | dlibsan9 |
| dlib22x | node11 | e90.5 | dlibsan9 |
| dlib22x | node7  | e63.4 | dlibsan6 |
| dlib23x | node12 | e80.3 | dlibsan8 |
| dlib23x | node13 | e80.4 | dlibsan8 |
| dlib24x | node2  | e25.1 | dlibsan2 |
| dlib24x | node3  | e74.1 | dlibsan7 |
| dlib25x | node4  | e83.4 | dlibsan8 |
| dlib26x | node5  | e72.1 | dlibsan7 |
| dlib26x | node6  | e63.3 | dlibsan6 |
------------------------------------------------------------------------------------------------
Submitting a job (supporting multiple users)
To support multiple users, create the UNIX user accounts on the master node only.
On the namenode:
# groupadd supergroup
(to be run only once)
# adduser claudio
...
# su - hdfs
$ hadoop fs -mkdir /home/claudio
$ hadoop fs -chown -R claudio:supergroup /home/claudio
(then add claudio to the supergroup group, as shown below)
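Adding the user to the supergroup group is plain UNIX group management, e.g.:
# usermod -a -G supergroup claudio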
Important:
If you do not create /tmp properly, with the right permissions as shown below, you may have problems with CDH components later. Specifically, if you don't create /tmp yourself, another process may create it automatically with restrictive permissions that will prevent your other applications from using it.
Create the /tmp directory after HDFS is up and running, and set its permissions to 1777 (drwxrwxrwt), as follows:
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
Note:
If Kerberos is enabled, do not use commands in the form "sudo -u <user> <command>"; they will fail with a security error. Instead, obtain a ticket first and then run each command directly:
$ kinit <user>                      (if you are using a password)
$ kinit -kt <keytab> <principal>    (if you are using a keytab)
$ <command>
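A concrete example of the keytab form, creating /tmp as the hdfs user (the keytab path, principal and realm are hypothetical):
$ kinit -kt /etc/hadoop/conf/hdfs.keytab hdfs/nn1.t.hadoop.research-infrastructures.eu@EXAMPLE.COM
$ hadoop fs -mkdir /tmp
$ hadoop fs -chmod -R 1777 /tmp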
Step 8: Create MapReduce /var directories
sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
Step 9: Verify the HDFS File Structure
$ sudo -u hdfs hadoop fs -ls -R /
You should see:
drwxrwxrwt - hdfs supergroup 0 2012-04-19 15:14 /tmp
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:19 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:29 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred supergroup 0 2012-04-19 15:33 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
Step 10: Create and Configure the mapred.system.dir Directory in HDFS
After you start HDFS and create /tmp, but before you start the JobTracker (see the next step), you must also create the HDFS directory specified by the mapred.system.dir parameter (by default ${hadoop.tmp.dir}/mapred/system) and configure it to be owned by the mapred user.
To create the directory in its default location:
$ sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
$ sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system
Important:
If you create the mapred.system.dir directory in a different location, specify that path in the conf/mapred-site.xml file.
When starting up, MapReduce sets the permissions for the mapred.system.dir directory to drwx------, assuming the user mapred owns that directory.
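For a non-default location, the property in conf/mapred-site.xml would look like this (the path is only an example):

<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
</property>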
Step 11: Start MapReduce
To start MapReduce, start the TaskTracker and JobTracker services.
On each TaskTracker system:
$ sudo service hadoop-0.20-mapreduce-tasktracker start
On the JobTracker system:
$ sudo service hadoop-0.20-mapreduce-jobtracker start
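To check that the daemons actually came up, jps can be run as root (so that it sees the JVMs of all users); it should list TaskTracker on the worker nodes and JobTracker on the jobtracker host:
$ sudo jps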
Step 12: Create a Home Directory for each MapReduce User
Create a home directory for each MapReduce user. It is best to do this on the NameNode; for example:
$ sudo -u hdfs hadoop fs -mkdir /user/<user>
$ sudo -u hdfs hadoop fs -chown <user> /user/<user>
where <user> is the Linux username of each user.
Alternatively, you can log in as each Linux user (or write a script to do so) and create the home directory as follows:
sudo -u hdfs hadoop fs -mkdir /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
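With many users, the same two commands can be looped over a list of usernames on the namenode (the names below are placeholders):
$ for u in claudio user2 user3; do
    sudo -u hdfs hadoop fs -mkdir /user/$u
    sudo -u hdfs hadoop fs -chown $u /user/$u
  done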
------------------------------------------------------------------------------------------------
We use the jobtracker as the provisioning server.
Correct start order (reverse it to obtain the stop order); a sketch of the corresponding commands follows the list:
• HDFS (NB: substitute the secondarynamenode with the journalnode once we have HA)
• MapReduce
• Zookeeper
• HBase
• Hive Metastore
• Hue
• Oozie
• Ganglia
• Nagios
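A sketch of a manual start in this order, using the service-* wrappers listed above for the hadoop components and the standard init scripts for the other services (the exact service names depend on the installed CDH 4 and Debian packages, so check them on each host):
service-global-hdfs start
service-global-mapred start
service-global-zookeeper start
service-global-hbase start
sudo service hive-metastore start
sudo service hue start
sudo service oozie start
sudo service ganglia-monitor start
sudo service nagios3 start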
The init commands can be found in the "init.sh" file in the ansible repository.
Error to investigate:
http://stackoverflow.com/questions/6153560/hbase-client-connectionloss-for-hbase-error
# GC hints
http://stackoverflow.com/questions/9792590/gc-tuning-preventing-a-full-gc?rq=1
HBASE troubleshooting
- If some regions remain in "transition" indefinitely, you can try to fix the problem from the shell:
# su - hbase
$ hbase hbck -fixAssignments
The following could also be useful:
$ hbase hbck -repairHoles
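Running hbck without options first only reports the inconsistencies, without repairing anything:
$ hbase hbck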
-----------------------------------------------------
When "ROOT stuck in assigning forever" occurs, you need to:
- check that there are no zookeeper-related errors. If there are, restart zookeeper and then the whole hbase cluster
- restart only the hbase master
-----------------------------------------------------
When there are disabled tables that turn out to be impossible to enable or drop:
# su - hbase
$ hbase hbck -fixAssignments
* Restart the hbase master
-----------------------------------------------------
See: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/32838
And, in general, to understand how it works: http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Tool for monitoring hbase when it is configured for manual splitting:
https://github.com/sentric/hannibal
---------------------------------------------------------------------------------
Tasktracker startup log excerpt (note the TaskMemoryManager warning):
2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@41a7fead
2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_node2.t.hadoop.research-infrastructures.eu:localhost/127.0.0.1:47798
2013-02-22 10:24:46,492 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1 and reserved physical memory is not configured. TaskMemoryManager is disabled.
2013-02-22 10:24:46,571 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
---
Interesting post covering the configuration and the various parameters: http://gbif.blogspot.it/2011/01/setting-up-hadoop-cluster-part-1-manual.html
List of deprecated parameter names and their new names: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
---
How to decommission a worker node
1. If many nodes are being removed, reduce the HDFS replication factor first (see the sketch after this list)
2. Stop the regionserver on the node
3. Add the node to the hdfs and jobtracker exclude lists
./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers
4. Refresh the hdfs and jobtracker configuration
hdfs dfsadmin -refreshNodes
mapred mradmin -refreshNodes
5. Remove the node from the list of allowed ones
5a. Edit the inventory
5b. Run
./run.sh hadoop-common.yml -i inventory/hosts.production --tags=hadoop_workers
./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers
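Steps 1 and 3 in shell terms (the replication factor and the exclude file path are examples; the actual exclude file locations are the ones set by dfs.hosts.exclude and mapred.hosts.exclude in the cluster configuration):
$ sudo -u hdfs hadoop fs -setrep -R 2 /
$ echo "node13.t.hadoop.research-infrastructures.eu" >> /etc/hadoop/conf/hosts.exclude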
---------------------------------------------------------------------------------
Nagios monitoring
- The handlers that restart the services are managed via nrpe. To make them work, we need to:
- Add an entry in nrpe.cfg. The command name needs to start with "global_restart_" and
the remaining part of the name must coincide with the name of the service.
For example:
command[global_restart_hadoop-0.20-mapreduce-tasktracker]=/usr/bin/sudo /usr/sbin/service hadoop-0.20-mapreduce-tasktracker restart
- Add a handler to the nagios service. The command needs the service name as a parameter.
Example:
event_handler restart-service!hadoop-0.20-mapreduce-tasktracker
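Since the nrpe command above goes through /usr/bin/sudo, the user that nrpe runs as also needs a matching sudoers rule; a sketch, assuming the daemon runs as the "nagios" user:
nagios ALL=(root) NOPASSWD: /usr/sbin/service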
---------------------------------------------------------------------------------