The old playbook files. Some are missing.

hadoop-cdh4-legacy
Andrea Dell'Amico 2 years ago
parent 53fcee684d
commit b821ee815b
Signed by: andrea.dellamico
GPG Key ID: 147ABE6CEB9E20FF

@ -0,0 +1,86 @@
2013-11-06 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/hadoop-nrpe.cfg.j2: Add the correct entry for the tasktracker event handler.
2013-11-04 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/iptables-rules.v4.j2: Add access to the hbase master rpc port from the isti network.
2013-10-10 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/hbase-site.j2: Add the property hbase.master.loadbalance.bytable to try a better balance for our workload. Reference here: http://answers.mapr.com/questions/7049/table-only-on-single-region-server
2013-10-09 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/nagios-server/hadoop-cluster/services.cfg.j2: Handler to restart the tasktracker when it fails.
* templates/iptables-rules.*.j2: iptables rules to block access to the services ports from outside CNR.
2013-10-07 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/nagios-server: added checks for the logstash host and services
* logstash.yml: Add logstash with remote syslog to aggregate all the workers logs. Needed for solr.
2013-10-01 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/: management portal that redirects to the services web interfaces.
2013-09-23 Andrea Dell'Amico <adellam@sevenseas.org>
* tasks/jobtracker-ha.yml: HA configuration for the jobtracker. jobtracker.t.hadoop and quorum4.t.hadoop are the two masters.
2013-09-19 Andrea Dell'Amico <adellam@sevenseas.org>
* all.yml: HDFS is now HA. All the datanodes lists are generated from the hosts file and are not static anymore. Changed nagios to reflect the new configuration.
2013-09-17 Andrea Dell'Amico <adellam@sevenseas.org>
* templates: Changed the system-* scripts to manage the second namenode instance. Removed the secondary namenode start/stop script
2013-09-12 Andrea Dell'Amico <adellam@sevenseas.org>
* hadoop-test.yml: New quorum4.t.hadoop node. Zookeeper now has 5
quorum nodes. HBASE master HA. quorum4 is the other instance.
2013-07-29 Andrea Dell'Amico <adellam@sevenseas.org>
* templates/datanode-hdfs-site.j2: Added "support_append" as "true" and max_xcievers as 1024
2013-06-20 Andrea Dell'Amico <adellam@sevenseas.org>
* hadoop-ganglia.yml: The ganglia configuration is now differentiated between datanodes, jobtracker, hbase master, hdfs namenode, hdfs secondary namenode
2013-02-27 Andrea Dell'Amico <adellam@sevenseas.org>
* vars/hadoop-global-vars.yml: mapred_tasktracker_reduce_tasks_maximum: 5, hbase_regionserver_heap_size: 3192
2013-02-22 Andrea Dell'Amico <adellam@isti.cnr.it>
* init.sh: Create hdfs directory /jobtracker to store the jobtracker history
* templates/mapred-site-jobtracker.j2: Activate permanent jobtracker history
* jobtracker.yml: Cleanup
2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>
* vars/hadoop-global-vars.yml: mapred_child_java_opts: "-Xmx3092M", mapred_map_child_java_opts: "-Xmx2048M", mapred_reduce_child_java_opts: "-Xmx1512M", hbase_regionserver_heap_size: 4092
2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>
* vars/hadoop-global-vars.yml: hbase_master_heap_size: 5120, hbase_regionserver_heap_size: 3192
2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>
* vars/hadoop-global-vars.yml (hbase_thrift_heap_size): mapred_child_java_opts: "-Xmx1512M", mapred_map_child_java_opts: "-Xmx3092M", mapred_reduce_child_java_opts: "-Xmx2048M", hbase_master_heap_size: 3072
2013-02-16 Andrea Dell'Amico <adellam@isti.cnr.it>
* templates/hbase-thrift-env.sh.j2: Disabled the jmx console for hbase thrift
* templates/hbase-master-env.sh.j2: disabled the master jmx console
* vars/hadoop-global-vars.yml: zookeeper_max_timeout: 240000, fixed the zookeeper quorum host naming
2013-02-16 Andrea Dell'Amico <adellam@isti.cnr.it>
* vars/hadoop-global-vars.yml: mapred_child_java_opts: "-Xmx2G", mapred_reduce_child_java_opts: "-Xmx2512M", hbase_regionserver_heap_size: 5000

@ -1,3 +1,303 @@
# hadoop-ansible
Ansible playbook that installs and configures a Hadoop cluster.
# Hadoop cluster based on the CDH 4 packages.
This is the playbook that I used to install and configure the Hadoop cluster @CNR, based on the deb packages found in the Cloudera repositories.
Cloudera Manager was not used or installed.
## The cluster.
The cluster structure is the following:
- jobtracker.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
- mapreduce HA jobtracker
- zookeeper quorum
- HA HDFS journal
- quorum4.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
- mapreduce HA jobtracker
- zookeeper quorum
- HA HDFS journal
- nn1.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
- hdfs HA namenode
- zookeeper quorum
- HA HDFS journal
- nn2.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
- hdfs HA namenode
- zookeeper quorum
- HA HDFS journal
- hbase-master.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
- hbase primary master
- hbase thrift
- zookeeper quorum
- HA HDFS journal
- hbase-master2.t.hadoop.research-infrastructures.eu (2GB RAM, 2 CPUs):
- HBASE secondary master
- hbase thrift
- node{2..13}.t.hadoop.research-infrastructures.eu (9GB RAM, 8 CPUs, 1000GB external storage for HDFS each):
- mapreduce tasktracker
- hdfs datanode
- hbase regionserver
- solr (sharded)
- hive.t.hadoop.research-infrastructures.eu:
- hue
- hive
- oozie
- sqoop
- db.t.hadoop.research-infrastructures.eu:
- postgresql instance for hue and hive
The scripts that manage all the services are installed on
jobtracker.t.hadoop.research-infrastructures.eu. They can stop/start
the single services or the whole cluster, respecting the correct
order.
They all have the "service-" prefix, and the script name gives an idea of the operations that will be performed:
service-global-hadoop-cluster
service-global-hbase
service-global-hdfs
service-global-mapred
service-global-zookeeper
service-hbase-master
service-hbase-regionserver
service-hbase-rest
service-hdfs-datanode
service-hdfs-httpfs
service-hdfs-journalnode
service-hdfs-namenode
service-hdfs-secondarynamenode
service-mapreduce-jobtracker
service-mapreduce-tasktracker
service-zookeeper-server
They take "start", "stop", "status" or "restart" as a parameter.
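For example, a typical invocation (this assumes the scripts are in root's PATH on the jobtracker host):
# service-global-hdfs status
# service-hbase-regionserver restart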
- jobtracker URL:
http://jobtracker.t.hadoop.research-infrastructures.eu:50030/jobtracker.jsp
- HDFS URL:
http://namenode.t.hadoop.research-infrastructures.eu:50070/dfshealth.jsp
- HBASE master URL:
http://hbase-master.t.hadoop.research-infrastructures.eu:60010/master-status
- HUE Web Interface:
http://quorum2.t.hadoop.research-infrastructures.eu:8888
- Ganglia URL, for the cluster metrics:
http://monitoring.research-infrastructures.eu/ganglia/?r=hour&cs=&ce=&s=by+name&c=Openaire%252B%2520Hadoop%2520TEST&tab=m&vn=
- Nagios URL, for the services status (to be activated):
http://monitoring.research-infrastructures.eu/nagios3/
------------------------------------------------------------------------------------------------
dom0/nodes/san map data
dlib18x: *node8* e90.6 (dlibsan9)
dlib19x: *node9* e90.7 (dlibsan9)
dlib20x: *node10* e90.8 (dlibsan9)
dlib22x: *node11* e90.5 (dlibsan9)
*node7* e63.4 (dlibsan6)
dlib23x: *node12* e80.3 (dlibsan8)
*node13* e80.4 (dlibsan8)
dlib24x: *node2* e25.1 (dlibsan2)
*node3* e74.1 (dlibsan7)
dlib25x: *node4* e83.4 (dlibsan8)
dlib26x: *node5* e72.1 (dlibsan7)
*node6* e63.3 (dlibsan6)
------------------------------------------------------------------------------------------------
Submitting a job (supporting multiple users)
To support multiple users, create the UNIX user accounts on the master node only.
On the namenode:
#groupadd supergroup
(to be run only once)
#adduser claudio
...
# su - hdfs
$ hadoop dfs -mkdir /home/claudio
$ hadoop dfs -chown -R claudio:supergroup /home/claudio
(add claudio to the supergroup group)
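For example, with the standard usermod tool (a sketch, not from the original notes):
# usermod -a -G supergroup claudio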
Important:
If you do not create /tmp properly, with the right permissions as shown below, you may have problems with CDH components later. Specifically, if you don't create /tmp yourself, another process may create it automatically with restrictive permissions that will prevent your other applications from using it.
Create the /tmp directory after HDFS is up and running, and set its permissions to 1777 (drwxrwxrwt), as follows:
$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
Note:
If Kerberos is enabled, do not use commands in the form sudo -u <user> <command>; they will fail with a security error. Instead, use the following commands: $ kinit <user> (if you are using a password) or $ kinit -kt <keytab> <principal> (if you are using a keytab) and then, for each command executed by this user, $ <command>
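For example, with the sample user claudio and a password login (illustrative only):
$ kinit claudio
$ hadoop fs -ls /home/claudio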
Step 8: Create MapReduce /var directories
sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
Step 9: Verify the HDFS File Structure
$ sudo -u hdfs hadoop fs -ls -R /
You should see:
drwxrwxrwt - hdfs supergroup 0 2012-04-19 15:14 /tmp
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:19 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:29 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred supergroup 0 2012-04-19 15:33 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
Step 10: Create and Configure the mapred.system.dir Directory in HDFS
After you start HDFS and create /tmp, but before you start the JobTracker (see the next step), you must also create the HDFS directory specified by the mapred.system.dir parameter (by default ${hadoop.tmp.dir}/mapred/system) and configure it to be owned by the mapred user.
To create the directory in its default location:
$ sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
$ sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system
Important:
If you create the mapred.system.dir directory in a different location, specify that path in the conf/mapred-site.xml file.
When starting up, MapReduce sets the permissions for the mapred.system.dir directory to drwx------, assuming the user mapred owns that directory.
Step 11: Start MapReduce
To start MapReduce, start the TaskTracker and JobTracker services.
On each TaskTracker system:
$ sudo service hadoop-0.20-mapreduce-tasktracker start
On the JobTracker system:
$ sudo service hadoop-0.20-mapreduce-jobtracker start
Step 12: Create a Home Directory for each MapReduce User
Create a home directory for each MapReduce user. It is best to do this on the NameNode; for example:
$ sudo -u hdfs hadoop fs -mkdir /user/<user>
$ sudo -u hdfs hadoop fs -chown <user> /user/<user>
where <user> is the Linux username of each user.
Alternatively, you can log in as each Linux user (or write a script to do so) and create the home directory as follows:
sudo -u hdfs hadoop fs -mkdir /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
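For example, a minimal loop over a few of the cluster users (a sketch; the user list is illustrative):
$ for u in claudio dnet sandro; do sudo -u hdfs hadoop fs -mkdir /user/$u; sudo -u hdfs hadoop fs -chown $u /user/$u; done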
------------------------------------------------------------------------------------------------
We use the jobtracker as the provisioning server.
Correct start order (reverse it to obtain the stop order; see the example after the list):
• HDFS (NB: substitute secondarynamenode with journalnode once we have HA)
• MapReduce
• Zookeeper
• HBase
• Hive Metastore
• Hue
• Oozie
• Ganglia
• Nagios
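A possible start sequence using the service scripts listed above (a sketch; hive, hue, oozie, ganglia and nagios have their own init scripts and are not covered by the service-global-* wrappers):
# service-global-hdfs start
# service-global-mapred start
# service-global-zookeeper start
# service-global-hbase start
or, presumably, everything at once in the right order:
# service-global-hadoop-cluster start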
The init commands can be found in the "init.sh" file in the ansible repository.
Error to investigate:
http://stackoverflow.com/questions/6153560/hbase-client-connectionloss-for-hbase-error
# GC hints
http://stackoverflow.com/questions/9792590/gc-tuning-preventing-a-full-gc?rq=1
HBASE troubleshooting
- If some regions remain in "transition" indefinitely, you can try to fix the problem from the shell:
# su - hbase
$ hbase hbck -fixAssignments
The following may also be useful:
$ hbase hbck -repairHoles
-----------------------------------------------------
When "ROOT stuck in assigning forever" occurs,
you need to:
- check that there are no zookeeper-related errors. If there are, restart zookeeper and then the whole hbase cluster
- restart only the hbase master
-----------------------------------------------------
When there are disabled tables that turn out to be impossible to enable or delete:
# su - hbase
$ hbase hbck -fixAssignments
* Restart the hbase master
-----------------------------------------------------
See: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/32838
And in general, to understand how it works: http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Tool for monitoring hbase when it is configured for manual splitting:
https://github.com/sentric/hannibal
---------------------------------------------------------------------------------
2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@41a7fead
2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_node2.t.hadoop.research-infrastructures.eu:localhost/127.0.0.1:47798
2013-02-22 10:24:46,492 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1 and reserved physical memory is not configured. TaskMemoryManager is disabled.
2013-02-22 10:24:46,571 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
---
Interesting post about the configuration and the various parameters: http://gbif.blogspot.it/2011/01/setting-up-hadoop-cluster-part-1-manual.html
List of deprecated parameter names and their new names: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
---
How to decommission a worker node
1. If many nodes are being decommissioned, reduce the hdfs replication factor first
2. Stop the regionserver on the node (see the example after the list)
3. Add the node to the hdfs and jobtracker exclude list
./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers
4. Refresh the hdfs and jobtracker configuration
hdfs dfsadmin -refreshNodes
mapred mradmin -refreshNodes
5. Remove the node from the list of allowed ones
5a. Edit the inventory
5b. Run
./run.sh hadoop-common.yml -i inventory/hosts.production --tags=hadoop_workers
./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers
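For step 2, the CDH4 init script name should be the following (an assumption, verify the actual service name on the node):
$ sudo service hbase-regionserver stop
For step 3, the exclude lists are the files referenced by the dfs.hosts.exclude and mapred.hosts.exclude properties, one FQDN per line, e.g.:
node13.t.hadoop.research-infrastructures.eu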
---------------------------------------------------------------------------------
Nagios monitoring
- The handlers to restart the services are managed via nrpe. To get them to work, we need to:
- Add an entry in nrpe.cfg. The command name needs to start with "global_restart_" and
the remaining part of the name must match the name of the service.
For example:
command[global_restart_hadoop-0.20-mapreduce-tasktracker]=/usr/bin/sudo /usr/sbin/service hadoop-0.20-mapreduce-tasktracker restart
- Add a handler to the nagios service. The command needs the service name as a parameter.
Example:
event_handler restart-service!hadoop-0.20-mapreduce-tasktracker
---------------------------------------------------------------------------------

@ -0,0 +1,32 @@
httpfs: we need to install it on one machine only (two for redundancy). Let's use the namenodes.
Move the second jobtracker to a dedicated machine.
hbase thrift: let's have two of them, on the nodes that run the hbase masters.
Impala: needs to be installed on all the datanodes. After that, hue-impala can be installed on the hue server.
NB: /etc/zookeeper/conf/zoo.cfg needs to be distributed to all
datanodes.
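A possible way to push it with an ad-hoc ansible command (a sketch; the group and inventory names come from this repository, the source path is an assumption):
ansible hadoop_worker_nodes -i inventory/hosts.production -m copy -a 'src=files/zoo.cfg dest=/etc/zookeeper/conf/zoo.cfg owner=root group=root mode=0644'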
Create the new disks: lvcreate -l 238465 -n node11.t.hadoop.research-infrastructures.eu-data-hdfs dlibsan6 /dev/md3
# Move the data:
rsync -qaxvH --delete --numeric-ids /mnt/disk/ dlibsan7:/mnt/disk/
----------
dfs.socket.timeout, for read timeout
dfs.datanode.socket.write.timeout, for write timeout
In fact, the read timeout value is used for various connections in
DFSClient: if you only increase dfs.datanode.socket.write.timeout, the
timeouts can keep happening.
I tried to generate 1TB of data with teragen across more than 40 data
nodes; increasing the write timeout alone did not fix the problem. When I
increased both values above 600000, it disappeared.
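In this playbook the two timeouts map to the following variables, already raised to 600000 in the global vars file:
hdfs_dfs_socket_timeout: 600000
hdfs_dfs_socket_write_timeout: 600000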
----------
To configure yarn:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.1/bk_installing_manually_book/content/rpm-chap1-11.html

@ -0,0 +1,2 @@
monitorRole readonly
controlRole readwrite

@ -0,0 +1,374 @@
---
# Generic machines data
time_zone: 'Europe/Rome'
cpu_cores: 8
datanode_ram: 11000
nagios_enabled: True
ganglia_enabled: False
ganglia_gmond_hdfs_datanodes_port: "8640:8660"
ganglia_gmond_jobtracker_port: "8640:8660"
ganglia_gmond_hbmaster_port: "8640:8660"
ganglia_gmond_namenode_port: "8640:8660"
configure_munin: True
# JDK (Oracle)
jdk_version:
- 7
- 8
jdk_default: 8
java_home: '/usr/lib/jvm/java-{{ jdk_default }}-oracle'
jdk_java_home: '{{ java_home }}'
# PKG state: latest or present. Set to 'latest' when you want to upgrade the installed packages version.
hadoop_pkg_state: present
#
#
# Global data
#
worker_nodes_num: 4
worker_node_start: 2
worker_node_end: 5
worker_node_swappiness: 0
dns_domain: t.hadoop.research-infrastructures.eu
namenode_hostname: 'nn1.{{ dns_domain }}'
secondary_nm_hostname: 'nn2.{{ dns_domain }}'
quorum_0_node_hostname: 'quorum0.{{ dns_domain }}'
quorum_1_node_hostname: 'quorum1.{{ dns_domain }}'
quorum_2_node_hostname: 'quorum2.{{ dns_domain }}'
quorum_3_node_hostname: 'quorum3.{{ dns_domain }}'
quorum_4_node_hostname: 'quorum4.{{ dns_domain }}'
hbase_master_1_hostname: 'hbase-master1.{{ dns_domain }}'
hbase_master_2_hostname: 'hbase-master2.{{ dns_domain }}'
ldap:
server: ldap://ldap.sub.research-infrastructures.eu
search_bind_auth: False
username_pattern: "uid=<username>,ou=People,o=Users,ou=Organizations,dc=research-infrastructures,dc=eu"
hadoop_ldap_uri: ldap://ldap.sub.research-infrastructures.eu
hadoop_ldap_base_dn: "dc=research-infrastructures,dc=eu"
hadoop_ldap_search_bind_auth: False
hadoop_ldap_username_pattern: "uid=<username>,ou=People,o=Users,ou=Organizations,dc=research-infrastructures,dc=eu"
#
# LOGGING
#
# WARN,INFO,DEBUG,ERROR
hadoop_log_level: INFO
#
# RFA is the rolling file appender
hadoop_log_appender: RFA
hadoop_log_appender_max_filesize: 256MB
# max backup index is ignored if the appender is daily rolling file
hadoop_log_appender_max_backupindex: 10
#
# We can use a logstash collector
hadoop_send_to_logstash: False
# Ditch the local appender if you want a logstash only solution
hadoop_logstash_appender: RFA,LOGSTASH
hadoop_logstash_collector_host: 'logstash.{{ dns_domain }}'
hadoop_logstash_collector_socketappender_port: 4560
hadoop_logstash_collector_socketappender_reconndelay: 10000
#
# rsyslog
rsyslog_install_newer_package: True
rsyslog_send_to_elasticsearch: False
rsyslog_use_queues: False
rsyslog_use_elasticsearch_module: False
rsys_elasticsearch_collector_host: '{{ hadoop_logstash_collector_host }}'
rsys_elasticsearch_collector_port: 9200
#
# General hadoop
#
initialize_hadoop_cluster: False
hadoop_cluster_name: "nmis-hadoop-cluster"
hadoop_data_dir: /data
hadoop_conf_dir: '/etc/hadoop/conf.{{ hadoop_cluster_name|lower }}'
hadoop_mapred_home: /usr/lib/hadoop-0.20-mapreduce
hadoop_hdfs_data_disk:
- { mountpoint: '/data', device: 'xvda3', fstype: 'xfs' }
#
# Hadoop default heapsize
# The default is 1000
hadoop_default_heapsize: 1024
hadoop_default_java_opts: "-server -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8"
hadoop_jmx_enabled: False
#
# HDFS
#
hdfs_cluster_id: '{{ hadoop_cluster_name }}'
hdfs_cluster_nn_id_1: nn1
hdfs_cluster_nn_id_2: nn2
hdfs_cluster_ids: "{{ hdfs_cluster_nn_id_1 }},{{ hdfs_cluster_nn_id_2 }}"
hdfs_namenode_1_hostname: '{{ namenode_hostname }}'
hdfs_namenode_2_hostname: '{{ secondary_nm_hostname }}'
hdfs_data_dir: '{{ hadoop_data_dir }}/dfs'
hdfs_nn_data_dir: nn
hdfs_dn_data_dir: dn
hdfs_dn_balance_bandwidthPerSec: 2097152
hdfs_support_append: "true"
hdfs_nn_rpc_port: 8020
hdfs_nn_http_port: 50070
hdfs_nn_client_port: 57045
# handler count. Recommended: ln(number of datanodes) * 20
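# Illustrative check with the 12 worker nodes of this cluster: ln(12) * 20 ≈ 2.5 * 20 ≈ 50, matching the value below.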
hdfs_nn_handler_count: 50
# Recommended: up to 128MB, 134217728 bytes (this is the default; it is a client parameter)
hdfs_block_size: 16777216
hdfs_repl_max: 256
hdfs_replication: 1
# Set to 0 to disable the trash use. Note that the client can enable it.
hdfs_fs_trash_interval: 10060
hdfs_datanode_max_xcievers: 1024
hdfs_datanode_http_port: 50075
hdfs_datanode_ipc_port: 50020
hdfs_datanode_rpc_port: 50010
hdfs_dfs_socket_timeout: 600000
hdfs_dfs_socket_write_timeout: 600000
# See http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_11_6.html
hdfs_read_shortcircuit: True
hdfs_read_shortcircuit_cache_size: 3000
hdfs_read_shortcircuit_cache_expiry: 50000
hdfs_read_shortcircuit_cache_dir: '/var/run/hadoop-hdfs'
hdfs_journal_id: '{{ hdfs_cluster_id }}'
hdfs_journal_port: 8485
hdfs_journal_0: '{{ quorum_0_node_hostname }}'
hdfs_journal_1: '{{ quorum_1_node_hostname }}'
hdfs_journal_2: '{{ quorum_2_node_hostname }}'
hdfs_journal_3: '{{ quorum_3_node_hostname }}'
hdfs_journal_4: '{{ quorum_4_node_hostname }}'
hdfs_journal_data_dir: jn
hdfs_journal_http_port: 8480
hdfs_zkfc_port: 8019
hdfs_webhdfs_enabled: True
hdfs_users_supergroup: supergroup
# The following is used to retrieve the ssh key needed for the HA failover
hdfs_user_home: /usr/lib/hadoop
httpfs_user: httpfs
httpfs_host: 'hue.{{ dns_domain }}'
httpfs_host_1: 'nn1.{{ dns_domain }}'
httpfs_host_2: 'nn2.{{ dns_domain }}'
httpfs_port: 14000
httpfs_catalina_work_dir: /usr/lib/hadoop-httpfs/work
#
# Zookeeper
zookeeper_conf_dir: '/etc/zookeeper/conf.{{ hadoop_cluster_name|lower }}'
zookeeper_log_dir: '/var/log/zookeeper'
zookeeper_client_port: 2182
zookeeper_quorum_port: 4182
zookeeper_leader_port: 3182
zookeeper_min_timeout: 30000
zookeeper_max_timeout: 240000
zookeeper_quorum_0: '{{ quorum_0_node_hostname }}'
zookeeper_quorum_1: '{{ quorum_1_node_hostname }}'
zookeeper_quorum_2: '{{ quorum_2_node_hostname }}'
zookeeper_quorum_3: '{{ quorum_3_node_hostname }}'
zookeeper_quorum_4: '{{ quorum_4_node_hostname }}'
zookeeper_maxclient_connections: 240
zookeeper_nodes: "{{ zookeeper_quorum_0 }},{{ zookeeper_quorum_1 }},{{ zookeeper_quorum_2 }},{{ zookeeper_quorum_3 }},{{ zookeeper_quorum_4 }}"
zookeeper_cluster: "{{ zookeeper_quorum_0 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_1 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_2 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_3 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_4 }}:{{ zookeeper_client_port }}"
#
# Jobtracker
#
jobtracker_cluster_id: nmis-hadoop-jt
jobtracker_node_1_hostname: 'jobtracker.{{ dns_domain }}'
jobtracker_node_2_hostname: 'jobtracker2.{{ dns_domain }}'
jobtracker_cluster_id_1: jt1
jobtracker_cluster_id_2: jt2
jobtracker_cluster_id1_rpc_port: 8021
jobtracker_cluster_id2_rpc_port: 8022
jobtracker_cluster_id1_ha_rpc_port: 8023
jobtracker_cluster_id2_ha_rpc_port: 8024
jobtracker_cluster_id1_http_port: 50030
jobtracker_cluster_id2_http_port: 50031
jobtracker_http_port: 9290
jobtracker_persistent_jobstatus: 'true'
jobtracker_restart_recover: 'false'
jobtracker_failover_connect_retries: 3
jobtracker_auto_failover_enabled: 'true'
jobtracker_zkfc_port: 8018
# handler count. Recommended: ln(number of datanodes) * 20
jobtracker_handler_count: 50
# We have 12 nodes and 6 CPUs per node
# reduce tasks formula: 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum)
# Cloudera defaults: 2 mappers, 2 reducers max
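# Illustrative check, not from the original comment: with 12 worker nodes and
# mapred_tasktracker_reduce_tasks_maximum = 4 as set below, the formula gives
# 0.95 * (12 * 4) ≈ 46 or 1.75 * (12 * 4) = 84 reduce tasks per job.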
# ------
# tested. too much stress on the hardware
#mapred_tasktracker_map_tasks_maximum: 6
#mapred_tasktracker_reduce_tasks_maximum: 68
#mapred_reduce_child_java_opts: "-Xmx2G"
# ------
mapred_tasktracker_http_port: 50060
mapred_tasktracker_map_tasks_maximum: 2
mapred_tasktracker_reduce_tasks_maximum: 4
mapred_use_fair_scheduler: True
mapred_fair_scheduler_pools:
- { name: 'solr', map: '12', reduce: '18' }
mapred_fair_scheduler_use_poolnameproperty: True
mapred_fair_scheduler_poolnameproperty: user.name
mapred_fair_scheduler_undecl_pools: True
mapred_fair_scheduler_preemption: False
mapred_fair_scheduler_assignmultiple: True
mapred_fair_scheduler_allocation_file: '{{ hadoop_conf_dir }}/fair-scheduler.xml'
# reducer parallel copies. Recommended: ln(number of datanodes) * 4
# with a minimum of 10
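# Illustrative check: ln(12) * 4 ≈ 10, so the minimum of 10 applies here.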
mapred_reduce_parallel_copies: 10
# Recommended: 80
mapred_tasktracker_http_threads: 80
# Default: 0.05. Recommended: 0.8. Used by the jobtracker
mapred_reduce_slowstart_maps: 0.9
# Default: 100. We could increase it
mapred_tasktracker_io_sort_mb: 256
mapred_io_sort_factor: 25
mapreduce_job_counters_max: 5000
mapred_userlog_retain_hours: 24
mapred_jt_completeuserjobs_max: 150
mapred_jt_persist_jobstatus_hours: 4320
mapred_user_jobconf_limit: 5242880
mapred_jt_retirejob_interval: 86400000
mapreduce_jt_split_metainfo_maxsize: 10000000
mapred_queue_names: default
#
mapred_staging_root_dir: /user
mapred_old_staging_root_dir: /home
mapred_local_dir: /data/mapred/local
# Java parameters
mapred_child_java_opts: "-Xmx3092M"
mapred_map_child_java_opts: "-Xmx3092M"
#mapred_reduce_child_java_opts: "-Xmx1512M"
mapred_reduce_child_java_opts: "-Xmx2048M"
#
# HBASE
#
# Raw formula to calculate the needed regionserver heap size:
# regions.hbase.hregion.max.filesize /
# hbase.hregion.memstore.flush.size *
# dfs.replication *
# hbase.regionserver.global.memstore.lowerLimit
# See: http://hadoop-hbase.blogspot.it/2013/01/hbase-region-server-memory-sizing.html
#
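# A simpler rough check, not the formula above (illustrative only): with the values
# below, memstores can grow up to about heap * upperLimit ≈ 4500MB * 0.45 ≈ 2025MB
# before writes are blocked, i.e. roughly 8 regions flushing a full 256MB memstore at once.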
hbase_user: hbase
hbase_conf_dir: '/etc/hbase/conf.{{ hadoop_cluster_name|lower }}'
# HBASE heap size
hbase_master_heap_size: 5120
hbase_thrift_heap_size: 1024
hbase_regionserver_heap_size: 4500
hbase_master_java_opts: '-Xmx{{ hbase_master_heap_size }}M'
hbase_regionserver_maxdirectmemory_size: "-XX:MaxDirectMemorySize=2G"
hbase_regionserver_java_opts: '-Xmx{{ hbase_regionserver_heap_size }}M'
hbase_thrift_java_opts: '-Xmx{{ hbase_thrift_heap_size }}M'
hbase_zookeeper_java_opts: -Xmx1G
hbase_thrift_port: 9090
hbase_thrift_jmx_port: 9591
# hbase zookeeper timeout
hbase_zookeeper_timeout: '{{ zookeeper_max_timeout }}'
# rpc timeout needs to be greater than lease period
# See http://hbase.apache.org/book/trouble.client.html
hbase_rpc_timeout: 600000
hbase_lease_period: 400000
hbase_open_files: 65536
hbase_master_rpc_port: 60000
hbase_master_http_port: 60010
hbase_regionserver_http_port: 60030
hbase_regionserver_http_1_port: 60020
# This is controversial. When set to 'true' hbase balances
# each table separately, without paying attention to the global balancing
hbase_loadbalance_bytable: True
# Default is 0.2
hbase_regions_slop: 0.15
# Default is 10. The recommendation is to keep it low when the payload per request grows
# We have mixed payloads.
hbase_handler_count: 12
# Default was 256M. It's 10737418240 (10GB) since 0.94
# The recommendation is to have it big to decrease the total number of regions
# 1288490188 is circa 1.2GB
hbase_hregion_max_file_size: 1288490188
hbase_hregion_memstore_mslab_enabled: True
# The default is 134217728 (128MB). We set it to 256MB
hbase_hregion_memstore_flush_size: 268435456
# The default is 0.4
hbase_regionserver_global_memstore_lowerLimit: 0.35
#
hbase_regionserver_global_memstore_upperLimit: 0.45
hbase_hregion_memstore_block_multiplier: 3
# HBASE thrift server
hbase_thrift_server_1: '{{ hbase_master_1_hostname }}'
hbase_thrift_server_2: '{{ hbase_master_2_hostname }}'
#
# nginx is used as a reverse proxy to all the web interfaces
#
nginx_use_ldap_pam_auth: True
nginx_pam_svc_name: nginx
nginx_ldap_uri: '{{ hadoop_ldap_uri }}'
nginx_ldap_base_dn: '{{ hadoop_ldap_base_dn }}'
portal_nginx_conf: management-portal
portal_pam_svc_name: '{{ nginx_pam_svc_name }}'
portal_title: "NeMIS Hadoop Cluster"
portal_web_root: /usr/share/nginx/www
#
# OOZIE and HIVE DB data
#
oozie_db_type: postgresql
oozie_db_name: oozie
oozie_db_user: oozie
oozie_db_host: db.t.hadoop.research-infrastructures.eu
hive_db_type: '{{ oozie_db_type }}'
hive_db_name: hive
hive_db_user: hive
hive_db_host: '{{ oozie_db_host }}'
hive_metastore_db_type: '{{ oozie_db_type }}'
hive_metastore_db_name: metastore
hive_metastore_db_user: metastore
hive_metastore_db_host: '{{ oozie_db_host }}'
hue_db_type: '{{ oozie_db_type }}'
hue_db_name: hue
hue_db_user: hue
hue_db_host: '{{ oozie_db_host }}'
hue_http_port: 8888
oozie_ip: 146.48.123.66
hive_ip: '{{ oozie_ip }}'
hue_ip: '{{ oozie_ip }}'
# Iptables
other_networks:
# Marek
icm_pl: 213.135.59.0/24
# eri.katsari
icm_pl_1: 195.134.66.216/32
# Antonis addresses, need to reach hdfs and zookeeper (ARC). And Glykeria Katsari
ilsp_gr: [ '194.177.192.226/32', '194.177.192.223/32', '195.134.66.96/32', '194.177.192.218/32', '194.177.192.231/32', '195.134.66.216/32', '195.134.66.145/32', '194.177.192.118/32', '195.134.66.244' ]
# Needed by marek. It's the IIS cluster gateway.
iis_pl_1: 213.135.60.74/32
# Jochen
icm_1: 129.70.43.118/32
monitoring_group_name: hadoop-cluster
nagios_local_plugins_dir: /usr/lib/nagios/plugins/hadoop
nagios_common_lib: check_library.sh
nagios_monitoring_dir: '/etc/nagios3/objects/{{ monitoring_group_name }}'
nagios_root_disk: /
nagios_check_disk_w: 10%
nagios_check_disk_c: 7%
nagios_service_contacts:
- andrea.dellamico
- claudio.atzori
nagios_contactgroup: hadoop-managers
nagios_monitoring_server_ip: 146.48.123.23
iptables_default_policy: REJECT

@ -0,0 +1,27 @@
---
# Ganglia
ganglia_unicast_mode: False
ganglia_gmond_jobtracker_cluster: "Openaire+ Hadoop Cluster - Jobtrackers"
ganglia_gmond_namenode_cluster: "Openaire+ Hadoop Cluster - HDFS namenodes"
ganglia_gmond_hbmaster_cluster: "Openaire+ Hadoop Cluster - HBASE masters"
ganglia_gmond_workers_cluster: "Openaire+ Hadoop Cluster - Worker nodes"
ganglia_gmond_cluster: '{{ ganglia_gmond_workers_cluster }}'
#
# To play nice with iptables
ganglia_gmond_mcast_addr: 239.2.11.0
ganglia_gmond_cluster_port: "8640:8660"
# jmx ports
hadoop_namenode_jmx_port: 10103
hadoop_secondary_namenode_jmx_port: 10104
hadoop_datanode_jmx_port: 10105
hadoop_balancer_jmx_port: 10106
hadoop_jobtracker_jmx_port: 10107
hbase_master_jmx_port: 10101
hbase_regionserver_jmx_port: 10102
hbase_thrift_jmx_port: 10109
hbase_zookeeper_jmx_port: 10110
zookeeper_jmx_port: 10108

@ -0,0 +1,32 @@
---
# jmx ports
hadoop_namenode_jmx_port: 10103
hadoop_secondary_namenode_jmx_port: 10104
hadoop_datanode_jmx_port: 10105
hadoop_balancer_jmx_port: 10106
hadoop_jobtracker_jmx_port: 10107
hbase_master_jmx_port: 10101
hbase_regionserver_jmx_port: 10102
hbase_thrift_jmx_port: 10109
hbase_zookeeper_jmx_port: 10110
zookeeper_jmx_port: 10108
#
# Used by nagios
hadoop_plugins_dir: /usr/lib/nagios/plugins/hadoop
root_disk: /dev/xvda2
data_disk: /dev/xvda3
root_disk_warn: 20%
disk_warn: '{{ root_disk_warn }}'
root_disk_crit: 10%
disk_crit: '{{ root_disk_crit }}'
data_disk_warn: 7%
data_disk_crit: 4%
hbase_check_user: hbasecheck
hbase_check_timeout: 560
hdfs_warn: 90
hdfs_crit: 95
nagios_proclist_red: '{{ redprocs }}'
nagios_proclist_yellow: '{{ yellowprocs }}'
nagios_nrpe_port: 5666

@ -0,0 +1,23 @@
---
#
# The OOZIE users are a subset of the hdfs users.
#
hadoop_users:
- { login: 'marek.horst', name: "Marek Horst", ssh_key: '{{ marek_horst }}', shell: '/bin/bash' }
- { login: 'claudio.atzori', name: "Claudio Atzori", ssh_key: '{{ claudio_atzori }}', shell: '/bin/bash' }
- { login: 'sandro.labruzzo', name: "Sandro Labruzzo", ssh_key: '{{ sandro_labruzzo }}', shell: '/bin/bash' }
- { login: 'michele.artini', name: "Michele Artini", ssh_key: '{{ michele_artini }}', shell: '/bin/bash' }
- { login: 'alessia.bardi', name: "Alessia Bardi", ssh_key: '{{ alessia_bardi }}', shell: '/bin/bash' }
- { login: 'andrea.mannocci', name: "Andrea Mannocci", ssh_key: '{{ andrea_mannocci }}', shell: '/bin/bash' }
- { login: 'andrea.dellamico', name: "Andrea Dell'Amico", ssh_key: '{{ andrea_dellamico }}', shell: '/bin/bash' }
- { login: 'giorgos.alexiou', name: "Giorgos Alexiou", ssh_key: '{{ giorgos_alexiou }}', shell: '/bin/bash' }
- { login: 'antonis.lempesis', name: "Antonis Lempesis", ssh_key: '{{ antonis_lempesis }}', shell: '/bin/bash' }
- { login: 'dnet' }
- { login: 'claudio' }
- { login: 'michele' }
- { login: 'sandro' }
- { login: 'alessia' }
- { login: 'andrea' }
- { login: 'adellam' }
- { login: 'hbasecheck' }

@ -0,0 +1,6 @@
$ANSIBLE_VAULT;1.1;AES256
63613435386665626236306331353063626137386531346461646463623436376232303461653436
3934313830326366373364396630356630623935633230360a646439346530363762363966643534
30373331666537666266353666333632616465666331383231356661633838633432656536653233
3738636134393763650a623637326339653932323563346336366433333732373733656532353137
36306364343430303535373961646632656535666162363862613036356461343865

@ -0,0 +1,10 @@
$ANSIBLE_VAULT;1.1;AES256
39646636653439616665643935326563653435646462306639646266376232633436393834643933
3364336430396530646637383438663037366362663135320a373065343862653035653838323739
61646135626431643330363963666433303737663464396663353632646339653562666162393034
3363383435346364310a356439323431343336366635306461613462663436326431383266366231
39636262313038366135316331343939373064356336356239653631633435613736306131656363
37613864353931396435353431633765623330663266646666643632626666643436623939303538
34343461383338663466303131663336326230666532326335373862636437343739336136616435
35653763353436383537633932316434303539373237336161303165353962356336666161323765
6336

@ -0,0 +1,18 @@
---
psql_version: 9.1
psql_db_host: localhost
psql_db_data:
- { name: '{{ oozie_db_name }}', encoding: 'UTF8', user: '{{ oozie_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ oozie_ip }}/32' ] }
- { name: '{{ hue_db_name }}', encoding: 'UTF8', user: '{{ hue_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ hue_ip }}/32' ] }
- { name: '{{ hive_metastore_db_name }}', encoding: 'UTF8', user: '{{ hive_metastore_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ hive_ip }}/32' ] }
psql_listen_on_ext_int: True
pg_backup_pgdump_bin: /usr/lib/postgresql/9.1/bin/pg_dump
pg_backup_retain_copies: 10
pg_backup_build_db_list: "no"
pg_backup_db_list: "'{{ oozie_db_name }}' '{{ hue_db_name }}' '{{ hive_metastore_db_name }}'"
pg_backup_destdir: /data/pgsql/backups
pg_backup_logfile: '{{ pg_backup_logdir }}/postgresql-backup.log'
pg_backup_use_nagios: "yes"
user_ssh_key: [ '{{ claudio_atzori }}' ]

@ -0,0 +1,2 @@
---
user_ssh_key: [ '{{ claudio_atzori }}', '{{ hadoop_test_cluster }}', '{{ sandro_labruzzo }}' ]

@ -0,0 +1,18 @@
---
#
# The hadoop logs are now sent to logstash directly by log4j
# - adellam 2015-02-04
#
# the log_state_file names must be unique when using the old rsyslog syntax. In the new one
# they are not used
# rsys_logfiles:
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-jobtrackerha-{{ ansible_hostname }}.log', log_tag: 'hadoop-jobtracker', log_state_file: 'hadoop-jobtracker'}
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-mrzkfc-{{ ansible_hostname }}.log', log_tag: 'hadoop-jt-mrzkfc', log_state_file: 'hadoop-jt-mrzkfc'}
# - { logfile: '/var/log/hadoop-0.20-mapreduce/mapred-audit.log', log_tag: 'hadoop-mapred-audit', log_state_file: 'hadoop-mapred-audit'}
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-namenode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-namenode', log_state_file: 'hadoop-hdfs-namenode'}
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-zkfc-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-zkfc', log_state_file: 'hadoop-hdfs-zkfc'}
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-journalnode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-journal', log_state_file: 'hadoop-hdfs-journal'}
# - { logfile: '/var/log/hbase/hbase.log', log_tag: 'hbase-master-log', log_state_file: 'hbase-master-log'}
# - { logfile: '/var/log/hbase/hbase-hbase-master-{{ ansible_hostname }}.log', log_tag: 'hbase-master-ha', log_state_file: 'hbase-master-ha'}
# - { logfile: '/var/log/hbase/hbase-hbase-thrift-{{ ansible_hostname }}.log', log_tag: 'hbase-thrift', log_state_file: 'hbase-thrift'}
# - { logfile: '{{ zookeeper_log_dir }}/zookeeper.log', log_tag: 'hadoop-zookeeper', log_state_file: 'hadoop-zookeeper'}

@ -0,0 +1,6 @@
---
# Ganglia gmond port
ganglia_gmond_cluster: '{{ ganglia_gmond_workers_cluster }}'
ganglia_gmond_cluster_port: '{{ ganglia_gmond_hdfs_datanodes_port }}'
ganglia_gmond_mcast_address: '{{ ganglia_gmond_workers_mcast_addr }}'

@ -0,0 +1,10 @@
---
iptables:
tcp_rules: True
tcp:
- { port: '{{ hdfs_datanode_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}' ] }
- { port: '{{ hdfs_datanode_ipc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}' ] }
- { port: '{{ hdfs_datanode_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ mapred_tasktracker_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hbase_regionserver_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hbase_regionserver_http_1_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }

@ -0,0 +1,10 @@
---
#
# The hadoop logs are now sent to logstash directly by log4j
# - adellam 2015-02-04
#
# IMPORTANT: the log_state_file names must be unique
# rsys_logfiles:
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-tasktracker-{{ ansible_hostname }}.log', log_tag: 'hadoop-tasktracker', log_state_file: 'hadoop-tasktracker'}
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-datanode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-datanode', log_state_file: 'hadoop-hdfs-datanode'}
# - { logfile: '/var/log/hbase/hbase-hbase-regionserver-{{ ansible_hostname }}.log', log_tag: 'hbase-regionserver', log_state_file: 'hbase-regionserver'}

@ -0,0 +1,6 @@
---
# Ganglia gmond port
ganglia_gmond_cluster: '{{ ganglia_gmond_hbmaster_cluster }}'
ganglia_gmond_cluster_port: '{{ ganglia_gmond_hbmaster_port }}'
ganglia_gmond_mcast_address: '{{ ganglia_gmond_hbmaster_mcast_addr }}'

@ -0,0 +1,12 @@
---
iptables:
tcp_rules: True
tcp:
- { port: '{{ hbase_master_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ hbase_master_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ hbase_thrift_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }

@ -0,0 +1,6 @@
---
# Ganglia gmond port
ganglia_gmond_cluster: '{{ ganglia_gmond_namenode_cluster }}'
ganglia_gmond_cluster_port: '{{ ganglia_gmond_namenode_port }}'
ganglia_gmond_mcast_address: '{{ ganglia_gmond_namenode_mcast_addr }}'

@ -0,0 +1,13 @@
---
iptables:
tcp_rules: True
tcp:
- { port: '{{ hdfs_nn_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ hdfs_nn_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ hdfs_nn_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_zkfc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }

@ -0,0 +1,6 @@
---
# Ganglia gmond port
ganglia_gmond_cluster: '{{ ganglia_gmond_jobtracker_cluster }}'
ganglia_gmond_cluster_port: '{{ ganglia_gmond_jobtracker_port }}'
ganglia_gmond_mcast_address: '{{ ganglia_gmond_jobtracker_mcast_addr }}'

@ -0,0 +1,22 @@
---
iptables:
tcp_rules: True
tcp:
- { port: '80:95' }
- { port: '8100:8150' }
- { port: '8200:8250' }
- { port: '8300:8350' }
- { port: '8400:8450' }
- { port: '{{ jobtracker_cluster_id1_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ jobtracker_cluster_id2_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ jobtracker_cluster_id1_ha_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ jobtracker_cluster_id2_ha_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ jobtracker_cluster_id1_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ jobtracker_cluster_id2_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.icm_1 }}' ] }
- { port: '{{ jobtracker_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ jobtracker_zkfc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }

@ -0,0 +1,52 @@
---
user_ssh_key: [ '{{ claudio_atzori }}' ]
# logstash
logstash_collector_host: logstash.t.hadoop.research-infrastructures.eu
logstash_collector_listen_port: 5544
logstash_version: 1.3.3
logstash_file: 'logstash-{{ logstash_version }}-flatjar.jar'
logstash_url: 'download.elasticsearch.org/logstash/logstash/{{ logstash_file }}'
logstash_install_dir: /opt/logstash
logstash_conf_dir: '{{ logstash_install_dir }}/etc'
logstash_lib_dir: '{{ logstash_install_dir }}/share'
logstash_log_dir: /var/log/logstash
logstash_user: logstash
logstash_indexer_jvm_opts: "-Xms2048m -Xmx2048m"
kibana_nginx_conf: kibana
kinaba_nginx_root: /var/www/kibana/src
kibana_virtual_host: logs.t.hadoop.research-infrastructures.eu
elasticsearch_user: elasticsearch
elasticsearch_group: elasticsearch
elasticsearch_version: 1.0.0
elasticsearch_http_port: 9200
elasticsearch_transport_tcp_port: 9300
elasticsearch_download_path: download.elasticsearch.org/elasticsearch/elasticsearch
elasticsearch_cluster: hadoop-logstash
elasticsearch_node_name: logstash
elasticsearch_node_master: "true"
elasticsearch_node_data: "true"
elasticsearch_max_local_storage_nodes: 1
elasticsearch_log_dir: /var/log/elasticsearch
elasticsearch_heap_size: 5
elasticsearch_host: localhost
elasticsearch_curator_close_after: 10
elasticsearch_curator_retain_days: 20
elasticsearch_curator_optimize_days: 10
elasticsearch_curator_bloom_days: 7
elasticsearch_curator_timeout: 1200
elasticsearch_curator_manage_marvel: True
elasticsearch_disable_dynamic_scripts: True
# We use the nginx defaults here
nginx_use_ldap_pam_auth: True
iptables:
tcp_rules: True
tcp:
- { port: '{{ logstash.collector_listen_port }}', allowed_hosts: [ '{{ network.nmis }}' ] }
- { port: '{{ elasticsearch.http_port }}', allowed_hosts: [ '{{ ansible_fqdn }}', '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
- { port: '80', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }

@ -0,0 +1,43 @@
---
user_ssh_key: [ '{{ claudio_atzori }}', '{{ michele_artini }}' ]
iptables:
tcp_rules: True
tcp:
- { port: '11000', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
- { port: '10000', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
- { port: '9083', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
- { port: '8888', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}', '0.0.0.0/0' ] }
oozie:
host: 'oozie.{{ dns_domain }}'
conf_dir: /etc/oozie/conf
user: oozie
catalina_work_dir: /usr/lib/oozie/oozie-server-0.20/work
http_port: 11000
#
# HIVE
#
hive:
host: 'hive.{{ dns_domain }}'
conf_dir: /etc/hive/conf
user: hive
metastore_port: 9083
server2_http_port: 10000
setugi: True
#
# HUE
#
hue:
user: hue
group: hue
host: 'hue.{{ dns_domain }}'
http_port: 8888
conf_dir: /etc/hue
hive_interface: hiveserver2
exec_path: /usr/share/hue/build/env/bin/hue
encoding: 'utf-8'
setuid_path: /usr/share/hue/apps/shell/src/shell/build/setuid

@ -0,0 +1,8 @@
$ANSIBLE_VAULT;1.1;AES256
35656164616131366466393935373064383333633237616435353030613234323463393363643961
6366343466396563666662396332666661636462313861630a376235623035633530656238623464
37636231343837363431396564363632343466306166343365356137646266656637313534353834
3561323334346135300a643731653463353564356332376162613864336539376530333534363032
36643532626433393939353030653762643636353331326565666164343761393533623461383165
33313736346537373364646332653538343034376639626335393065346637623664303264343237
326630336139303531346238383733633335

@ -0,0 +1,17 @@
---
- hosts: hadoop_worker_nodes:hadoop_masters
remote_user: root
max_fail_percentage: 10
serial: "25%"
# vars_files:
# - ../library/vars/isti-global.yml
roles:
- common
- cdh_common
- chkconfig
- hadoop_common
- hadoop_config
- hadoop_zookeeper
- hadoop_zookeeper_config

@ -0,0 +1,10 @@
[zookeeper_cluster]
quorum0.t.hadoop.research-infrastructures.eu zoo_id=0
quorum1.t.hadoop.research-infrastructures.eu zoo_id=1
quorum2.t.hadoop.research-infrastructures.eu zoo_id=2
quorum3.t.hadoop.research-infrastructures.eu zoo_id=3
quorum4.t.hadoop.research-infrastructures.eu zoo_id=4
[monitoring]
monitoring.research-infrastructures.eu

@ -0,0 +1,14 @@
---
- hosts: monitoring
user: root
vars_files:
- ../library/vars/isti-global.yml
roles:
- nagios-server
- hosts: hadoop_cluster:other_services:db
user: root
vars_files:
- ../library/vars/isti-global.yml
roles:
- nagios-monitoring

@ -0,0 +1,4 @@
---
dependencies:
- { role: ../library/roles/openjdk }
- role: '../../library/roles/ssh-keys'

@ -0,0 +1,14 @@
---
- name: Install the common CDH hadoop packages
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop
- hadoop-0.20-mapreduce
- hadoop-client
- hadoop-hdfs
- hadoop-mapreduce
tags:
- hadoop
- mapred
- hdfs

@ -0,0 +1,23 @@
---
- name: Install the D-NET repository key
action: apt_key url=http://ppa.research-infrastructures.eu/dnet/keys/dnet-archive.asc
tags:
- hadoop
- cdh
- name: Install the CDH repository key
action: apt_key url=http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key
tags:
- hadoop
- cdh
- name: Add the D-NET and CDH apt repositories
apt_repository: repo='{{ item }}' update_cache=yes
with_items:
- deb http://ppa.research-infrastructures.eu/dnet lucid main
- deb http://ppa.research-infrastructures.eu/dnet unstable main
- deb [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4 contrib
- deb [arch=amd64] http://archive.cloudera.com/gplextras/ubuntu/precise/amd64/gplextras precise-gplextras4 contrib
register: update_apt_cache
tags:
- hadoop
- cdh

@ -0,0 +1,4 @@
---
- import_tasks: cdh-setup.yml
- import_tasks: cdh-pkgs.yml
# See meta/main.yml for the involved library playbooks

@ -0,0 +1,4 @@
---
dependencies:
- role: '../../library/roles/ubuntu-deb-general'
- { role: '../../library/roles/iptables', when: iptables is defined }

@ -0,0 +1,2 @@
---
# See meta/main.yml for the involved library playbooks

@ -0,0 +1,3 @@
---
dependencies:
- role: '../../library/roles/ganglia'

@ -0,0 +1,31 @@
---
# See meta/main.yml for the basic installation and configuration steps
# The hadoop conf directory always exists
- name: Distribute the ganglia hadoop metrics properties
template: src={{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=444
with_items:
- hadoop-metrics.properties
- hadoop-metrics2.properties
tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]
- name: Check if the hbase conf directory exists
stat: path={{ hbase_conf_dir }}
register: check_hbase_confdir
tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]
- name: Distribute the ganglia hbase metrics properties
template: src={{ item }}.properties.j2 dest={{ hbase_conf_dir }}/{{ item }}-hbase.properties owner=root group=root mode=444
with_items:
- hadoop-metrics
- hadoop-metrics2
when: check_hbase_confdir.stat.exists
tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]
- name: Distribute the ganglia hbase metrics properties, maintain the old file name
file: src={{ hbase_conf_dir }}/{{ item }}-hbase.properties dest={{ hbase_conf_dir }}/{{ item }}.properties state=link force=yes
with_items:
- hadoop-metrics
- hadoop-metrics2
when: check_hbase_confdir.stat.exists
tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]

@ -0,0 +1,96 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log
# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649
# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log
# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649
# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log
# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649
# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log
# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649
# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log
# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers={{ hdfs_namenode_1_hostname }}:{{ ganglia_gmond_namenode_port }}
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# hbase.period=10
# hbase.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_cluster_port }}
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# fairscheduler.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# fairscheduler.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}

@ -0,0 +1,34 @@
# Ganglia 3.1+ support
*.period=60
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
# default for supportsparse is false
*.sink.ganglia.supportsparse=true
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
namenode.sink.ganglia.servers={{ hdfs_namenode_1_hostname }}:{{ ganglia_gmond_namenode_port }},{{ hdfs_namenode_2_hostname }}:{{ ganglia_gmond_namenode_port }}
datanode.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
jobtracker.sink.ganglia.servers={{ jobtracker_node_1_hostname }}:{{ ganglia_gmond_jobtracker_port }},{{ jobtracker_node_2_hostname }}:{{ ganglia_gmond_jobtracker_port }}
#tasktracker.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }}
#maptask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }}
#reducetask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }}
tasktracker.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
maptask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
reducetask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
hbase.extendedperiod = 3600
hbase.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
hbase.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
#hbase.sink.ganglia.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_hbmaster_port }},{{ hbase_master_2_hostname }}:{{ ganglia_gmond_hbmaster_port }}
#hbase.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_hbmaster_port }},{{ hbase_master_2_hostname }}:{{ ganglia_gmond_hbmaster_port }}
#resourcemanager.sink.ganglia.servers=
#nodemanager.sink.ganglia.servers=
#historyserver.sink.ganglia.servers=
#journalnode.sink.ganglia.servers=
#nimbus.sink.ganglia.servers=
#supervisor.sink.ganglia.servers=
#resourcemanager.sink.ganglia.tagsForPrefix.yarn=Queue

@ -0,0 +1,43 @@
---
- name: Directory for hdfs root under /data
file: dest={{ hdfs_data_dir }} state=directory
tags:
- hadoop
- mapred
- hdfs
# TODO: split and move to more specific roles.
- name: Directories for the hdfs services
file: dest={{ hdfs_data_dir }}/{{ item }} state=directory owner=hdfs group=hdfs mode=700
with_items:
- '{{ hdfs_dn_data_dir }}'
- '{{ hdfs_journal_data_dir }}'
tags:
- hadoop
- mapred
- hdfs
- name: Directory for mapred under /data
file: dest=/data/mapred state=directory
tags:
- hadoop
- mapred
- hdfs
- name: Directories for mapred under /data/mapred
file: dest=/data/mapred/{{ item }} state=directory owner=mapred group=hadoop mode=700
with_items:
- jt
- local
tags:
- hadoop
- mapred
- hdfs
- name: JMX secrets directory
file: dest=/etc/hadoop-jmx/conf state=directory owner=hdfs group=root mode=0750
when: hadoop_jmx_enabled
tags:
- hadoop
- jmx

@ -0,0 +1,8 @@
node13.t.hadoop.research-infrastructures.eu
node12.t.hadoop.research-infrastructures.eu
node11.t.hadoop.research-infrastructures.eu
node10.t.hadoop.research-infrastructures.eu
node9.t.hadoop.research-infrastructures.eu
node8.t.hadoop.research-infrastructures.eu
node7.t.hadoop.research-infrastructures.eu
node6.t.hadoop.research-infrastructures.eu

@ -0,0 +1,39 @@
---
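# Handlers for the hadoop services; ignore_errors keeps a failed restart or
# refresh from aborting the whole play.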
- name: Restart HDFS namenode
service: name=hadoop-hdfs-namenode state=restarted sleep=20
ignore_errors: true
- name: Restart HDFS journalnode
service: name=hadoop-hdfs-journalnode state=restarted sleep=20
ignore_errors: true
- name: Restart HDFS datanode
service: name=hadoop-hdfs-datanode state=restarted sleep=20
ignore_errors: true
- name: Restart HDFS httpfs
service: name=hadoop-httpfs state=restarted sleep=20
ignore_errors: true
- name: Disable HDFS httpfs
service: name=hadoop-httpfs state=stopped enabled=no
ignore_errors: true
- name: Enable HDFS httpfs
service: name=hadoop-httpfs state=started enabled=yes
ignore_errors: true
- name: Refresh HDFS datanodes
become: True
become_user: hdfs
command: hdfs dfsadmin -refreshNodes
ignore_errors: true
- name: Restart mapreduce jobtracker
service: name=hadoop-0.20-mapreduce-jobtrackerha state=restarted sleep=20
ignore_errors: true
- name: Restart mapreduce tasktracker
service: name=hadoop-0.20-mapreduce-tasktracker state=restarted sleep=20
ignore_errors: true

@ -0,0 +1,34 @@
---
# Base environment for all the hadoop services
- name: Base environment for all the hadoop services
template: src=templates/bigtop-utils.default.j2 dest=/etc/default/bigtop-utils owner=root group=root mode=444
tags: [ 'hadoop', 'configuration' ]
- name: copy /etc/hadoop/conf.empty to {{ hadoop_conf_dir }}
command: creates={{ hadoop_conf_dir }} cp -R -p /etc/hadoop/conf.empty {{ hadoop_conf_dir }}
tags: [ 'hadoop', 'configuration' ]
- name: run 'update-alternatives' to set our hadoop configuration directory
alternatives: name=hadoop-conf link=/etc/hadoop/conf path={{ hadoop_conf_dir }}
tags: [ 'hadoop', 'configuration' ]
- name: Install the common configuration files
template: src={{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=444
with_items:
- slaves
- masters
- log4j.properties
tags: [ 'hadoop', 'configuration', 'log4j', 'hadoop_workers' ]
- name: Install the shared configuration files
template: src=templates/{{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=444
with_items:
- hadoop-env.sh
tags: [ 'hadoop', 'configuration' ]
- name: Install the mapreduce scheduler configuration file
template: src=templates/{{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=444
with_items:
- fair-scheduler.xml
when: mapred_use_fair_scheduler
tags: [ 'hadoop', 'configuration', 'scheduler' ]

@ -0,0 +1,238 @@
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Define some default values that can be overridden by system properties
#hadoop.root.logger=INFO,console
{% if hadoop_send_to_logstash %}
hadoop.root.logger={{ hadoop_log_level }},{{ hadoop_logstash_appender }}
{% else %}
hadoop.root.logger={{ hadoop_log_level }},{{ hadoop_log_appender }}
{% endif %}
hadoop.log.dir=.
hadoop.log.file=hadoop.log
# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter
# Logging Threshold
log4j.threshold=ALL
# Null Appender
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
#
# Rolling File Appender - cap space usage at 5gb.
#
hadoop.log.maxfilesize={{ hadoop_log_appender_max_filesize }}
hadoop.log.maxbackupindex={{ hadoop_log_appender_max_backupindex }}
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# Daily Rolling File Appender
#
log4j.appender.DRFA=org.apache.log4j.RollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Rollover at midnight
#log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.DRFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# console
# Add "console" to rootlogger above if you want to use this
#
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
#
# TaskLog Appender
#
#Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN
#
#Security appender
#
#hadoop.security.logger={{ hadoop_log_level }},NullAppender
hadoop.security.logger={{ hadoop_log_level }},RFAS
hadoop.security.log.maxfilesize=64MB
hadoop.security.log.maxbackupindex=4
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth-${user.name}.audit
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}
#
# (fake) Daily Rolling Security appender
#
log4j.appender.DRFAS=org.apache.log4j.RollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
#log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd
log4j.appender.DRFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.DRFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}
#
# hdfs audit logging
#
hdfs.audit.logger={{ hadoop_log_level }},RFAAUDIT
hdfs.audit.log.maxfilesize=64MB
hdfs.audit.log.maxbackupindex=4
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
#
# mapred audit logging
#
mapred.audit.logger={{ hadoop_log_level }},MRAUDIT
mapred.audit.log.maxfilesize=64MB
mapred.audit.log.maxbackupindex=4
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}
# Custom Logging levels
log4j.logger.org.apache.hadoop.mapred.JobTracker={{ hadoop_log_level }}
log4j.logger.org.apache.hadoop.mapred.TaskTracker={{ hadoop_log_level }}
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit={{ hadoop_log_level }}
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
#
# Job Summary Appender
#
# Use following logger to send summary to separate file defined by
# hadoop.mapreduce.jobsummary.log.file :
# hadoop.mapreduce.jobsummary.logger={{ hadoop_log_level }},JSA
#
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB
hadoop.mapreduce.jobsummary.log.maxbackupindex=20
log4j.appender.JSA=org.apache.log4j.RollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file}
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize}
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex}
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false
#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
#yarn.server.resourcemanager.appsummary.logger={{ hadoop_log_level }},RMSUMMARY
# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
# - hadoop.log.dir (Hadoop Log directory)
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file}
#log4j.appender.RMSUMMARY.MaxFileSize=256MB
#log4j.appender.RMSUMMARY.MaxBackupIndex=20
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
{% if hadoop_send_to_logstash %}
log4j.appender.LOGSTASH=org.apache.log4j.net.SocketAppender
log4j.appender.LOGSTASH.remoteHost={{ hadoop_logstash_collector_host }}
log4j.appender.LOGSTASH.port={{ hadoop_logstash_collector_socketappender_port }}
log4j.appender.LOGSTASH.ReconnectionDelay={{ hadoop_logstash_collector_socketappender_reconndelay }}
log4j.appender.LOGSTASH.LocationInfo=true
log4j.appender.LOGSTASH.layout = org.apache.log4j.PatternLayout
log4j.appender.LOGSTASH.layout.ConversionPattern = %d [%t] %-5p %c- %m%n
{% endif %}

@ -0,0 +1,3 @@
{% for host in groups['hdfs_masters'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,3 @@
{% for host in groups['hadoop_worker_nodes'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,9 @@
---
- name: Install the HDFS datanode packages
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-hdfs-datanode
tags:
- hadoop
- hdfs

@ -0,0 +1,8 @@
---
# TODO: find a better condition for the restart
#
- name: Restart HDFS datanode
service: name=hadoop-hdfs-datanode state=restarted sleep=20
when: ansible_fqdn | match ("node.*\.t\.hadoop\.research\-infrastructures\.eu")
ignore_errors: true

@ -0,0 +1,69 @@
---
# HDFS datanode
- name: Ensure that the hdfs data disks are mounted
mount: name={{ item.mountpoint }} src=/dev/{{ item.device }} fstype={{ item.fstype }} state=mounted
with_items: '{{ hadoop_hdfs_data_disk }}'
when: ansible_fqdn | match ("node.*\.t\.hadoop\.research\-infrastructures\.eu")
tags:
- hdfs
- datanodes
- disk
- name: Set the vm.swappiness parameter
sysctl: name=vm.swappiness value={{ worker_node_swappiness }} sysctl_file=/etc/sysctl.d/90-swappiness.conf state=present reload=yes
when: ansible_fqdn | match ("node.*\.t\.hadoop\.research\-infrastructures\.eu")
tags:
- hdfs
- datanodes
- name: Install the HDFS datanode config files.
template: src=templates/{{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=0444
with_items:
- core-site.xml
notify:
- Restart HDFS datanode
tags:
- datanode
- hdfs
- name: Ensure that the hdfs shortcircuit cache directory exists
file: dest={{ hdfs_read_shortcircuit_cache_dir }} state=directory owner=hdfs group=hdfs mode=0755
when:
- ansible_fqdn | match ("node.*\.t\.hadoop\.research\-infrastructures\.eu")
- hdfs_read_shortcircuit
tags:
- datanode
- hdfs
- name: Install the HDFS datanode hdfs-site.xml config file
template: src=templates/datanode-hdfs-site.xml.j2 dest={{ hadoop_conf_dir }}/hdfs-site.xml owner=root group=root mode=0444
notify:
- Restart HDFS datanode
tags:
- datanode
- hdfs
- hdfs_site
- name: Check if the hadoop-jmx conf directory exists
stat: path=/etc/hadoop-jmx/conf
register: check_jmx_confdir
- name: Distribute the jmx authorization files for hadoop hdfs
template: src=templates/jmxremote.passwd.j2 dest=/etc/hadoop-jmx/conf/jmxremote.passwd owner=hdfs mode=0600
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart HDFS datanode
tags:
- hadoop
- hdfs
- jmx
- name: Distribute the jmx role files for hadoop hdfs
copy: src=files/jmxremote.access dest=/etc/hadoop-jmx/conf/jmxremote.access owner=root mode=0644
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart HDFS datanode
tags:
- hadoop
- hdfs
- jmx

@ -0,0 +1,4 @@
---
- name: Start HDFS journal
service: name=hadoop-hdfs-journalnode state=started enabled=yes
ignore_errors: True

@ -0,0 +1,19 @@
---
- name: Directories for the hdfs journal under /data/dfs
file: dest={{ hdfs_data_dir }}/{{ item }} state=directory owner=hdfs group=hdfs mode=700
with_items:
- '{{ hdfs_journal_data_dir }}'
tags:
- hdfs
- journal
- name: Install the journalnode server package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-hdfs-journalnode
notify:
- Start HDFS journal
tags:
- hdfs
- journal

@ -0,0 +1,9 @@
---
- name: Install the HDFS namenode package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-hdfs-namenode
tags:
- hadoop
- hdfs

@ -0,0 +1,8 @@
node13.t.hadoop.research-infrastructures.eu
node12.t.hadoop.research-infrastructures.eu
node11.t.hadoop.research-infrastructures.eu
node10.t.hadoop.research-infrastructures.eu
node9.t.hadoop.research-infrastructures.eu
node8.t.hadoop.research-infrastructures.eu
node7.t.hadoop.research-infrastructures.eu
node6.t.hadoop.research-infrastructures.eu

@ -0,0 +1,5 @@
---
- name: Restart HDFS namenode
service: name=hadoop-hdfs-namenode state=restarted sleep=20
ignore_errors: true

@ -0,0 +1,63 @@
---
# HDFS namenode
- name: Install the hdfs config files.
template: src=templates/{{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=0444
with_items:
- core-site.xml
notify:
- Restart HDFS namenode
tags:
- namenode
- hdfs
- name: Install the namenode hdfs-site.xml config file
template: src=templates/namenode-{{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=0444
with_items:
- hdfs-site.xml
notify:
- Restart HDFS namenode
tags:
- namenode
- hdfs
- hdfs_site
- name: Install the dfs hosts allow file
template: src=dfs_hosts_allow.txt.j2 dest={{ hadoop_conf_dir }}/dfs_hosts_allow.txt owner=root group=root mode=0444
notify:
- Restart HDFS namenode
tags:
- namenode
- hdfs
- hadoop_workers
- name: Install the dfs hosts exclude file
copy: src=dfs_hosts_exclude.txt dest={{ hadoop_conf_dir }}/dfs_hosts_exclude.txt owner=root group=root mode=0444
tags:
- namenode
- hdfs
- hadoop_workers
- hadoop_exclude
- name: Check if the hadoop-jmx conf directory exists
stat: path=/etc/hadoop-jmx/conf
register: check_jmx_confdir
- name: Distribute the jmx authorization files for hadoop hdfs
template: src=templates/jmxremote.passwd.j2 dest=/etc/hadoop-jmx/conf/jmxremote.passwd owner=hdfs mode=0600
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart HDFS namenode
tags:
- hadoop
- hdfs
- jmx
- name: Distribute the jmx role files for hadoop hdfs
copy: src=files/jmxremote.access dest=/etc/hadoop-jmx/conf/jmxremote.access owner=root mode=0644
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart HDFS namenode
tags:
- hadoop
- hdfs
- jmx

@ -0,0 +1,3 @@
{% for host in groups['hadoop_worker_nodes'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,20 @@
---
# Manage the hdfs ssh keys used by the HDFS HA
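# Flow: generate an ssh key pair for the hdfs user on each namenode, fetch the public
# keys to the control host under /var/tmp, then authorize the nn1 and nn2 keys so the
# namenodes can reach each other as the hdfs user during automatic failover.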
- name: Create a ssh key for the hdfs user. Needed by NN automatic failover
user: name=hdfs generate_ssh_key=yes ssh_key_type=rsa ssh_key_bits=2048
tags:
- hdfs-ssh
- name: Fetch the ssh public key. Needed to populate authorized_keys
fetch: src=/usr/lib/hadoop/.ssh/id_rsa.pub dest=/var/tmp/prefix-hdfs-{{ ansible_fqdn }}-id_rsa.pub fail_on_missing=yes flat=yes
tags:
- hdfs-ssh
- name: Authorize the hdfs user ssh key. Needed by NN automatic failover
authorized_key: user=hdfs key="{{ lookup('file', '/var/tmp/prefix-hdfs-' + item + '.t.hadoop.research-infrastructures.eu-id_rsa.pub') }}"
with_items:
- nn1
- nn2
tags:
- hdfs-ssh

@ -0,0 +1,13 @@
---
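# For every configured user: create the staging directory on HDFS, fix its ownership
# and permissions, and drop a marker file under /var/lib/hadoop-hdfs so the 'creates'
# argument keeps the task idempotent on later runs.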
- name: Activate the users on hdfs, with the right permissions.
become: yes
become_user: hdfs
shell: . /etc/profile.d/jdk.sh ; hadoop fs -mkdir {{ mapred_staging_root_dir }}/{{ item.login }} ; hadoop fs -chown {{ item.login }}:{{ item.login }} {{ mapred_staging_root_dir }}/{{ item.login }} ; hadoop fs -chmod 755 {{ mapred_staging_root_dir }}/{{ item.login }} ; touch /var/lib/hadoop-hdfs/.{{ item.login }}
with_items: '{{ hadoop_users }}'
args:
creates: '/var/lib/hadoop-hdfs/.{{ item.login }}'
register: create_user
tags:
- hdfs
- users
- hadoop_users

@ -0,0 +1,10 @@
---
- name: Install the HDFS zkfc package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-hdfs-zkfc
tags:
- hadoop
- hdfs
- zkfc

@ -0,0 +1,44 @@
---
- name: Create a /tmp directory on HDFS
become: True
become_user: hdfs
command: hadoop fs -mkdir /tmp
tags:
- hadoop
- hdfs
- name: Fix the HDFS /tmp directory permissions
command: hadoop fs -chmod -R 1777 /tmp
tags:
- hadoop
- hdfs
- name: Create a /user/history directory on HDFS
command: hadoop fs -mkdir /user/history
tags:
- hadoop
- hdfs
- name: Fix the HDFS /user/history permissions
command: hadoop fs -chmod -R 1777 /user/history
tags:
- hadoop
- hdfs
- name: Fix the /user/history directory owner and group
command: hadoop fs -chown mapred:supergroup /user/history
tags:
- hadoop
- hdfs
- name: Create the /hbase directory on HDFS
command: hadoop fs -mkdir /hbase
tags:
- hadoop
- hdfs
- name: Fix the /hbase directory ownership
command: hadoop fs -chown hbase:hbase /hbase
tags:
- hadoop
- hdfs

@ -0,0 +1,19 @@
---
- name: Create the 'supergroup' user group
group: name={{ hdfs_users_supergroup }} state=present
tags:
- hadoop
- hdfs
- name: Create the /data/dfs directory on the NN filesystem
file: path={{ hdfs_data_dir }} owner=root group=root state=directory
tags:
- hadoop
- hdfs
- name: Create the /data/dfs/nn directory on the NN filesystem
file: path={{ hdfs_data_dir }}/{{ hdfs_nn_data_dir }} owner=hdfs group=hdfs state=directory
tags:
- hadoop
- hdfs

@ -0,0 +1,9 @@
---
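# The 'creates' argument points at the VERSION file written by 'hdfs namenode -format',
# so the destructive format command is skipped once the namenode metadata exists.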
- name: Format the namenode, if it's not already formatted. Runs on the first namenode only
become: True
become_user: hdfs
command: creates={{ hdfs_data_dir }}/{{ hdfs_nn_data_dir }}/current/VERSION hdfs namenode -format -force
tags:
- hadoop
- hdfs

@ -0,0 +1,15 @@
---
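# Runs on the standby namenode: wait until the primary namenode answers on its HTTP
# port, then copy its metadata with 'hdfs namenode -bootstrapStandby'.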
- name: Wait for the first namenode before doing anything
wait_for: host={{ hostvars[groups['hdfs_primary_master'][0]].ipv4_address|default(hostvars[groups['hdfs_primary_master'][0]].ansible_default_ipv4.address) }} port={{ hdfs_nn_http_port }}
tags:
- hadoop
- hdfs
- name: Bootstrap the second namenode
become: True
become_user: hdfs
command: hdfs namenode -bootstrapStandby
tags:
- hadoop
- hdfs

@ -0,0 +1,6 @@
---
- name: Restart hdfs namenode
service: name=hadoop-hdfs-namenode state=restarted
- name: Restart hdfs zkfc
service: name=hadoop-hdfs-zkfc state=restarted

@ -0,0 +1,11 @@
---
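# 'hdfs zkfc -formatZK -force' (re)creates the ZooKeeper znode used by the automatic
# failover controllers; the handlers then restart zkfc and the namenode.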
- name: Format zkfc
become: True
become_user: hdfs
command: hdfs zkfc -formatZK -force
notify:
- Restart hdfs zkfc
- Restart hdfs namenode
tags:
- hadoop
- hdfs

@ -0,0 +1,12 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.queue.default.acl-submit-job</name>
<value>*</value>
</property>
<property>
<name>mapred.queue.default.acl-administer-jobs</name>
<value> </value>
</property>
</configuration>

@ -0,0 +1,8 @@
node13.t.hadoop.research-infrastructures.eu
node12.t.hadoop.research-infrastructures.eu
node11.t.hadoop.research-infrastructures.eu
node10.t.hadoop.research-infrastructures.eu
node9.t.hadoop.research-infrastructures.eu
node8.t.hadoop.research-infrastructures.eu
node7.t.hadoop.research-infrastructures.eu
node6.t.hadoop.research-infrastructures.eu

@ -0,0 +1,117 @@
<?xml version="1.0"?>
<!-- This is the configuration file for the resource manager in Hadoop. -->
<!-- You can configure various scheduling parameters related to queues. -->
<!-- The properties for a queue follow a naming convention,such as, -->
<!-- mapred.capacity-scheduler.queue.<queue-name>.property-name. -->
<configuration>
<property>
<name>mapred.capacity-scheduler.queue.default.capacity</name>
<value>100</value>
<description>Percentage of the number of slots in the cluster that are
to be available for jobs in this queue.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.queue.default.maximum-capacity</name>
<value>-1</value>
<description>
maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster.
This provides a means to limit how much excess capacity a queue can use. By default, there is no limit.
The maximum-capacity of a queue can only be greater than or equal to its minimum capacity.
Default value of -1 implies a queue can use complete capacity of the cluster.
This property can be used to keep long-running jobs from occupying more than a
certain percentage of the cluster, which, in the absence of pre-emption, could affect
the capacity guarantees of other queues.
One important thing to note is that maximum-capacity is a percentage, so based on the
cluster's capacity it would change. So if a large number of nodes or racks get added
to the cluster, the maximum capacity in absolute terms would increase accordingly.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.queue.default.supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
<value>100</value>
<description> Each queue enforces a limit on the percentage of resources
allocated to a user at any given time, if there is competition for them.
This user limit can vary between a minimum and maximum value. The former
depends on the number of users who have submitted jobs, and the latter is
set to this property value. For example, suppose the value of this
property is 25. If two users have submitted jobs to a queue, no single
user can use more than 50% of the queue resources. If a third user submits
a job, no single user can use more than 33% of the queue resources. With 4
or more users, no user can use more than 25% of the queue's resources. A
value of 100 implies no user limits are imposed.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-jobs-per-user</name>
<value>2</value>
<description>The maximum number of jobs to be pre-initialized for a user
of the job queue.
</description>
</property>
<!-- The default configuration settings for the capacity task scheduler -->
<!-- The default values would be applied to all the queues which don't have -->
<!-- the appropriate property for the particular queue -->
<property>
<name>mapred.capacity-scheduler.default-supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions by default in a job queue.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.default-minimum-user-limit-percent</name>
<value>100</value>
<description>The percentage of the resources limited to a particular user
for the job queue at any given point of time by default.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.default-maximum-initialized-jobs-per-user</name>
<value>2</value>
<description>The maximum number of jobs to be pre-initialized for a user
of the job queue.
</description>
</property>
<!-- Capacity scheduler Job Initialization configuration parameters -->
<property>
<name>mapred.capacity-scheduler.init-poll-interval</name>
<value>5000</value>
<description>The amount of time in milliseconds which is used to poll
the job queues for jobs to initialize.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.init-worker-threads</name>
<value>5</value>
<description>Number of worker threads used by the initialization
poller to initialize jobs in a set of queues.
If this number is equal to the number of job queues, each thread
initializes jobs in a single queue. If it is lower, a thread gets a
set of queues assigned. If it is higher, the number of threads is
capped at the number of job queues.
</description>
</property>
</configuration>

@ -0,0 +1,3 @@
<?xml version="1.0"?>
<allocations>
</allocations>

@ -0,0 +1,4 @@
---
- name: Restart mapreduce HA jobtracker
service: name=hadoop-0.20-mapreduce-jobtrackerha state=restarted sleep=20
ignore_errors: true

@ -0,0 +1,48 @@
---
- name: Install the mapred-site config file for the jobtracker HA
template: src=mapred-site-jobtracker.j2 dest=/etc/hadoop/conf/mapred-site.xml owner=root group=root mode=0444
notify:
- Restart mapreduce HA jobtracker
tags: [ 'hadoop', 'jobtracker', 'jt_conf' ]
- name: Install the mapred-queue-acls config
copy: src=mapred-queue-acls.xml dest=/etc/hadoop/conf/mapred-queue-acls.xml owner=root group=root mode=0444
notify:
- Restart mapreduce HA jobtracker
tags: [ 'hadoop', 'jobtracker' ]
- name: Install the mapreduce schedulers config
copy: src=mapreduce-{{ item }} dest=/etc/hadoop/conf/{{ item }} owner=root group=root mode=0444
with_items:
- fair-scheduler.xml
- capacity-scheduler.xml
tags: [ 'hadoop', 'jobtracker', 'jt_scheduler' ]
- name: Install the mapred_hosts_allow config
template: src=mapred_hosts_allow.txt.j2 dest=/etc/hadoop/conf/mapred_hosts_allow.txt owner=root group=root mode=0444
notify:
- Restart mapreduce HA jobtracker
tags: [ 'hadoop', 'jobtracker', 'hadoop_workers' ]
- name: Install the mapred_hosts_exclude config
copy: src=mapred_hosts_exclude.txt dest=/etc/hadoop/conf/mapred_hosts_exclude.txt owner=root group=root mode=0444
tags: [ 'hadoop', 'jobtracker', 'hadoop_workers', 'hadoop_exclude' ]
- name: Check if the hadoop-jmx conf directory exists
stat: path=/etc/hadoop-jmx/conf
register: check_jmx_confdir
tags: [ 'hadoop', 'jobtracker', 'jmx' ]
- name: Distribute the jmx authorization files for the jobtracker
template: src=templates/jmxremote.passwd.j2 dest=/etc/hadoop-jmx/conf/jmxremote.passwd owner=mapred mode=0600
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart mapreduce HA jobtracker
tags: [ 'hadoop', 'jobtracker', 'jmx' ]
- name: Distribute the jmx role files for hadoop jobtracker
copy: src=files/jmxremote.access dest=/etc/hadoop-jmx/conf/jmxremote.access owner=root mode=0644
when: check_jmx_confdir.stat.exists and hadoop_jmx_enabled
notify:
- Restart mapreduce HA jobtracker
tags: [ 'hadoop', 'jobtracker', 'jmx' ]

@ -0,0 +1,219 @@
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
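<!-- Jobtracker HA template: a single logical cluster id backed by two jobtracker
instances; the failover proxy and retry properties below tell clients how to switch
between them. -->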
<property>
<name>mapred.job.tracker</name>
<value>{{ jobtracker_cluster_id }}</value>
</property>
<property>
<name>mapred.jobtrackers.{{ jobtracker_cluster_id }}</name>
<value>{{ jobtracker_cluster_id_1 }},{{ jobtracker_cluster_id_2 }}</value>
<description>Comma-separated list of JobTracker IDs.</description>
</property>
<property>
<name>mapred.jobtracker.rpc-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_1 }}</name>
<!-- RPC address for {{ jobtracker_cluster_id_1 }} -->
<value>{{ jobtracker_node_1_hostname }}:{{ jobtracker_cluster_id1_rpc_port }}</value>
</property>
<property>
<name>mapred.jobtracker.rpc-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_2 }}</name>
<!-- RPC address for {{ jobtracker_cluster_id_2 }} -->
<value>{{ jobtracker_node_2_hostname }}:{{ jobtracker_cluster_id2_rpc_port }}</value>
</property>
<property>
<name>mapred.job.tracker.http.address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_1 }}</name>
<!-- HTTP bind address for {{ jobtracker_cluster_id_1 }} -->
<value>0.0.0.0:{{ jobtracker_cluster_id1_http_port }}</value>
</property>
<property>
<name>mapred.job.tracker.http.address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_2 }}</name>
<!-- HTTP bind address for {{ jobtracker_cluster_id_2 }} -->
<value>0.0.0.0:{{ jobtracker_cluster_id2_http_port }}</value>
</property>
<property>
<name>mapred.ha.jobtracker.rpc-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_1 }}</name>
<!-- RPC address for {{ jobtracker_cluster_id_1 }} HA daemon -->
<value>{{ jobtracker_node_1_hostname }}:{{ jobtracker_cluster_id1_ha_rpc_port }}</value>
</property>
<property>
<name>mapred.ha.jobtracker.rpc-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_2 }}</name>
<!-- RPC address for {{ jobtracker_cluster_id_2 }} HA daemon -->
<value>{{ jobtracker_node_2_hostname }}:{{ jobtracker_cluster_id2_ha_rpc_port }}</value>
</property>
<property>
<name>mapred.ha.jobtracker.http-redirect-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_1 }}</name>
<!-- HTTP redirect address for {{ jobtracker_cluster_id_1 }} -->
<value>{{ jobtracker_node_1_hostname }}:{{ jobtracker_cluster_id1_http_port }}</value>
</property>
<property>
<name>mapred.ha.jobtracker.http-redirect-address.{{ jobtracker_cluster_id }}.{{ jobtracker_cluster_id_2 }}</name>
<!-- HTTP redirect address for {{ jobtracker_cluster_id_2 }} -->
<value>{{ jobtracker_node_2_hostname }}:{{ jobtracker_cluster_id2_http_port }}</value>
</property>
<property>
<name>mapred.jobtracker.restart.recover</name>
<value>{{ jobtracker_restart_recover }}</value>
</property>
<property>
<name>mapred.client.failover.proxy.provider.{{ jobtracker_cluster_id }}</name>
<value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>mapred.client.failover.max.attempts</name>
<value>15</value>
</property>
<property>
<name>mapred.client.failover.sleep.base.millis</name>
<value>500</value>
</property>
<property>
<name>mapred.client.failover.sleep.max.millis</name>
<value>1500</value>
</property>
<property>
<name>mapred.client.failover.connection.retries</name>
<value>{{ jobtracker_failover_connect_retries }}</value>
</property>
<property>
<name>mapred.client.failover.connection.retries.on.timeouts</name>
<value>{{ jobtracker_failover_connect_retries }}</value>
</property>
<property>
<name>mapred.ha.fencing.methods</name>
<!-- We don't need a real fencing command (?) -->
<value>shell(/bin/true)</value>
</property>
<property>
<name>mapred.ha.automatic-failover.enabled</name>
<value>{{ jobtracker_auto_failover_enabled }}</value>
</property>
<property>
<name>mapred.ha.zkfc.port</name>
<value>{{ jobtracker_zkfc_port }}</value>
</property>
<property>
<name>mapred.hosts</name>
<value>/etc/hadoop/conf/mapred_hosts_allow.txt</value>
</property>
<property>
<name>mapred.hosts.exclude</name>
<value>/etc/hadoop/conf/mapred_hosts_exclude.txt</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/tmp/mapred/system</value>
</property>
<property>
<name>mapreduce.jobtracker.staging.root.dir</name>
<value>{{ mapred_staging_root_dir }}</value>
</property>
<property>
<name>mapred.acls.enabled</name>
<value>false</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/data/mapred/jt</value>
</property>
<property>
<name>hadoop.job.history.location</name>
<!-- <value>file:////var/log/hadoop-0.20-mapreduce/history</value> -->
<value>/jobtracker/history</value>
</property>
{% if mapred_use_fair_scheduler %}
<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
<name>mapred.fairscheduler.allocation.file</name>
<value>{{ mapred_fair_scheduler_allocation_file }}</value>
</property>
{% if mapred_fair_scheduler_use_poolnameproperty %}
<property>
<name>mapred.fairscheduler.poolnameproperty</name>
<value>{{ mapred_fair_scheduler_poolnameproperty }}</value>
</property>
<property>
<name>mapred.fairscheduler.allow.undeclared.pools</name>
<value>{{ mapred_fair_scheduler_undecl_pools }}</value>
</property>
{% endif %}
<property>
<name>mapred.fairscheduler.weight.adjuster</name>
<value></value>
</property>
<property>
<name>mapred.fairscheduler.assignmultiple</name>
<value>{{ mapred_fair_scheduler_assignmultiple }}</value>
</property>
<property>
<name>mapred.fairscheduler.preemption</name>
<value>{{ mapred_fair_scheduler_preemption }}</value>
</property>
{% endif %}
<property>
<name>mapred.job.tracker.handler.count</name>
<value>{{ jobtracker_handler_count }}</value>
</property>
<property>
<name>mapred.reduce.slowstart.completed.maps</name>
<value>{{ mapred_reduce_slowstart_maps }}</value>
</property>
<property>
<name>mapreduce.jobtracker.split.metainfo.maxsize</name>
<value>{{ mapreduce_jt_split_metainfo_maxsize }}</value>
</property>
<property>
<name>mapred.user.jobconf.limit</name>
<value>{{ mapred_user_jobconf_limit }}</value>
</property>
<property>
<name>mapreduce.job.counters.max</name>
<value>{{ mapreduce_job_counters_max }}</value>
</property>
<property>
<name>mapred.jobtracker.retirejob.interval</name>
<value>{{ mapred_jt_retirejob_interval }}</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>{{ jobtracker_persistent_jobstatus }}</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>{{ mapred_jt_persist_jobstatus_hours }}</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.dir</name>
<value>/jobtracker/jobsInfo</value>
</property>
<property>
<name>mapred.jobtracker.completeuserjobs.maximum</name>
<value>{{ mapred_jt_completeuserjobs_max }}</value>
</property>
<property>
<name>mapred.job.restart.recover</name>
<value>{{ jobtracker_restart_recover }}</value>
</property>
<property>
<name>hadoop.rpc.socket.factory.class.JobSubmissionProtocol</name>
<value></value>
</property>
<property>
<name>mapred.jobtracker.plugins</name>
<value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
</property>
<property>
<name>mapred.queue.names</name>
<value>{{ mapred_queue_names }}</value>
</property>
<property>
<name>jobtracker.thrift.address</name>
<value>0.0.0.0:{{ jobtracker_http_port }}</value>
</property>
</configuration>

@ -0,0 +1,3 @@
{% for host in groups['mapred_tasktrackers'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,33 @@
---
- name: Remove the non HA jobtracker package
apt: pkg={{ item }} state=absent
with_items:
- hadoop-0.20-mapreduce-jobtracker
tags:
- jobtracker
- mapred
- name: Install the HA jobtracker package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-0.20-mapreduce-jobtrackerha
tags:
- jobtracker
- mapred
- name: Install the hue-plugins package, needed to access the jobtracker from hue
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hue-plugins
tags:
- jobtracker
- mapred
- name: Install the dsh package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- dsh
tags:
- jobtracker
- mapred

@ -0,0 +1,9 @@
---
- name: Install the ZKFC jobtracker package
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-0.20-mapreduce-zkfc
tags:
- jobtracker
- mapred
- jt_zkfc

@ -0,0 +1,17 @@
---
- name: Install the mapred tasktracker
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- hadoop-0.20-mapreduce
- hadoop-0.20-mapreduce-tasktracker
tags:
- tasktracker
- mapred
- name: Install non hadoop packages needed by some hadoop jobs
apt: pkg={{ item }} state={{ hadoop_pkg_state }}
with_items:
- python-apsw
tags:
- tasktracker
- mapred

@ -0,0 +1,5 @@
---
- name: Restart mapreduce tasktracker
service: name=hadoop-0.20-mapreduce-tasktracker state=restarted sleep=20
ignore_errors: true

@ -0,0 +1,11 @@
---
- name: Install the mapred-site config file for the mapreduce tasktracker
template: src=templates/mapred-site-tasktracker.xml.j2 dest=/etc/hadoop/conf/mapred-site.xml owner=root group=root mode=0444
notify:
- Restart mapreduce tasktracker
tags:
- hadoop
- tasktracker
- mapred_conf

@ -0,0 +1,5 @@
export PATH="/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin:$PATH"
function restart_ntp() {
restart-ntp
}

@ -0,0 +1,3 @@
---
dependencies:
- role: '../../library/roles/nginx'

@ -0,0 +1,79 @@
---
- name: Create the configuration directory for dsh
file: path=/root/.dsh/group owner=root group=root state=directory
- name: Install the dsh host groups
template: src=dsh-{{ item }}.j2 dest=/root/.dsh/group/{{ item }}
with_items:
- quorum
- namenodes
- datanodes
- hbase-master
- jobtrackers
tags:
- system
# Install the global start/stop/restart scripts. jobtracker drives all the other nodes
- name: Install the Hadoop cluster start/stop scripts
template: src={{ item }}.j2 dest=/usr/local/bin/{{ item }} owner=root group=root mode=555
with_items:
- service-hdfs-journalnode
- service-hdfs-zkfc
- service-hdfs-namenode
- service-hdfs-secondarynamenode
- service-hdfs-datanode
- service-hdfs-httpfs
- service-zookeeper-server
- service-hbase-master
- service-hbase-regionserver
- service-hbase-rest
- service-hbase-thrift
- service-mapreduce-jobtracker
- service-mapreduce-jobtracker-zkfc
- service-mapreduce-tasktracker
- service-global-hadoop-cluster
- service-global-hbase
- service-global-mapred
- service-global-hdfs
tags:
- system
- name: Install the shell functions library
copy: src={{ item }}.sh dest=/usr/local/lib/{{ item }} owner=root group=root mode=444
with_items:
- service-hadoop-common-functions
tags:
- system
- name: Another name for the zookeeper script
file: src=/usr/local/bin/service-zookeeper-server dest=/usr/local/bin/service-global-zookeeper state=link
tags:
- system
- name: update nginx config
template: src={{ portal_nginx_conf }}-nginx.conf.j2 dest=/etc/nginx/sites-available/{{ portal_nginx_conf }}
notify: Reload nginx
tags:
- portal
- name: symlink nginx config
file: src=/etc/nginx/sites-available/{{ portal_nginx_conf }} dest=/etc/nginx/sites-enabled/{{ portal_nginx_conf }} state=link
notify: Reload nginx
tags:
- portal
- name: Create the web root if it doesn't exist
file: dest={{ portal_web_root }} state=directory
tags:
- portal
- name: Create a fake favicon
copy: content="" dest={{ portal_web_root }}/favicon.ico owner=root group=root mode=0444
tags:
- portal
- name: Install the index file
template: src=management-portal-index.html.j2 dest={{ portal_web_root }}/index.html mode=444
tags:
- portal

@ -0,0 +1,3 @@
{% for host in groups['hadoop_worker_nodes'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,3 @@
{% for host in groups['hbase_masters'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,3 @@
{% for host in groups['jt_masters'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,3 @@
{% for host in groups['hdfs_masters'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,3 @@
{% for host in groups['zookeeper_cluster'] %}
{{ host }}
{% endfor %}

@ -0,0 +1,49 @@
<html>
<head>
<title>
{{ portal_title }}
</title>
</head>
<body>
<ul> <h2>HUE interface</h2>
{% for host in groups['hue'] %}
<li><a href="http://{{ host }}:{{ hue_http_port }}/jobbrowser">HUE interface</a></li>
{% endfor %}
</ul>
<ul> <h2>Jobtracker</h2>
{% for host in groups['jt_masters'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['jt_http'] }}">{{ host }} Jobtracker master</a></li>
{% endfor %}
</ul>
<ul> <h2>Mapred tasktrackers</h2>
{% for host in groups['hadoop_worker_nodes'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['mapred_http'] }}">{{ host }} mapred tasktracker</a></li>
{% endfor %}
</ul>
<ul> <h2>HDFS namenode</h2>
{% for host in groups['hdfs_masters'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['hdfs_m_http'] }}">{{ host }} HDFS namenode</a></li>
{% endfor %}
</ul>
<ul> <h2>HDFS datanodes</h2>
{% for host in groups['hadoop_worker_nodes'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['hdfs_http'] }}">{{ host }} HDFS datanode</a></li>
{% endfor %}
</ul>
<ul> <h2>HBASE master</h2>
{% for host in groups['hbase_masters'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['hbase_m_http'] }}">{{ host }} HBASE master</a></li>
{% endfor %}
</ul>
<ul> <h2>HBASE regionservers</h2>
{% for host in groups['hadoop_worker_nodes'] %}
<li><a href="http://{{ jobtracker_node_1_hostname }}:{{ hostvars[host]['hbase_http'] }}">{{ host }} HBASE regionserver</a></li>
{% endfor %}
</ul>
<ul> <h2>Logstash collector</h2>
{% for host in groups['logstash'] %}
<li><a href="http://{{ host }}/">{{ host }} Logstash collector</a></li>
{% endfor %}
</ul>
</body>
</html>

@ -0,0 +1,279 @@
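# Management portal vhost: the default server serves the static index and proxies the
# HUE application paths; each daemon web UI (jobtracker, tasktrackers, namenodes,
# datanodes, HBase) gets a dedicated listen port below, proxied with PAM auth in front.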
server {
root {{ portal_web_root }};
index index.html;
gzip on;
gzip_disable "MSIE [1-6]\.(?!.*SV1)";
gzip_types text/javascript text/css application/x-javascript application/javascript application/json image/svg+xml;
gzip_vary on;
gzip_proxied any;
server_name {{ ansible_fqdn }};
location / {
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
# HUE is a real mess
{% if hue_servers is defined %}
{% for host in groups['hue_servers'] %}
location /jobbrowser {
proxy_pass http://{{ host }}:{{ hue_http_port }}/jobbrowser;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "nginx";
}
location /accounts {
proxy_pass http://{{ host }}:{{ hue_http_port }}/accounts;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /beeswax {
proxy_pass http://{{ host }}:{{ hue_http_port }}/beeswax;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /oozie {
proxy_pass http://{{ host }}:{{ hue_http_port }}/oozie;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /help {
proxy_pass http://{{ host }}:{{ hue_http_port }}/help;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /static {
proxy_pass http://{{ host }}:{{ hue_http_port }}/static;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /jobsub {
proxy_pass http://{{ host }}:{{ hue_http_port }}/jobsub;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /shell {
proxy_pass http://{{ host }}:{{ hue_http_port }}/shell;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /useradmin {
proxy_pass http://{{ host }}:{{ hue_http_port }}/useradmin;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
location /filebrowser {
proxy_pass http://{{ host }}:{{ hue_http_port }}/filebrowser;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Access";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
}
{% endfor %}
{% endif %}
}
# HUE
{% for host in groups['hue'] %}
server {
listen {{ hostvars[host]['hue_http'] }};
location / {
proxy_pass http://{{ host }}:{{ hue_http_port }};
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# Jobtracker HA masters
{% for host in groups['jt_masters'] %}
server {
listen {{ hostvars[host]['jt_http'] }};
location / {
proxy_pass http://{{ host }}:{{ jobtracker_cluster_id1_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# Map/Reduce tasktrackers
{% for host in groups['hadoop_worker_nodes'] %}
server {
listen {{ hostvars[host]['mapred_http'] }};
location / {
proxy_pass http://{{ host }}:{{ mapred_tasktracker_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# HDFS masters
{% for host in groups['hdfs_masters'] %}
server {
listen {{ hostvars[host]['hdfs_m_http'] }};
location / {
proxy_pass http://{{ host }}:{{ hdfs_nn_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# HDFS datanodes
{% for host in groups['hadoop_worker_nodes'] %}
server {
listen {{ hostvars[host]['hdfs_http'] }};
location / {
proxy_pass http://{{ host }}:{{ hdfs_datanode_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# HBASE masters
{% for host in groups['hbase_masters'] %}
server {
listen {{ hostvars[host]['hbase_m_http'] }};
location / {
proxy_pass http://{{ host }}:{{ hbase_master_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# HBASE regionservers
{% for host in groups['hadoop_worker_nodes'] %}
server {
listen {{ hostvars[host]['hbase_http'] }};
location / {
proxy_pass http://{{ host }}:{{ hbase_regionserver_http_port }}/;
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
proxy_redirect off;
proxy_buffering off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
auth_pam "NeMIS Hadoop Cluster Access";
auth_pam_service_name "{{ portal_pam_svc_name }}";
}
}
{% endfor %}
# Logstash
# -- NB: that doesn't work, kibana keeps searching the elasticsearch instance as jobtracker.t.hadoop.
#{% for host in groups['logstash'] %}
#server {
# listen {{ hostvars[host]['log_http'] }};
# location / {
# proxy_pass http://{{ host }}/;
# proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
# proxy_redirect off;
# proxy_buffering off;
# proxy_set_header Host $host;
# proxy_set_header X-Real-IP $remote_addr;
# proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
# auth_pam "NeMIS Hadoop Cluster Logstash data";
# auth_pam_service_name "{{ portal_pam_svc_name }}";
# }
#}
#{% endfor %}

@ -0,0 +1,53 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
# Correct start order (reverse to obtain the stop order):
#
# • Zookeeper
# • HDFS
# • MapReduce
# • HBase
# (• Hive Metastore )
# (• Hue )
# (• Oozie)
# • Ganglia
# • Nagios
#
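# Only zookeeper, HDFS, mapreduce and HBase are driven from this script (see
# SERVICES_START_ORDER below); the other components listed above are not managed here.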
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
SERVICES_START_ORDER="service-global-zookeeper service-global-hdfs service-global-mapred service-global-hbase"
SERVICES_STOP_ORDER="service-global-hbase service-global-mapred service-global-hdfs service-global-zookeeper"
SH_LIB_PATH=/usr/local/lib
if [ -f $SH_LIB_PATH/service-hadoop-common-functions ] ; then
. $SH_LIB_PATH/service-hadoop-common-functions
else
echo "Library file: $SH_LIB_PATH/service-hadoop-common-functions is missing"
exit 1
fi
SERVICES=$SERVICES_START_ORDER
ARG=$1
function action_loop(){
ACTION=$ARG
if [ "$ACTION" == "stop" ] ; then
SERVICES=$SERVICES_STOP_ORDER
fi
for SRV in $SERVICES ; do
$SRV $ACTION
done
}
case "$ARG" in
start|restart|reload|force-reload|status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

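The script above is the cluster-wide wrapper: it sources the common helper library and then walks the service-global-* scripts in dependency order, reversing the list on stop. Its file name is not visible in this diff, so service-global-all below is only a placeholder; a typical session on the provisioning host (the jobtracker) would look like this sketch:

# Placeholder name for the wrapper shown above; adjust to the real script name.
service-global-all start    # zookeeper -> hdfs -> mapred -> hbase
service-global-all status   # each service-global-* script reports in turn
service-global-all stop     # hbase -> mapred -> hdfs -> zookeeper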
@ -0,0 +1,38 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
# Correct start order (reverse to obtain the stop order):
#
# • HBase master
# • HBase regionservers
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
SERVICES_START_ORDER="service-hbase-master service-hbase-regionserver service-hbase-thrift"
SERVICES_STOP_ORDER="service-hbase-thrift service-hbase-regionserver service-hbase-master"
SERVICES=$SERVICES_START_ORDER
ARG=$1
function action_loop(){
ACTION=$ARG
if [ "$ACTION" == "stop" ] ; then
SERVICES=$SERVICES_STOP_ORDER
fi
for SRV in $SERVICES ; do
$SRV $ACTION
done
}
case "$ARG" in
start|restart|reload|force-reload|status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,40 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
# Correct start order (reverse to obtain the stop order):
#
# • HDFS journalnodes
# • HDFS ZKFC failover controllers
# • HDFS namenodes
# • HDFS datanodes
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
SERVICES_START_ORDER="service-hdfs-journalnode service-hdfs-zkfc service-hdfs-namenode service-hdfs-datanode"
SERVICES_STOP_ORDER="service-hdfs-datanode service-hdfs-namenode service-hdfs-journalnode service-hdfs-zkfc"
SERVICES=$SERVICES_START_ORDER
ARG=$1
function action_loop(){
ACTION=$ARG
if [ "$ACTION" == "stop" ] ; then
SERVICES=$SERVICES_STOP_ORDER
fi
for SRV in $SERVICES ; do
$SRV $ACTION
done
}
case "$ARG" in
start|restart|reload|force-reload|status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,38 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
# Correct start order (reverse to obtain the stop order):
#
# • MapReduce jobtracker ZKFC
# • MapReduce jobtracker
# • MapReduce tasktrackers
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
SERVICES_START_ORDER="service-mapreduce-jobtracker-zkfc service-mapreduce-jobtracker service-mapreduce-tasktracker"
SERVICES_STOP_ORDER="service-mapreduce-tasktracker service-mapreduce-jobtracker-zkfc service-mapreduce-jobtracker"
SERVICES=$SERVICES_START_ORDER
ARG=$1
function action_loop(){
ACTION=$ARG
if [ "$ACTION" == "stop" ] ; then
SERVICES=$SERVICES_STOP_ORDER
fi
for SRV in $SERVICES ; do
$SRV $ACTION
done
}
case "$ARG" in
start|restart|reload|force-reload|status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,37 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
# Correct start order (reverse to obtain the stop order):
#
# • Zookeeper server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
SERVICES_START_ORDER="service-zookeeper-server"
SERVICES_STOP_ORDER=$SERVICES_START_ORDER
SERVICES=$SERVICES_START_ORDER
ARG=$1
function action_loop(){
ACTION=$ARG
if [ "$ACTION" == "stop" ] ; then
SERVICES=$SERVICES_STOP_ORDER
fi
for SRV in $SERVICES ; do
$SRV $ACTION
done
}
case "$ARG" in
start|restart|reload|force-reload|status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,32 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hbase-master
REMOTE_CMD=dsh
DSH_GROUPNAME=hbase-master
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

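The per-service scripts fan out with dsh, so they assume the corresponding machine groups already exist on the provisioning host (dsh looks for them under ~/.dsh/group/ or /etc/dsh/group/, depending on the packaging). A minimal sketch of such a group file, one machine per line, with placeholder host names:

# /etc/dsh/group/hbase-master  (path and host names below are placeholders)
hbasemaster1.t.hadoop.research-infrastructures.eu
hbasemaster2.t.hadoop.research-infrastructures.eu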
@ -0,0 +1,38 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hbase-regionserver
REMOTE_CMD=dsh
DSH_GROUPNAME=datanodes
function ntp_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- restart-ntp
}
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
# ntp_loop
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,38 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hbase-rest
REMOTE_CMD=dsh
DSH_GROUPNAME=datanodes
function ntp_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- restart-ntp
}
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
# ntp_loop
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,33 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
export PATH="/sbin:/usr/sbin:$PATH"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hbase-thrift
REMOTE_CMD=dsh
DSH_GROUPNAME=hbase-master
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,38 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hadoop-hdfs-datanode
REMOTE_CMD=dsh
DSH_GROUPNAME=datanodes
function ntp_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- restart-ntp
}
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
# ntp_loop
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,40 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
export PATH="/sbin:/usr/sbin:$PATH"
ARG=$1
NAMENODE={{ secondary_nm_hostname }}
SERVICE_SCRIPT=service
STARTUP_SCRIPT=hadoop-httpfs
REMOTE_CMD=ssh
case "$ARG" in
start)
# $REMOTE_CMD $NAMENODE restart-ntp
echo "Running $STARTUP_SCRIPT $ARG on host $NAMENODE"
$REMOTE_CMD $NAMENODE $SERVICE_SCRIPT $STARTUP_SCRIPT start
;;
restart|reload|force-reload)
# $REMOTE_CMD $NAMENODE restart-ntp
echo "Running $STARTUP_SCRIPT $ARG on host $NAMENODE"
$REMOTE_CMD $NAMENODE $SERVICE_SCRIPT $STARTUP_SCRIPT restart
;;
status)
echo "Running $STARTUP_SCRIPT $ARG on host $NAMENODE"
$REMOTE_CMD $NAMENODE $SERVICE_SCRIPT $STARTUP_SCRIPT status
;;
stop)
echo "Running $STARTUP_SCRIPT $ARG on host $NAMENODE"
$REMOTE_CMD $NAMENODE $SERVICE_SCRIPT $STARTUP_SCRIPT stop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

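Unlike the dsh-based scripts, the httpfs wrapper above drives a single remote host over plain ssh, so it relies on password-less key authentication from the provisioning host to the templated {{ secondary_nm_hostname }}. A quick, hedged way to verify that assumption before wiring the script into the start/stop sequence:

# BatchMode makes ssh fail instead of prompting for a password, which is what the
# non-interactive wrapper above needs. The host name is a placeholder for the
# rendered value of secondary_nm_hostname.
NAMENODE=nn2.t.hadoop.research-infrastructures.eu
ssh -o BatchMode=yes "$NAMENODE" service hadoop-httpfs status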
@ -0,0 +1,39 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
DOMAIN_N="t.hadoop.research-infrastructures.eu"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hadoop-hdfs-journalnode
REMOTE_CMD=dsh
DSH_GROUPNAME=quorum
function ntp_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- restart-ntp
}
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
# ntp_loop
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

@ -0,0 +1,39 @@
#!/bin/bash
#
# We use the jobtracker as provisioning server
#
HOSTNAME=$( hostname -f )
export PATH="/sbin:/usr/sbin:$PATH"
ARG=$1
SERVICE_CMD=/usr/sbin/service
STARTUP_SCRIPT=hadoop-hdfs-namenode
REMOTE_CMD=dsh
DSH_GROUPNAME=namenodes
function ntp_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- restart-ntp
}
function action_loop(){
ACTION=$ARG
dsh -g ${DSH_GROUPNAME} -cM -- ${SERVICE_CMD} ${STARTUP_SCRIPT} ${ACTION}
}
case "$ARG" in
start|restart|reload|force-reload)
# ntp_loop
action_loop
;;
status|stop)
action_loop
;;
*)
echo "Usage: $0 start|stop|restart|status" >&2
exit 3
;;
esac
exit 0

Some files were not shown because too many files have changed in this diff.
