Compare commits: main ... hadoop-cdh (1 commit)

Author | SHA1 | Date
---|---|---
 | b821ee815b | 6 months ago

@ -0,0 +1,86 @@
2013-11-06 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/hadoop-nrpe.cfg.j2: Add the correct entry for the tasktracker event handler.

2013-11-04 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/iptables-rules.v4.j2: Add access to the hbase master rpc port from the isti network.

2013-10-10 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/hbase-site.j2: Add the property hbase.master.loadbalance.bytable to try a better balance for our workload. Reference here: http://answers.mapr.com/questions/7049/table-only-on-single-region-server

2013-10-09 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/nagios-server/hadoop-cluster/services.cfg.j2: Handler to restart the tasktracker when it fails.
* templates/iptables-rules.*.j2: iptables rules to block access to the services ports from outside CNR.

2013-10-07 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/nagios-server: added checks for the logstash host and services
* logstash.yml: Add logstash with remote syslog to aggregate all the workers logs. Needed for solr.

2013-10-01 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/: management portal that redirects to the services web interfaces.

2013-09-23 Andrea Dell'Amico <adellam@sevenseas.org>

* tasks/jobtracker-ha.yml: HA configuration for the jobtracker. jobtracker.t.hadoop and quorum4.t.hadoop are the two masters.

2013-09-19 Andrea Dell'Amico <adellam@sevenseas.org>

* all.yml: HDFS is now HA. All the datanodes lists are generated from the hosts file and are not static anymore. Changed nagios to reflect the new configuration.

2013-09-17 Andrea Dell'Amico <adellam@sevenseas.org>

* templates: Changed the system-* scripts to manage the second namenode instance. Removed the secondary namenode start/stop script

2013-09-12 Andrea Dell'Amico <adellam@sevenseas.org>

* hadoop-test.yml: New quorum4.t.hadoop node. Zookeeper now has 5 quorum nodes. HBASE master HA. quorum4 is the other instance.

2013-07-29 Andrea Dell'Amico <adellam@sevenseas.org>

* templates/datanode-hdfs-site.j2: Added "support_append" as "true" and max_xcievers as 1024

2013-06-20 Andrea Dell'Amico <adellam@sevenseas.org>

* hadoop-ganglia.yml: The ganglia configuration is now differentiated between datanodes, jobtracker, hbase master, hdfs namenode, hdfs secondary namenode

2013-02-27 Andrea Dell'Amico <adellam@sevenseas.org>

* vars/hadoop-global-vars.yml: mapred_tasktracker_reduce_tasks_maximum: 5, hbase_regionserver_heap_size: 3192

2013-02-22 Andrea Dell'Amico <adellam@isti.cnr.it>

* init.sh: Create hdfs directory /jobtracker to store the jobtracker history
* templates/mapred-site-jobtracker.j2: Activate permanent jobtracker history
* jobtracker.yml: Cleanup

2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>

* vars/hadoop-global-vars.yml: mapred_child_java_opts: "-Xmx3092M", mapred_map_child_java_opts: "-Xmx2048M", mapred_reduce_child_java_opts: "-Xmx1512M", hbase_regionserver_heap_size: 4092

2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>

* vars/hadoop-global-vars.yml: hbase_master_heap_size: 5120, hbase_regionserver_heap_size: 3192

2013-02-18 Andrea Dell'Amico <adellam@isti.cnr.it>

* vars/hadoop-global-vars.yml (hbase_thrift_heap_size): mapred_child_java_opts: "-Xmx1512M", mapred_map_child_java_opts: "-Xmx3092M", mapred_reduce_child_java_opts: "-Xmx2048M", hbase_master_heap_size: 3072

2013-02-16 Andrea Dell'Amico <adellam@isti.cnr.it>

* templates/hbase-thrift-env.sh.j2: Disabled the jmx console for hbase thrift
* templates/hbase-master-env.sh.j2: disabled the master jmx console
* vars/hadoop-global-vars.yml: zookeeper_max_timeout: 240000, fixed the zookeeper quorum host naming

2013-02-16 Andrea Dell'Amico <adellam@isti.cnr.it>

* vars/hadoop-global-vars.yml: mapred_child_java_opts: "-Xmx2G", mapred_reduce_child_java_opts: "-Xmx2512M", hbase_regionserver_heap_size: 5000
@ -1,3 +1,303 @@
# hadoop-ansible

Ansible playbook that installs and configures a Hadoop cluster.

# Hadoop cluster based on the CDH 4 packages.

This is the playbook that I used to install and configure the Hadoop cluster @CNR, based on the deb packages found in the Cloudera repositories.
Cloudera Manager was not used nor installed.

## The cluster.

The cluster structure is the following:

- jobtracker.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - mapreduce HA jobtracker
  - zookeeper quorum
  - HA HDFS journal
- quorum4.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - mapreduce HA jobtracker
  - zookeeper quorum
  - HA HDFS journal
- nn1.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hdfs HA namenode
  - zookeeper quorum
  - HA HDFS journal
- nn2.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hdfs HA namenode
  - zookeeper quorum
  - HA HDFS journal
- hbase-master.t.hadoop.research-infrastructures.eu (3.5GB RAM, 4 CPUs):
  - hbase primary master
  - hbase thrift
  - zookeeper quorum
  - HA HDFS journal
- hbase-master2.t.hadoop.research-infrastructures.eu (2GB RAM, 2 CPUs):
  - HBASE secondary master
  - hbase thrift

- node{2..13}.t.hadoop.research-infrastructures.eu (9GB RAM, 8 CPUs, 1000GB external storage for HDFS each):
  - mapreduce tasktracker
  - hdfs datanode
  - hbase regionserver
  - solr (sharded)

- hive.t.hadoop.research-infrastructures.eu:
  - hue
  - hive
  - oozie
  - sqoop

- db.t.hadoop.research-infrastructures.eu:
  - postgresql instance for hue and hive


The scripts that manage all the services are installed on jobtracker.t.hadoop.research-infrastructures.eu. They can stop/start the single services or the whole cluster, respecting the correct order.

They all share the "service-" prefix, and the script name gives an idea of the operations that will be performed:

service-global-hadoop-cluster
service-global-hbase
service-global-hdfs
service-global-mapred
service-global-zookeeper
service-hbase-master
service-hbase-regionserver
service-hbase-rest
service-hdfs-datanode
service-hdfs-httpfs
service-hdfs-journalnode
service-hdfs-namenode
service-hdfs-secondarynamenode
service-mapreduce-jobtracker
service-mapreduce-tasktracker
service-zookeeper-server

They take "start,stop,status,restart" as parameter.
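For example, a typical session on the jobtracker could look like this (a hypothetical invocation, using one of the scripts and parameters listed above):

service-global-hbase status
service-global-hbase restart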

- jobtracker URL:
http://jobtracker.t.hadoop.research-infrastructures.eu:50030/jobtracker.jsp
- HDFS URL:
http://namenode.t.hadoop.research-infrastructures.eu:50070/dfshealth.jsp
- HBASE master URL:
http://hbase-master.t.hadoop.research-infrastructures.eu:60010/master-status
- HUE Web Interface:
http://quorum2.t.hadoop.research-infrastructures.eu:8888

- Ganglia URL, for the cluster metrics:
http://monitoring.research-infrastructures.eu/ganglia/?r=hour&cs=&ce=&s=by+name&c=Openaire%252B%2520Hadoop%2520TEST&tab=m&vn=
- Nagios URL, for the services status (still to be activated):
http://monitoring.research-infrastructures.eu/nagios3/

------------------------------------------------------------------------------------------------
dom0/nodes/san map data

dlib18x: *node8* e90.6 (dlibsan9)
dlib19x: *node9* e90.7 (dlibsan9)
dlib20x: *node10* e90.8 (dlibsan9)
dlib22x: *node11* e90.5 (dlibsan9)
         *node7* e63.4 (dlibsan6)
dlib23x: *node12* e80.3 (dlibsan8)
         *node13* e80.4 (dlibsan8)
dlib24x: *node2* e25.1 (dlibsan2)
         *node3* e74.1 (dlibsan7)
dlib25x: *node4* e83.4 (dlibsan8)
dlib26x: *node5* e72.1 (dlibsan7)
         *node6* e63.3 (dlibsan6)

------------------------------------------------------------------------------------------------
Submitting a job (supporting multiple users)

To support multiple users, create the UNIX user accounts on the master node only.

On the namenode:

# groupadd supergroup
(to be executed only once)

# adduser claudio
...

# su - hdfs
$ hadoop dfs -mkdir /home/claudio
$ hadoop dfs -chown -R claudio:supergroup /home/claudio

(add claudio to the supergroup group)
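A minimal sketch of that last step, assuming the standard usermod tool is used on the namenode:

# usermod -a -G supergroup claudio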

Important:

If you do not create /tmp properly, with the right permissions as shown below, you may have problems with CDH components later. Specifically, if you don't create /tmp yourself, another process may create it automatically with restrictive permissions that will prevent your other applications from using it.

Create the /tmp directory after HDFS is up and running, and set its permissions to 1777 (drwxrwxrwt), as follows:

$ sudo -u hdfs hadoop fs -mkdir /tmp
$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp

Note:

If Kerberos is enabled, do not use commands in the form sudo -u <user> <command>; they will fail with a security error. Instead, use the following commands: $ kinit <user> (if you are using a password) or $ kinit -kt <keytab> <principal> (if you are using a keytab) and then, for each command executed by this user, $ <command>

Step 8: Create MapReduce /var directories

sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred

Step 9: Verify the HDFS File Structure

$ sudo -u hdfs hadoop fs -ls -R /

You should see:

drwxrwxrwt - hdfs supergroup 0 2012-04-19 15:14 /tmp
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs
drwxr-xr-x - hdfs supergroup 0 2012-04-19 15:16 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:19 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred supergroup 0 2012-04-19 15:29 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred supergroup 0 2012-04-19 15:33 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging

Step 10: Create and Configure the mapred.system.dir Directory in HDFS

After you start HDFS and create /tmp, but before you start the JobTracker (see the next step), you must also create the HDFS directory specified by the mapred.system.dir parameter (by default ${hadoop.tmp.dir}/mapred/system) and configure it to be owned by the mapred user.

To create the directory in its default location:

$ sudo -u hdfs hadoop fs -mkdir /tmp/mapred/system
$ sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system

Important:

If you create the mapred.system.dir directory in a different location, specify that path in the conf/mapred-site.xml file.

When starting up, MapReduce sets the permissions for the mapred.system.dir directory to drwx------, assuming the user mapred owns that directory.

Step 11: Start MapReduce

To start MapReduce, start the TaskTracker and JobTracker services.

On each TaskTracker system:

$ sudo service hadoop-0.20-mapreduce-tasktracker start

On the JobTracker system:

$ sudo service hadoop-0.20-mapreduce-jobtracker start

Step 12: Create a Home Directory for each MapReduce User

Create a home directory for each MapReduce user. It is best to do this on the NameNode; for example:

$ sudo -u hdfs hadoop fs -mkdir /user/<user>
$ sudo -u hdfs hadoop fs -chown <user> /user/<user>

where <user> is the Linux username of each user.

Alternatively, you can log in as each Linux user (or write a script to do so) and create the home directory as follows:

sudo -u hdfs hadoop fs -mkdir /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
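A minimal sketch of such a script, run by an administrator for a list of accounts (claudio is the only username taken from these notes; the others are placeholders):

for u in claudio user2 user3; do
  sudo -u hdfs hadoop fs -mkdir /user/$u
  sudo -u hdfs hadoop fs -chown $u /user/$u
done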

------------------------------------------------------------------------------------------------
We use the jobtracker as provisioning server.
Correct start order (reverse it to obtain the stop order); a sketch using the service-global scripts follows the list:
• HDFS (NB: substitute secondarynamenode with journalnode when we will have HA)
• MapReduce
• Zookeeper
• HBase
• Hive Metastore
• Hue
• Oozie
• Ganglia
• Nagios
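A sketch of the first part of a full cluster start, using the service-global scripts listed earlier in the order given above (the remaining services are started with their own init scripts):

service-global-hdfs start
service-global-mapred start
service-global-zookeeper start
service-global-hbase start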

The init commands can be found in the "init.sh" file in the ansible repository.

Error to investigate:
http://stackoverflow.com/questions/6153560/hbase-client-connectionloss-for-hbase-error

# GC hints
http://stackoverflow.com/questions/9792590/gc-tuning-preventing-a-full-gc?rq=1

HBASE troubleshooting

- If some regions remain in "transition" indefinitely, you can try to fix the problem from the shell:

# su - hbase
$ hbase hbck -fixAssignments

The following can also be useful:
$ hbase hbck -repairHoles

-----------------------------------------------------
When "ROOT stuck in assigning forever" occurs,

you need to:
- check that there are no zookeeper-related errors. If there are, restart zookeeper and then the whole hbase cluster
- restart only the hbase master
-----------------------------------------------------
When there are disabled tables that turn out to be impossible to enable or delete:
# su - hbase
$ hbase hbck -fixAssignments

* Restart the hbase master

-----------------------------------------------------
See: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/32838
And, in general, to understand how it works: http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/

Tool for monitoring hbase when it is configured for manual splitting:
https://github.com/sentric/hannibal

---------------------------------------------------------------------------------

2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@41a7fead
2013-02-22 10:24:46,492 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_node2.t.hadoop.research-infrastructures.eu:localhost/127.0.0.1:47798
2013-02-22 10:24:46,492 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1 and reserved physical memory is not configured. TaskMemoryManager is disabled.
2013-02-22 10:24:46,571 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760

---
Interesting post about the configuration and the various parameters: http://gbif.blogspot.it/2011/01/setting-up-hadoop-cluster-part-1-manual.html

List of deprecated parameter names and their new names: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/DeprecatedProperties.html

---
How to decommission a worker node

1. If there are many of them, reduce the hdfs replication factor
2. Stop the regionserver on the node
3. Add the node to the hdfs and jobtracker exclude lists

./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers

4. Refresh the hdfs and jobtracker configuration

hdfs dfsadmin -refreshNodes
mapred mradmin -refreshNodes

5. Remove the node from the list of allowed ones

5a. Edit the inventory

5b. Run
./run.sh hadoop-common.yml -i inventory/hosts.production --tags=hadoop_workers
./run.sh mapred.yml -i inventory/hosts.production -l jt_masters --tags=hadoop_workers
./run.sh hadoop-hdfs.yml -i inventory/hosts.production -l hdfs_masters --tags=hadoop_workers
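To verify that decommissioning has completed before powering the node off, a quick check (not part of the original notes) is:

sudo -u hdfs hdfs dfsadmin -report | grep -i -A 2 decommission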

---------------------------------------------------------------------------------
Nagios monitoring

- The handlers that restart the services are managed via nrpe. To get them to work, we need to:
  - Add an entry in nrpe.cfg. The command name needs to start with "global_restart_" and the remaining part of the name must coincide with the name of the service.
    For example:
    command[global_restart_hadoop-0.20-mapreduce-tasktracker]=/usr/bin/sudo /usr/sbin/service hadoop-0.20-mapreduce-tasktracker restart
  - Add a handler to the nagios service. The command needs the service name as a parameter.
    Example:
    event_handler restart-service!hadoop-0.20-mapreduce-tasktracker
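To test the wiring end to end, the nrpe command can be invoked manually from the monitoring host (a hypothetical check, assuming the stock check_nrpe plugin path; note that running it actually restarts the tasktracker on that node):

/usr/lib/nagios/plugins/check_nrpe -H node2.t.hadoop.research-infrastructures.eu -c global_restart_hadoop-0.20-mapreduce-tasktracker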

---------------------------------------------------------------------------------
@ -0,0 +1,32 @@

httpfs: We need to install it on only one machine (two for redundancy). Let's use the namenodes.

Move the second jobtracker to a dedicated machine.

hbase thrift: let's have two of them, on the nodes that run the hbase masters.

Impala: needs to be installed on all the datanodes. After that, hue-impala can be installed on the hue server.

NB: /etc/zookeeper/conf/zoo.cfg needs to be distributed to all the datanodes.

Create the new disks: lvcreate -l 238465 -n node11.t.hadoop.research-infrastructures.eu-data-hdfs dlibsan6 /dev/md3
# Move the data:
rsync -qaxvH --delete --numeric-ids /mnt/disk/ dlibsan7:/mnt/disk/

----------
dfs.socket.timeout, for read timeout
dfs.datanode.socket.write.timeout, for write timeout

In fact, the read timeout value is used for various connections in DFSClient; if you only increase dfs.datanode.socket.write.timeout, the timeouts can continue to happen.

I tried to generate 1TB of data with teragen across more than 40 data nodes; increasing the write timeout did not fix the problem. When I increased both values above 600000, it disappeared.
----------


To configure yarn:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.1/bk_installing_manually_book/content/rpm-chap1-11.html
@ -0,0 +1,2 @@

monitorRole readonly
controlRole readwrite
@ -0,0 +1,374 @@
|
||||
---
|
||||
# Generic machines data
|
||||
time_zone: 'Europe/Rome'
|
||||
cpu_cores: 8
|
||||
datanode_ram: 11000
|
||||
nagios_enabled: True
|
||||
ganglia_enabled: False
|
||||
ganglia_gmond_hdfs_datanodes_port: "8640:8660"
|
||||
ganglia_gmond_jobtracker_port: "8640:8660"
|
||||
ganglia_gmond_hbmaster_port: "8640:8660"
|
||||
ganglia_gmond_namenode_port: "8640:8660"
|
||||
configure_munin: True
|
||||
|
||||
# JDK (Oracle)
|
||||
jdk_version:
|
||||
- 7
|
||||
- 8
|
||||
jdk_default: 8
|
||||
java_home: '/usr/lib/jvm/java-{{ jdk_default }}-oracle'
|
||||
jdk_java_home: '{{ java_home }}'
|
||||
|
||||
# PKG state: latest or present. Set to 'latest' when you want to upgrade the installed packages version.
|
||||
hadoop_pkg_state: present
|
||||
#
|
||||
#
|
||||
# Global data
|
||||
#
|
||||
worker_nodes_num: 4
|
||||
worker_node_start: 2
|
||||
worker_node_end: 5
|
||||
worker_node_swappiness: 0
|
||||
|
||||
dns_domain: t.hadoop.research-infrastructures.eu
|
||||
namenode_hostname: 'nn1.{{ dns_domain }}'
|
||||
secondary_nm_hostname: 'nn2.{{ dns_domain }}'
|
||||
quorum_0_node_hostname: 'quorum0.{{ dns_domain }}'
|
||||
quorum_1_node_hostname: 'quorum1.{{ dns_domain }}'
|
||||
quorum_2_node_hostname: 'quorum2.{{ dns_domain }}'
|
||||
quorum_3_node_hostname: 'quorum3.{{ dns_domain }}'
|
||||
quorum_4_node_hostname: 'quorum4.{{ dns_domain }}'
|
||||
hbase_master_1_hostname: 'hbase-master1.{{ dns_domain }}'
|
||||
hbase_master_2_hostname: 'hbase-master2.{{ dns_domain }}'
|
||||
|
||||
ldap:
|
||||
server: ldap://ldap.sub.research-infrastructures.eu
|
||||
search_bind_auth: False
|
||||
username_pattern: "uid=<username>,ou=People,o=Users,ou=Organizations,dc=research-infrastructures,dc=eu"
|
||||
|
||||
hadoop_ldap_uri: ldap://ldap.sub.research-infrastructures.eu
|
||||
hadoop_ldap_base_dn: "dc=research-infrastructures,dc=eu"
|
||||
hadoop_ldap_search_bind_auth: False
|
||||
hadoop_ldap_username_pattern: "uid=<username>,ou=People,o=Users,ou=Organizations,dc=research-infrastructures,dc=eu"
|
||||
|
||||
#
|
||||
# LOGGING
|
||||
#
|
||||
# WARN,INFO,DEBUG,ERROR
|
||||
hadoop_log_level: INFO
|
||||
#
|
||||
# RFA is the rolling file appender
|
||||
hadoop_log_appender: RFA
|
||||
hadoop_log_appender_max_filesize: 256MB
|
||||
# max backup index is ignored if the appender is daily rolling file
|
||||
hadoop_log_appender_max_backupindex: 10
|
||||
#
|
||||
# We can use a logstash collector
|
||||
hadoop_send_to_logstash: False
|
||||
# Ditch the local appender if you want a logstash only solution
|
||||
hadoop_logstash_appender: RFA,LOGSTASH
|
||||
hadoop_logstash_collector_host: 'logstash.{{ dns_domain }}'
|
||||
hadoop_logstash_collector_socketappender_port: 4560
|
||||
hadoop_logstash_collector_socketappender_reconndelay: 10000
|
||||
#
|
||||
# rsyslog
|
||||
rsyslog_install_newer_package: True
|
||||
rsyslog_send_to_elasticsearch: False
|
||||
rsyslog_use_queues: False
|
||||
rsyslog_use_elasticsearch_module: False
|
||||
rsys_elasticsearch_collector_host: '{{ hadoop_logstash_collector_host }}'
|
||||
rsys_elasticsearch_collector_port: 9200
|
||||
|
||||
#
|
||||
# General hadoop
|
||||
#
|
||||
initialize_hadoop_cluster: False
|
||||
|
||||
hadoop_cluster_name: "nmis-hadoop-cluster"
|
||||
hadoop_data_dir: /data
|
||||
hadoop_conf_dir: '/etc/hadoop/conf.{{ hadoop_cluster_name|lower }}'
|
||||
hadoop_mapred_home: /usr/lib/hadoop-0.20-mapreduce
|
||||
|
||||
hadoop_hdfs_data_disk:
|
||||
- { mountpoint: '/data', device: 'xvda3', fstype: 'xfs' }
|
||||
#
|
||||
# Hadoop default heapsize
|
||||
# The default is 1000
|
||||
hadoop_default_heapsize: 1024
|
||||
hadoop_default_java_opts: "-server -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8"
|
||||
hadoop_jmx_enabled: False
|
||||
|
||||
#
|
||||
# HDFS
|
||||
#
|
||||
hdfs_cluster_id: '{{ hadoop_cluster_name }}'
|
||||
hdfs_cluster_nn_id_1: nn1
|
||||
hdfs_cluster_nn_id_2: nn2
|
||||
hdfs_cluster_ids: "{{ hdfs_cluster_nn_id_1 }},{{ hdfs_cluster_nn_id_2 }}"
|
||||
hdfs_namenode_1_hostname: '{{ namenode_hostname }}'
|
||||
hdfs_namenode_2_hostname: '{{ secondary_nm_hostname }}'
|
||||
hdfs_data_dir: '{{ hadoop_data_dir }}/dfs'
|
||||
hdfs_nn_data_dir: nn
|
||||
hdfs_dn_data_dir: dn
|
||||
hdfs_dn_balance_bandwidthPerSec: 2097152
|
||||
hdfs_support_append: "true"
|
||||
hdfs_nn_rpc_port: 8020
|
||||
hdfs_nn_http_port: 50070
|
||||
hdfs_nn_client_port: 57045
|
||||
# handler count. Recommended: ln(number of datanodes) * 20
|
||||
hdfs_nn_handler_count: 50
|
||||
# Recommended: up to 128MB, 134217728 bytes (this is the default; it is a client parameter)
|
||||
hdfs_block_size: 16777216
|
||||
hdfs_repl_max: 256
|
||||
hdfs_replication: 1
|
||||
# Set to 0 to disable the trash use. Note that the client can enable it.
|
||||
hdfs_fs_trash_interval: 10060
|
||||
hdfs_datanode_max_xcievers: 1024
|
||||
hdfs_datanode_http_port: 50075
|
||||
hdfs_datanode_ipc_port: 50020
|
||||
hdfs_datanode_rpc_port: 50010
|
||||
hdfs_dfs_socket_timeout: 600000
|
||||
hdfs_dfs_socket_write_timeout: 600000
|
||||
# See http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_11_6.html
|
||||
hdfs_read_shortcircuit: True
|
||||
hdfs_read_shortcircuit_cache_size: 3000
|
||||
hdfs_read_shortcircuit_cache_expiry: 50000
|
||||
hdfs_read_shortcircuit_cache_dir: '/var/run/hadoop-hdfs'
|
||||
hdfs_journal_id: '{{ hdfs_cluster_id }}'
|
||||
hdfs_journal_port: 8485
|
||||
hdfs_journal_0: '{{ quorum_0_node_hostname }}'
|
||||
hdfs_journal_1: '{{ quorum_1_node_hostname }}'
|
||||
hdfs_journal_2: '{{ quorum_2_node_hostname }}'
|
||||
hdfs_journal_3: '{{ quorum_3_node_hostname }}'
|
||||
hdfs_journal_4: '{{ quorum_4_node_hostname }}'
|
||||
hdfs_journal_data_dir: jn
|
||||
hdfs_journal_http_port: 8480
|
||||
hdfs_zkfc_port: 8019
|
||||
hdfs_webhdfs_enabled: True
|
||||
hdfs_users_supergroup: supergroup
|
||||
# The following is used to retrieve the ssh key needed for the HA failover
|
||||
hdfs_user_home: /usr/lib/hadoop
|
||||
|
||||
httpfs_user: httpfs
|
||||
httpfs_host: 'hue.{{ dns_domain }}'
|
||||
httpfs_host_1: 'nn1.{{ dns_domain }}'
|
||||
httpfs_host_2: 'nn2.{{ dns_domain }}'
|
||||
httpfs_port: 14000
|
||||
httpfs_catalina_work_dir: /usr/lib/hadoop-httpfs/work
|
||||
|
||||
#
|
||||
# Zookeeper
|
||||
zookeeper_conf_dir: '/etc/zookeeper/conf.{{ hadoop_cluster_name|lower }}'
|
||||
zookeeper_log_dir: '/var/log/zookeeper'
|
||||
zookeeper_client_port: 2182
|
||||
zookeeper_quorum_port: 4182
|
||||
zookeeper_leader_port: 3182
|
||||
zookeeper_min_timeout: 30000
|
||||
zookeeper_max_timeout: 240000
|
||||
zookeeper_quorum_0: '{{ quorum_0_node_hostname }}'
|
||||
zookeeper_quorum_1: '{{ quorum_1_node_hostname }}'
|
||||
zookeeper_quorum_2: '{{ quorum_2_node_hostname }}'
|
||||
zookeeper_quorum_3: '{{ quorum_3_node_hostname }}'
|
||||
zookeeper_quorum_4: '{{ quorum_4_node_hostname }}'
|
||||
zookeeper_maxclient_connections: 240
|
||||
zookeeper_nodes: "{{ zookeeper_quorum_0 }},{{ zookeeper_quorum_1 }},{{ zookeeper_quorum_2 }},{{ zookeeper_quorum_3 }},{{ zookeeper_quorum_4 }}"
|
||||
zookeeper_cluster: "{{ zookeeper_quorum_0 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_1 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_2 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_3 }}:{{ zookeeper_client_port }},{{ zookeeper_quorum_4 }}:{{ zookeeper_client_port }}"
|
||||
|
||||
#
|
||||
# Jobtracker
|
||||
#
|
||||
jobtracker_cluster_id: nmis-hadoop-jt
|
||||
jobtracker_node_1_hostname: 'jobtracker.{{ dns_domain }}'
|
||||
jobtracker_node_2_hostname: 'jobtracker2.{{ dns_domain }}'
|
||||
jobtracker_cluster_id_1: jt1
|
||||
jobtracker_cluster_id_2: jt2
|
||||
jobtracker_cluster_id1_rpc_port: 8021
|
||||
jobtracker_cluster_id2_rpc_port: 8022
|
||||
jobtracker_cluster_id1_ha_rpc_port: 8023
|
||||
jobtracker_cluster_id2_ha_rpc_port: 8024
|
||||
jobtracker_cluster_id1_http_port: 50030
|
||||
jobtracker_cluster_id2_http_port: 50031
|
||||
jobtracker_http_port: 9290
|
||||
jobtracker_persistent_jobstatus: 'true'
|
||||
jobtracker_restart_recover: 'false'
|
||||
jobtracker_failover_connect_retries: 3
|
||||
jobtracker_auto_failover_enabled: 'true'
|
||||
jobtracker_zkfc_port: 8018
|
||||
# handler count. Recommended: ln(number of datanodes) * 20
|
||||
jobtracker_handler_count: 50
|
||||
|
||||
|
||||
# We have 12 nodes and 6 CPUs per node
|
||||
# reduce tasks formula: 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum)
|
||||
# Cloudera defaults: 2 mappers, 2 reducers max
|
||||
# ------
|
||||
# tested. too much stress on the hardware
|
||||
#mapred_tasktracker_map_tasks_maximum: 6
|
||||
#mapred_tasktracker_reduce_tasks_maximum: 68
|
||||
#mapred_reduce_child_java_opts: "-Xmx2G"
|
||||
# ------
|
||||
mapred_tasktracker_http_port: 50060
|
||||
mapred_tasktracker_map_tasks_maximum: 2
|
||||
mapred_tasktracker_reduce_tasks_maximum: 4
|
||||
mapred_use_fair_scheduler: True
|
||||
mapred_fair_scheduler_pools:
|
||||
- { name: 'solr', map: '12', reduce: '18' }
|
||||
mapred_fair_scheduler_use_poolnameproperty: True
|
||||
mapred_fair_scheduler_poolnameproperty: user.name
|
||||
mapred_fair_scheduler_undecl_pools: True
|
||||
mapred_fair_scheduler_preemption: False
|
||||
mapred_fair_scheduler_assignmultiple: True
|
||||
mapred_fair_scheduler_allocation_file: '{{ hadoop_conf_dir }}/fair-scheduler.xml'
|
||||
# reducer parallel copies. Recommended: ln(number of datanodes) * 4
|
||||
# with a minimum of 10
|
||||
mapred_reduce_parallel_copies: 10
|
||||
# Recommended: 80
|
||||
mapred_tasktracker_http_threads: 80
|
||||
# Default: 0.05. Recommended: 0.8. Used by the jobtracker
|
||||
mapred_reduce_slowstart_maps: 0.9
|
||||
# Default: 100. We could increase it
|
||||
mapred_tasktracker_io_sort_mb: 256
|
||||
mapred_io_sort_factor: 25
|
||||
mapreduce_job_counters_max: 5000
|
||||
mapred_userlog_retain_hours: 24
|
||||
mapred_jt_completeuserjobs_max: 150
|
||||
mapred_jt_persist_jobstatus_hours: 4320
|
||||
mapred_user_jobconf_limit: 5242880
|
||||
mapred_jt_retirejob_interval: 86400000
|
||||
mapreduce_jt_split_metainfo_maxsize: 10000000
|
||||
mapred_queue_names: default
|
||||
#
|
||||
mapred_staging_root_dir: /user
|
||||
mapred_old_staging_root_dir: /home
|
||||
mapred_local_dir: /data/mapred/local
|
||||
# Java parameters
|
||||
mapred_child_java_opts: "-Xmx3092M"
|
||||
mapred_map_child_java_opts: "-Xmx3092M"
|
||||
#mapred_reduce_child_java_opts: "-Xmx1512M"
|
||||
mapred_reduce_child_java_opts: "-Xmx2048M"
|
||||
|
||||
#
|
||||
# HBASE
|
||||
#
|
||||
# Raw formula to calculate the needed regionserver heap size:
|
||||
# regions.hbase.hregion.max.filesize /
|
||||
# hbase.hregion.memstore.flush.size *
|
||||
# dfs.replication *
|
||||
# hbase.regionserver.global.memstore.lowerLimit
|
||||
# See: http://hadoop-hbase.blogspot.it/2013/01/hbase-region-server-memory-sizing.html
|
||||
#
|
||||
hbase_user: hbase
|
||||
hbase_conf_dir: '/etc/hbase/conf.{{ hadoop_cluster_name|lower }}'
|
||||
# HBASE heap size
|
||||
hbase_master_heap_size: 5120
|
||||
hbase_thrift_heap_size: 1024
|
||||
hbase_regionserver_heap_size: 4500
|
||||
hbase_master_java_opts: '-Xmx{{ hbase_master_heap_size }}M'
|
||||
hbase_regionserver_maxdirectmemory_size: "-XX:MaxDirectMemorySize=2G"
|
||||
hbase_regionserver_java_opts: '-Xmx{{ hbase_regionserver_heap_size }}M'
|
||||
hbase_thrift_java_opts: '-Xmx{{ hbase_thrift_heap_size }}M'
|
||||
hbase_zookeeper_java_opts: -Xmx1G
|
||||
hbase_thrift_port: 9090
|
||||
hbase_thrift_jmx_port: 9591
|
||||
# hbase zookeeper timeout
|
||||
hbase_zookeeper_timeout: '{{ zookeeper_max_timeout }}'
|
||||
# rpc timeout needs to be greater than lease period
|
||||
# See http://hbase.apache.org/book/trouble.client.html
|
||||
hbase_rpc_timeout: 600000
|
||||
hbase_lease_period: 400000
|
||||
hbase_open_files: 65536
|
||||
hbase_master_rpc_port: 60000
|
||||
hbase_master_http_port: 60010
|
||||
hbase_regionserver_http_port: 60030
|
||||
hbase_regionserver_http_1_port: 60020
|
||||
# This is controversial. When set to 'true' hbase balances
# each table without paying attention to the global balancing
|
||||
hbase_loadbalance_bytable: True
|
||||
# Default is 0.2
|
||||
hbase_regions_slop: 0.15
|
||||
# Default is 10. The recommendation is to keep it low when the payload per request grows
|
||||
# We have mixed payloads.
|
||||
hbase_handler_count: 12
|
||||
# Default was 256M. It's 10737418240 (10GB) since 0.94
|
||||
# The recommendation is to have it big to decrease the total number of regions
|
||||
# 1288490188 is circa 1.2GB
|
||||
hbase_hregion_max_file_size: 1288490188
|
||||
hbase_hregion_memstore_mslab_enabled: True
|
||||
# The default is 134217728 (128MB). We set it to 256M
|
||||
hbase_hregion_memstore_flush_size: 268435456
|
||||
# The default is 0.4
|
||||
hbase_regionserver_global_memstore_lowerLimit: 0.35
|
||||
#
|
||||
hbase_regionserver_global_memstore_upperLimit: 0.45
|
||||
hbase_hregion_memstore_block_multiplier: 3
|
||||
# HBASE thrift server
|
||||
hbase_thrift_server_1: '{{ hbase_master_1_hostname }}'
|
||||
hbase_thrift_server_2: '{{ hbase_master_2_hostname }}'
|
||||
|
||||
#
|
||||
# nginx is used as a reverse proxy for all the web interfaces
|
||||
#
|
||||
nginx_use_ldap_pam_auth: True
|
||||
nginx_pam_svc_name: nginx
|
||||
nginx_ldap_uri: '{{ hadoop_ldap_uri }}'
|
||||
nginx_ldap_base_dn: '{{ hadoop_ldap_base_dn }}'
|
||||
|
||||
portal_nginx_conf: management-portal
|
||||
portal_pam_svc_name: '{{ nginx_pam_svc_name }}'
|
||||
portal_title: "NeMIS Hadoop Cluster"
|
||||
portal_web_root: /usr/share/nginx/www
|
||||
|
||||
#
|
||||
# OOZIE and HIVE DB data
|
||||
#
|
||||
oozie_db_type: postgresql
|
||||
oozie_db_name: oozie
|
||||
oozie_db_user: oozie
|
||||
oozie_db_host: db.t.hadoop.research-infrastructures.eu
|
||||
hive_db_type: '{{ oozie_db_type }}'
|
||||
hive_db_name: hive
|
||||
hive_db_user: hive
|
||||
hive_db_host: '{{ oozie_db_host }}'
|
||||
hive_metastore_db_type: '{{ oozie_db_type }}'
|
||||
hive_metastore_db_name: metastore
|
||||
hive_metastore_db_user: metastore
|
||||
hive_metastore_db_host: '{{ oozie_db_host }}'
|
||||
hue_db_type: '{{ oozie_db_type }}'
|
||||
hue_db_name: hue
|
||||
hue_db_user: hue
|
||||
hue_db_host: '{{ oozie_db_host }}'
|
||||
hue_http_port: 8888
|
||||
oozie_ip: 146.48.123.66
|
||||
hive_ip: '{{ oozie_ip }}'
|
||||
hue_ip: '{{ oozie_ip }}'
|
||||
|
||||
# Iptables
|
||||
|
||||
other_networks:
|
||||
# Marek
|
||||
icm_pl: 213.135.59.0/24
|
||||
# eri.katsari
|
||||
icm_pl_1: 195.134.66.216/32
|
||||
# Antonis addresses, need to reach hdfs and zookeeper (ARC). And Glykeria Katsari
|
||||
ilsp_gr: [ '194.177.192.226/32', '194.177.192.223/32', '195.134.66.96/32', '194.177.192.218/32', '194.177.192.231/32', '195.134.66.216/32', '195.134.66.145/32', '194.177.192.118/32', '195.134.66.244' ]
|
||||
# Needed by marek. It's the IIS cluster gateway.
|
||||
iis_pl_1: 213.135.60.74/32
|
||||
# Jochen
|
||||
icm_1: 129.70.43.118/32
|
||||
|
||||
monitoring_group_name: hadoop-cluster
|
||||
|
||||
nagios_local_plugins_dir: /usr/lib/nagios/plugins/hadoop
|
||||
nagios_common_lib: check_library.sh
|
||||
nagios_monitoring_dir: '/etc/nagios3/objects/{{ monitoring_group_name }}'
|
||||
nagios_root_disk: /
|
||||
nagios_check_disk_w: 10%
|
||||
nagios_check_disk_c: 7%
|
||||
nagios_service_contacts:
|
||||
- andrea.dellamico
|
||||
- claudio.atzori
|
||||
nagios_contactgroup: hadoop-managers
|
||||
nagios_monitoring_server_ip: 146.48.123.23
|
||||
|
||||
iptables_default_policy: REJECT
|
||||
|
@ -0,0 +1,27 @@
|
||||
---
|
||||
# Ganglia
|
||||
ganglia_unicast_mode: False
|
||||
|
||||
ganglia_gmond_jobtracker_cluster: "Openaire+ Hadoop Cluster - Jobtrackers"
|
||||
ganglia_gmond_namenode_cluster: "Openaire+ Hadoop Cluster - HDFS namenodes"
|
||||
ganglia_gmond_hbmaster_cluster: "Openaire+ Hadoop Cluster - HBASE masters"
|
||||
ganglia_gmond_workers_cluster: "Openaire+ Hadoop Cluster - Worker nodes"
|
||||
|
||||
ganglia_gmond_cluster: '{{ ganglia_gmond_workers_cluster }}'
|
||||
|
||||
#
|
||||
# To play nice with iptables
|
||||
ganglia_gmond_mcast_addr: 239.2.11.0
|
||||
ganglia_gmond_cluster_port: "8640:8660"
|
||||
|
||||
# jmx ports
|
||||
hadoop_namenode_jmx_port: 10103
|
||||
hadoop_secondary_namenode_jmx_port: 10104
|
||||
hadoop_datanode_jmx_port: 10105
|
||||
hadoop_balancer_jmx_port: 10106
|
||||
hadoop_jobtracker_jmx_port: 10107
|
||||
hbase_master_jmx_port: 10101
|
||||
hbase_regionserver_jmx_port: 10102
|
||||
hbase_thrift_jmx_port: 10109
|
||||
hbase_zookeeper_jmx_port: 10110
|
||||
zookeeper_jmx_port: 10108
|
@ -0,0 +1,32 @@
|
||||
---
|
||||
# jmx ports
|
||||
hadoop_namenode_jmx_port: 10103
|
||||
hadoop_secondary_namenode_jmx_port: 10104
|
||||
hadoop_datanode_jmx_port: 10105
|
||||
hadoop_balancer_jmx_port: 10106
|
||||
hadoop_jobtracker_jmx_port: 10107
|
||||
hbase_master_jmx_port: 10101
|
||||
hbase_regionserver_jmx_port: 10102
|
||||
hbase_thrift_jmx_port: 10109
|
||||
hbase_zookeeper_jmx_port: 10110
|
||||
zookeeper_jmx_port: 10108
|
||||
#
|
||||
# Used by nagios
|
||||
hadoop_plugins_dir: /usr/lib/nagios/plugins/hadoop
|
||||
root_disk: /dev/xvda2
|
||||
data_disk: /dev/xvda3
|
||||
root_disk_warn: 20%
|
||||
disk_warn: '{{ root_disk_warn }}'
|
||||
root_disk_crit: 10%
|
||||
disk_crit: '{{ root_disk_crit }}'
|
||||
data_disk_warn: 7%
|
||||
data_disk_crit: 4%
|
||||
|
||||
hbase_check_user: hbasecheck
|
||||
hbase_check_timeout: 560
|
||||
hdfs_warn: 90
|
||||
hdfs_crit: 95
|
||||
|
||||
nagios_proclist_red: '{{ redprocs }}'
|
||||
nagios_proclist_yellow: '{{ yellowprocs }}'
|
||||
nagios_nrpe_port: 5666
|
@ -0,0 +1,23 @@
|
||||
---
|
||||
#
|
||||
# The OOZIE users are a subset of the hdfs users.
|
||||
#
|
||||
hadoop_users:
|
||||
- { login: 'marek.horst', name: "Marek Horst", ssh_key: '{{ marek_horst }}', shell: '/bin/bash' }
|
||||
- { login: 'claudio.atzori', name: "Claudio Atzori", ssh_key: '{{ claudio_atzori }}', shell: '/bin/bash' }
|
||||
- { login: 'sandro.labruzzo', name: "Sandro Labruzzo", ssh_key: '{{ sandro_labruzzo }}', shell: '/bin/bash' }
|
||||
- { login: 'michele.artini', name: "Michele Artini", ssh_key: '{{ michele_artini }}', shell: '/bin/bash' }
|
||||
- { login: 'alessia.bardi', name: "Alessia Bardi", ssh_key: '{{ alessia_bardi }}', shell: '/bin/bash' }
|
||||
- { login: 'andrea.mannocci', name: "Andrea Mannocci", ssh_key: '{{ andrea_mannocci }}', shell: '/bin/bash' }
|
||||
- { login: 'andrea.dellamico', name: "Andrea Dell'Amico", ssh_key: '{{ andrea_dellamico }}', shell: '/bin/bash' }
|
||||
- { login: 'giorgos.alexiou', name: "Giorgos Alexiou", ssh_key: '{{ giorgos_alexiou }}', shell: '/bin/bash' }
|
||||
- { login: 'antonis.lempesis', name: "Antonis Lempesis", ssh_key: '{{ antonis_lempesis }}', shell: '/bin/bash' }
|
||||
- { login: 'dnet' }
|
||||
- { login: 'claudio' }
|
||||
- { login: 'michele' }
|
||||
- { login: 'sandro' }
|
||||
- { login: 'alessia' }
|
||||
- { login: 'andrea' }
|
||||
- { login: 'adellam' }
|
||||
- { login: 'hbasecheck' }
|
||||
|
@ -0,0 +1,6 @@
|
||||
$ANSIBLE_VAULT;1.1;AES256
|
||||
63613435386665626236306331353063626137386531346461646463623436376232303461653436
|
||||
3934313830326366373364396630356630623935633230360a646439346530363762363966643534
|
||||
30373331666537666266353666333632616465666331383231356661633838633432656536653233
|
||||
3738636134393763650a623637326339653932323563346336366433333732373733656532353137
|
||||
36306364343430303535373961646632656535666162363862613036356461343865
|
@ -0,0 +1,10 @@
|
||||
$ANSIBLE_VAULT;1.1;AES256
|
||||
39646636653439616665643935326563653435646462306639646266376232633436393834643933
|
||||
3364336430396530646637383438663037366362663135320a373065343862653035653838323739
|
||||
61646135626431643330363963666433303737663464396663353632646339653562666162393034
|
||||
3363383435346364310a356439323431343336366635306461613462663436326431383266366231
|
||||
39636262313038366135316331343939373064356336356239653631633435613736306131656363
|
||||
37613864353931396435353431633765623330663266646666643632626666643436623939303538
|
||||
34343461383338663466303131663336326230666532326335373862636437343739336136616435
|
||||
35653763353436383537633932316434303539373237336161303165353962356336666161323765
|
||||
6336
|
@ -0,0 +1,18 @@
|
||||
---
|
||||
psql_version: 9.1
|
||||
psql_db_host: localhost
|
||||
psql_db_data:
|
||||
- { name: '{{ oozie_db_name }}', encoding: 'UTF8', user: '{{ oozie_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ oozie_ip }}/32' ] }
|
||||
- { name: '{{ hue_db_name }}', encoding: 'UTF8', user: '{{ hue_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ hue_ip }}/32' ] }
|
||||
- { name: '{{ hive_metastore_db_name }}', encoding: 'UTF8', user: '{{ hive_metastore_db_user }}', roles: 'CREATEDB,NOSUPERUSER', pwd: '{{ psql_db_pwd }}', allowed_hosts: [ '{{ hive_ip }}/32' ] }
|
||||
psql_listen_on_ext_int: True
|
||||
|
||||
pg_backup_pgdump_bin: /usr/lib/postgresql/9.1/bin/pg_dump
|
||||
pg_backup_retain_copies: 10
|
||||
pg_backup_build_db_list: "no"
|
||||
pg_backup_db_list: "'{{ oozie_db_name }}' '{{ hue_db_name }}' '{{ hive_metastore_db_name }}'"
|
||||
pg_backup_destdir: /data/pgsql/backups
|
||||
pg_backup_logfile: '{{ pg_backup_logdir }}/postgresql-backup.log'
|
||||
pg_backup_use_nagios: "yes"
|
||||
|
||||
user_ssh_key: [ '{{ claudio_atzori }}' ]
|
@ -0,0 +1,2 @@
|
||||
---
|
||||
user_ssh_key: [ '{{ claudio_atzori }}', '{{ hadoop_test_cluster }}', '{{ sandro_labruzzo }}' ]
|
@ -0,0 +1,18 @@
|
||||
---
|
||||
#
|
||||
# The hadoop logs are now sent to logstash directly by log4j
|
||||
# - adellam 2015-02-04
|
||||
#
|
||||
# the log_state_file names must be unique when using the old rsyslog syntax. In the new one
|
||||
# they are not used
|
||||
# rsys_logfiles:
|
||||
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-jobtrackerha-{{ ansible_hostname }}.log', log_tag: 'hadoop-jobtracker', log_state_file: 'hadoop-jobtracker'}
|
||||
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-mrzkfc-{{ ansible_hostname }}.log', log_tag: 'hadoop-jt-mrzkfc', log_state_file: 'hadoop-jt-mrzkfc'}
|
||||
# - { logfile: '/var/log/hadoop-0.20-mapreduce/mapred-audit.log', log_tag: 'hadoop-mapred-audit', log_state_file: 'hadoop-mapred-audit'}
|
||||
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-namenode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-namenode', log_state_file: 'hadoop-hdfs-namenode'}
|
||||
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-zkfc-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-zkfc', log_state_file: 'hadoop-hdfs-zkfc'}
|
||||
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-journalnode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-journal', log_state_file: 'hadoop-hdfs-journal'}
|
||||
# - { logfile: '/var/log/hbase/hbase.log', log_tag: 'hbase-master-log', log_state_file: 'hbase-master-log'}
|
||||
# - { logfile: '/var/log/hbase/hbase-hbase-master-{{ ansible_hostname }}.log', log_tag: 'hbase-master-ha', log_state_file: 'hbase-master-ha'}
|
||||
# - { logfile: '/var/log/hbase/hbase-hbase-thrift-{{ ansible_hostname }}.log', log_tag: 'hbase-thrift', log_state_file: 'hbase-thrift'}
|
||||
# - { logfile: '{{ zookeeper_log_dir }}/zookeeper.log', log_tag: 'hadoop-zookeeper', log_state_file: 'hadoop-zookeeper'}
|
@ -0,0 +1,6 @@
|
||||
---
|
||||
# Ganglia gmond port
|
||||
ganglia_gmond_cluster: '{{ ganglia_gmond_workers_cluster }}'
|
||||
ganglia_gmond_cluster_port: '{{ ganglia_gmond_hdfs_datanodes_port }}'
|
||||
ganglia_gmond_mcast_address: '{{ ganglia_gmond_workers_mcast_addr }}'
|
||||
|
@ -0,0 +1,10 @@
|
||||
---
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '{{ hdfs_datanode_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}' ] }
|
||||
- { port: '{{ hdfs_datanode_ipc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}' ] }
|
||||
- { port: '{{ hdfs_datanode_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ mapred_tasktracker_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hbase_regionserver_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hbase_regionserver_http_1_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
@ -0,0 +1,10 @@
|
||||
---
|
||||
#
|
||||
# The hadoop logs are now sent to logstash directly by log4j
|
||||
# - adellam 2015-02-04
|
||||
#
|
||||
# IMPORTANT: the log_state_file names must be unique
|
||||
# rsys_logfiles:
|
||||
# - { logfile: '/var/log/hadoop-0.20-mapreduce/hadoop-{{ hadoop_cluster_name }}-tasktracker-{{ ansible_hostname }}.log', log_tag: 'hadoop-tasktracker', log_state_file: 'hadoop-tasktracker'}
|
||||
# - { logfile: '/var/log/hadoop-hdfs/hadoop-{{ hadoop_cluster_name }}-datanode-{{ ansible_hostname }}.log', log_tag: 'hadoop-hdfs-datanode', log_state_file: 'hadoop-hdfs-datanode'}
|
||||
# - { logfile: '/var/log/hbase/hbase-hbase-regionserver-{{ ansible_hostname }}.log', log_tag: 'hbase-regionserver', log_state_file: 'hbase-regionserver'}
|
@ -0,0 +1,6 @@
|
||||
---
|
||||
# Ganglia gmond port
|
||||
ganglia_gmond_cluster: '{{ ganglia_gmond_hbmaster_cluster }}'
|
||||
ganglia_gmond_cluster_port: '{{ ganglia_gmond_hbmaster_port }}'
|
||||
ganglia_gmond_mcast_address: '{{ ganglia_gmond_hbmaster_mcast_addr }}'
|
||||
|
@ -0,0 +1,12 @@
|
||||
---
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '{{ hbase_master_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ hbase_master_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ hbase_thrift_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
@ -0,0 +1,6 @@
|
||||
---
|
||||
# Ganglia gmond port
|
||||
ganglia_gmond_cluster: '{{ ganglia_gmond_namenode_cluster }}'
|
||||
ganglia_gmond_cluster_port: '{{ ganglia_gmond_namenode_port }}'
|
||||
ganglia_gmond_mcast_address: '{{ ganglia_gmond_namenode_mcast_addr }}'
|
||||
|
@ -0,0 +1,13 @@
|
||||
---
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '{{ hdfs_nn_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ hdfs_nn_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ hdfs_nn_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_zkfc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
@ -0,0 +1,6 @@
|
||||
---
|
||||
# Ganglia gmond port
|
||||
ganglia_gmond_cluster: '{{ ganglia_gmond_jobtracker_cluster }}'
|
||||
ganglia_gmond_cluster_port: '{{ ganglia_gmond_jobtracker_port }}'
|
||||
ganglia_gmond_mcast_address: '{{ ganglia_gmond_jobtracker_mcast_addr }}'
|
||||
|
@ -0,0 +1,22 @@
|
||||
---
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '80:95' }
|
||||
- { port: '8100:8150' }
|
||||
- { port: '8200:8250' }
|
||||
- { port: '8300:8350' }
|
||||
- { port: '8400:8450' }
|
||||
- { port: '{{ jobtracker_cluster_id1_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ jobtracker_cluster_id2_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ jobtracker_cluster_id1_ha_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ jobtracker_cluster_id2_ha_rpc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ jobtracker_cluster_id1_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ jobtracker_cluster_id2_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.icm_1 }}' ] }
|
||||
- { port: '{{ jobtracker_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ jobtracker_zkfc_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ hdfs_journal_http_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_leader_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_quorum_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '{{ zookeeper_client_port }}', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.ilsp_gr }}', '{{ other_networks.iis_pl_1 }}', '{{ other_networks.icm_1 }}' ] }
|
@ -0,0 +1,52 @@
|
||||
---
|
||||
user_ssh_key: [ '{{ claudio_atzori }}' ]
|
||||
|
||||
|
||||
# logstash
|
||||
logstash_collector_host: logstash.t.hadoop.research-infrastructures.eu
|
||||
logstash_collector_listen_port: 5544
|
||||
logstash_version: 1.3.3
|
||||
logstash_file: 'logstash-{{ logstash_version }}-flatjar.jar'
|
||||
logstash_url: 'download.elasticsearch.org/logstash/logstash/{{ logstash_file }}'
|
||||
logstash_install_dir: /opt/logstash
|
||||
logstash_conf_dir: '{{ logstash_install_dir }}/etc'
|
||||
logstash_lib_dir: '{{ logstash_install_dir }}/share'
|
||||
logstash_log_dir: /var/log/logstash
|
||||
logstash_user: logstash
|
||||
logstash_indexer_jvm_opts: "-Xms2048m -Xmx2048m"
|
||||
|
||||
kibana_nginx_conf: kibana
|
||||
kinaba_nginx_root: /var/www/kibana/src
|
||||
kibana_virtual_host: logs.t.hadoop.research-infrastructures.eu
|
||||
|
||||
elasticsearch_user: elasticsearch
|
||||
elasticsearch_group: elasticsearch
|
||||
elasticsearch_version: 1.0.0
|
||||
elasticsearch_http_port: 9200
|
||||
elasticsearch_transport_tcp_port: 9300
|
||||
elasticsearch_download_path: download.elasticsearch.org/elasticsearch/elasticsearch
|
||||
elasticsearch_cluster: hadoop-logstash
|
||||
elasticsearch_node_name: logstash
|
||||
elasticsearch_node_master: "true"
|
||||
elasticsearch_node_data: "true"
|
||||
elasticsearch_max_local_storage_nodes: 1
|
||||
elasticsearch_log_dir: /var/log/elasticsearch
|
||||
elasticsearch_heap_size: 5
|
||||
elasticsearch_host: localhost
|
||||
elasticsearch_curator_close_after: 10
|
||||
elasticsearch_curator_retain_days: 20
|
||||
elasticsearch_curator_optimize_days: 10
|
||||
elasticsearch_curator_bloom_days: 7
|
||||
elasticsearch_curator_timeout: 1200
|
||||
elasticsearch_curator_manage_marvel: True
|
||||
elasticsearch_disable_dynamic_scripts: True
|
||||
|
||||
# We use the nginx defaults here
|
||||
nginx_use_ldap_pam_auth: True
|
||||
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '{{ logstash.collector_listen_port }}', allowed_hosts: [ '{{ network.nmis }}' ] }
|
||||
- { port: '{{ elasticsearch.http_port }}', allowed_hosts: [ '{{ ansible_fqdn }}', '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
||||
- { port: '80', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}' ] }
|
@ -0,0 +1,43 @@
|
||||
---
|
||||
user_ssh_key: [ '{{ claudio_atzori }}', '{{ michele_artini }}' ]
|
||||
|
||||
iptables:
|
||||
tcp_rules: True
|
||||
tcp:
|
||||
- { port: '11000', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
|
||||
- { port: '10000', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
|
||||
- { port: '9083', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}' ] }
|
||||
- { port: '8888', allowed_hosts: [ '{{ network.isti }}', '{{ network.nmis }}', '{{ network.eduroam }}', '{{ other_networks.icm_pl }}', '{{ other_networks.icm_pl_1 }}', '0.0.0.0/0' ] }
|
||||
|
||||
oozie:
|
||||
host: 'oozie.{{ dns_domain }}'
|
||||
conf_dir: /etc/oozie/conf
|
||||
user: oozie
|
||||
catalina_work_dir: /usr/lib/oozie/oozie-server-0.20/work
|
||||
http_port: 11000
|
||||
|
||||
#
|
||||
# HIVE
|
||||
#
|
||||
hive:
|
||||
host: 'hive.{{ dns_domain }}'
|
||||
conf_dir: /etc/hive/conf
|
||||
user: hive
|
||||
metastore_port: 9083
|
||||
server2_http_port: 10000
|
||||
setugi: True
|
||||
|
||||
#
|
||||
# HUE
|
||||
#
|
||||
hue:
|
||||
user: hue
|
||||
group: hue
|
||||
host: 'hue.{{ dns_domain }}'
|
||||
http_port: 8888
|
||||
conf_dir: /etc/hue
|
||||
hive_interface: hiveserver2
|
||||
exec_path: /usr/share/hue/build/env/bin/hue
|
||||
encoding: 'utf-8'
|
||||
setuid_path: /usr/share/hue/apps/shell/src/shell/build/setuid
|
||||
|
@ -0,0 +1,8 @@
|
||||
$ANSIBLE_VAULT;1.1;AES256
|
||||
35656164616131366466393935373064383333633237616435353030613234323463393363643961
|
||||
6366343466396563666662396332666661636462313861630a376235623035633530656238623464
|
||||
37636231343837363431396564363632343466306166343365356137646266656637313534353834
|
||||
3561323334346135300a643731653463353564356332376162613864336539376530333534363032
|
||||
36643532626433393939353030653762643636353331326565666164343761393533623461383165
|
||||
33313736346537373364646332653538343034376639626335393065346637623664303264343237
|
||||
326630336139303531346238383733633335
|
@ -0,0 +1,17 @@
|
||||
---
|
||||
- hosts: hadoop_worker_nodes:hadoop_masters
|
||||
remote_user: root
|
||||
max_fail_percentage: 10
|
||||
serial: "25%"
|
||||
|
||||
# vars_files:
|
||||
# - ../library/vars/isti-global.yml
|
||||
roles:
|
||||
- common
|
||||
- cdh_common
|
||||
- chkconfig
|
||||
- hadoop_common
|
||||
- hadoop_config
|
||||
- hadoop_zookeeper
|
||||
- hadoop_zookeeper_config
|
||||
|
@ -0,0 +1,10 @@
[zookeeper_cluster]
quorum0.t.hadoop.research-infrastructures.eu zoo_id=0
quorum1.t.hadoop.research-infrastructures.eu zoo_id=1
quorum2.t.hadoop.research-infrastructures.eu zoo_id=2
quorum3.t.hadoop.research-infrastructures.eu zoo_id=3
quorum4.t.hadoop.research-infrastructures.eu zoo_id=4

[monitoring]
monitoring.research-infrastructures.eu

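The zoo_id host variables assigned above are presumably expanded into the ZooKeeper server list by a template elsewhere in the repository. A minimal, hypothetical sketch of such a task (zoo.cfg.j2 and its destination path are illustrative assumptions; 2888/3888 are the standard ZooKeeper peer and leader-election ports):

- name: Render zoo.cfg from the inventory-assigned ids (illustrative only)
  template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg
# where zoo.cfg.j2 could contain:
#   {% for host in groups['zookeeper_cluster'] %}
#   server.{{ hostvars[host]['zoo_id'] }}={{ host }}:2888:3888
#   {% endfor %}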
@ -0,0 +1,14 @@
|
||||
---
|
||||
- hosts: monitoring
|
||||
user: root
|
||||
vars_files:
|
||||
- ../library/vars/isti-global.yml
|
||||
roles:
|
||||
- nagios-server
|
||||
|
||||
- hosts: hadoop_cluster:other_services:db
|
||||
user: root
|
||||
vars_files:
|
||||
- ../library/vars/isti-global.yml
|
||||
roles:
|
||||
- nagios-monitoring
|
@ -0,0 +1,4 @@
---
dependencies:
  - { role: ../library/roles/openjdk }
  - role: '../../library/roles/ssh-keys'
@ -0,0 +1,14 @@
---
- name: Install the common CDH hadoop packages
  apt: pkg={{ item }} state={{ hadoop_pkg_state }}
  with_items:
    - hadoop
    - hadoop-0.20-mapreduce
    - hadoop-client
    - hadoop-hdfs
    - hadoop-mapreduce
  tags:
    - hadoop
    - mapred
    - hdfs

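The package state is driven by hadoop_pkg_state, which is expected to be defined in the group or role variables; an illustrative default (not shown in this changeset) would be:

hadoop_pkg_state: present   # or 'latest' when an explicit upgrade run is wanted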
@ -0,0 +1,23 @@
---
- name: Install the D-NET repository key
  action: apt_key url=http://ppa.research-infrastructures.eu/dnet/keys/dnet-archive.asc
  tags:
    - hadoop
    - cdh

- name: Install the CDH repository key
  action: apt_key url=http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key
  tags:
    - hadoop
    - cdh

- apt_repository: repo='{{ item }}' update_cache=yes
  with_items:
    - deb http://ppa.research-infrastructures.eu/dnet lucid main
    - deb http://ppa.research-infrastructures.eu/dnet unstable main
    - deb [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4 contrib
    - deb [arch=amd64] http://archive.cloudera.com/gplextras/ubuntu/precise/amd64/gplextras precise-gplextras4 contrib
  register: update_apt_cache
  tags:
    - hadoop
    - cdh
@ -0,0 +1,4 @@
---
- import_tasks: cdh-setup.yml
- import_tasks: cdh-pkgs.yml
# See meta/main.yml for the involved library playbooks
@ -0,0 +1,4 @@
---
dependencies:
  - role: '../../library/roles/ubuntu-deb-general'
  - { role: '../../library/roles/iptables', when: iptables is defined }
@ -0,0 +1,2 @@
---
# See meta/main.yml for the involved library playbooks
@ -0,0 +1,3 @@
---
dependencies:
  - role: '../../library/roles/ganglia'
@ -0,0 +1,31 @@
---
# See meta/main.yml for the basic installation and configuration steps
# The hadoop conf directory always exists
- name: Distribute the ganglia hadoop metrics properties
  template: src={{ item }}.j2 dest={{ hadoop_conf_dir }}/{{ item }} owner=root group=root mode=444
  with_items:
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
  tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]

- name: Check if the hbase conf directory exists
  stat: path={{ hbase_conf_dir }}
  register: check_hbase_confdir
  tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]

- name: Distribute the ganglia hbase metrics properties
  template: src={{ item }}.properties.j2 dest={{ hbase_conf_dir }}/{{ item }}-hbase.properties owner=root group=root mode=444
  with_items:
    - hadoop-metrics
    - hadoop-metrics2
  when: check_hbase_confdir.stat.exists
  tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]

- name: Symlink the ganglia hbase metrics properties to the old file names
  file: src={{ hbase_conf_dir }}/{{ item }}-hbase.properties dest={{ hbase_conf_dir }}/{{ item }}.properties state=link force=yes
  with_items:
    - hadoop-metrics
    - hadoop-metrics2
  when: check_hbase_confdir.stat.exists
  tags: [ 'monitoring', 'ganglia', 'ganglia_conf' ]

@ -0,0 +1,96 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log

# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649


# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log

# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649


# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log

# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log

# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649


# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log

# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649

# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers={{ hdfs_namenode_1_hostname }}:{{ ganglia_gmond_namenode_port }}
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# hbase.period=10
# hbase.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_cluster_port }}
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
# fairscheduler.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# fairscheduler.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}

@ -0,0 +1,34 @@
# Ganglia 3.1+ support
*.period=60

*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
# default for supportsparse is false
*.sink.ganglia.supportsparse=true
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40


namenode.sink.ganglia.servers={{ hdfs_namenode_1_hostname }}:{{ ganglia_gmond_namenode_port }},{{ hdfs_namenode_2_hostname }}:{{ ganglia_gmond_namenode_port }}
datanode.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
jobtracker.sink.ganglia.servers={{ jobtracker_node_1_hostname }}:{{ ganglia_gmond_jobtracker_port }},{{ jobtracker_node_2_hostname }}:{{ ganglia_gmond_jobtracker_port }}
#tasktracker.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_tasktracker_port }}
#maptask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_maptask_port }}
#reducetask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_reducetask_port }}
tasktracker.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
maptask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
reducetask.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
hbase.extendedperiod = 3600
hbase.sink.ganglia.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
hbase.servers=node2.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node5.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }},node11.{{ dns_domain }}:{{ ganglia_gmond_hdfs_datanodes_port }}
#hbase.sink.ganglia.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_hbmaster_port }},{{ hbase_master_2_hostname }}:{{ ganglia_gmond_hbmaster_port }}
#hbase.servers={{ hbase_master_1_hostname }}:{{ ganglia_gmond_hbmaster_port }},{{ hbase_master_2_hostname }}:{{ ganglia_gmond_hbmaster_port }}

#resourcemanager.sink.ganglia.servers=
#nodemanager.sink.ganglia.servers=
#historyserver.sink.ganglia.servers=
#journalnode.sink.ganglia.servers=
#nimbus.sink.ganglia.servers=
#supervisor.sink.ganglia.servers=
#resourcemanager.sink.ganglia.tagsForPrefix.yarn=Queue

@ -0,0 +1,43 @@
---
- name: Directory for hdfs root under /data
  file: dest={{ hdfs_data_dir }} state=directory
  tags:
    - hadoop
    - mapred
    - hdfs

# TODO: split and move to more specific roles.
- name: Directories for the hdfs services
  file: dest={{ hdfs_data_dir }}/{{ item }} state=directory owner=hdfs group=hdfs mode=700
  with_items:
    - '{{ hdfs_dn_data_dir }}'
    - '{{ hdfs_journal_data_dir }}'
  tags:
    - hadoop
    - mapred
    - hdfs

- name: Base directory for mapred under /data
  file: dest=/data/mapred state=directory
  tags:
    - hadoop
    - mapred
    - hdfs

- name: Directories for mapred under /data/mapred
  file: dest=/data/mapred/{{ item }} state=directory owner=mapred group=hadoop mode=700
  with_items:
    - jt
    - local
  tags:
    - hadoop
    - mapred
    - hdfs

- name: JMX secrets directory
  file: dest=/etc/hadoop-jmx/conf state=directory owner=hdfs group=root mode=0750
  when: hadoop_jmx_enabled
  tags:
    - hadoop
    - jmx

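The JMX secrets directory is only created when hadoop_jmx_enabled is truthy; the flag is expected to come from the shared variables, e.g. (illustrative default only):

hadoop_jmx_enabled: False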
@ -0,0 +1,8 @@
node13.t.hadoop.research-infrastructures.eu
node12.t.hadoop.research-infrastructures.eu
node11.t.hadoop.research-infrastructures.eu
node10.t.hadoop.research-infrastructures.eu
node9.t.hadoop.research-infrastructures.eu
node8.t.hadoop.research-infrastructures.eu
node7.t.hadoop.research-infrastructures.eu
node6.t.hadoop.research-infrastructures.eu
@ -0,0 +1,39 @@
---
- name: Restart HDFS namenode
  service: name=hadoop-hdfs-namenode state=restarted sleep=20
  ignore_errors: true
