1
0
Fork 0
Commit Graph

11 Commits

Author SHA1 Message Date
Claudio Atzori 013935c593 Merge pull request 'Improvements to copying data from ocean to impala' (#420) from antonis.lempesis/dnet-hadoop:beta into master
Reviewed-on: D-Net/dnet-hadoop#420
2024-04-16 14:17:47 +02:00
Lampros Smyrnaios d7da4f814b Minor updates to the copying operation to Impala Cluster:
- Improve logging.
- Code optimization/polishing.
2024-04-12 18:12:06 +03:00
Lampros Smyrnaios 14719dcd62 Miscellaneous updates to the copying operation to Impala Cluster:
- Update the algorithm for creating views that depend on other views.
- Add check for successful execution of the "hadoop distcp" command.
- Add a check for successful copy operation of all entities.
- Upon facing an error in a DB, exit the method, instead of the whole script.
- Improve logging.
- Code polishing.
2024-04-12 15:36:13 +03:00
Lampros Smyrnaios abf0b69f29 Upgrade the copying operation to Impala Cluster:
- Use only hive commands in the Ocean Cluster, as the "impala-shell" will be removed from there to free-up resources.
- Hugely improve the performance in every aspect of the copying process: a) speedup file-transferring and DB-deletion, b) eliminate permissions-assignment, "load" operations and "use $db" queries, c) retry only the "create view" statements and only as long as they depend on other non-created views, instead of trying to recreate all tables and views 5 consecutive times.
- Add error-checks for the creation of tables and views.
2024-04-11 17:12:12 +03:00
Claudio Atzori 5add51f38c Merge pull request 'fixed the result_country definition and updated the stats DB copy procedure' (#412) from antonis.lempesis/dnet-hadoop:beta into master
Reviewed-on: D-Net/dnet-hadoop#412
2024-04-03 12:34:17 +02:00
Lampros Smyrnaios b7c8acc563 - Update the code which acquires the "IMPALA_HDFS_NODE", to test the "tmp"-dir, instead of the base-dir and introduce retries, to overcome potential file-system failures. This change was suggested by "Sebastian Tymkow" and "Grzegorz Bakalarski".
- Fix typos.
2024-04-03 13:15:37 +03:00
Claudio Atzori d16c15da8d adjusted pom files 2024-03-26 14:00:44 +01:00
Lampros Smyrnaios bc8c97182d Automatically select the ACTIVE HDFS NODE for Impala cluster, in all "copyDataToImpalaCluster.sh" scripts. 2024-03-26 13:01:12 +02:00
Antonis Lempesis 3c79720342 fixed the irish result subset 2024-03-07 14:08:57 +02:00
dimitrispie 6b823100ae Update buildIrishMonitorDB.sql
New indicators added
2024-01-07 22:54:39 +02:00
dimitrispie ffdd03d2f4 Monitor Irish Stats WF
Parameters (with examples):
stats_db_name=openaire_beta_stats_20231208
monitor_irish_db_name=openaire_beta_stats_monitor_ie_20231208b
monitor_irish_db_prod_name=openaire_beta_stats_monitor_ie
graph_db_name=openaire_beta_20231208
monitor_irish_db_shadow_name=openaire_beta_stats_monitor_ie_shadow
hive_timeout=150000
hadoop_user_name=dnet.beta
resumeFrom=Step1-buildIrishMonitorDB
2023-12-22 11:05:24 +02:00