Commit Graph

3027 Commits

Author SHA1 Message Date
dimitrispie 686580a220 - New Monitor DB workflow
- New Organization added
2023-01-12 11:18:03 +02:00
dimitrispie becb242c17 Monitor DB only Workflow 2023-01-04 16:50:29 +02:00
dimitrispie dcb958e146 Changes to execute the stats wf only in hive 2023-01-04 11:39:01 +02:00
dimitrispie 592013d5dd Added more steps in decision node 2022-12-23 09:43:16 +02:00
dimitrispie 2a4bf32d4c Merge branch 'hive' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into hive
# Conflicts:
#	dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step10.sql
#	dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step13.sql
#	dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step14.sql
#	dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step16_1-definitions.sql
#	dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step7.sql
2022-12-22 10:22:46 +02:00
dimitrispie 6449ff4207 1. Added a decision node to enables the workflow to make a selection on the execution path to follow
2. Added new organization
3. Added 5 new tables from Eurostast
2022-12-22 10:18:21 +02:00
Antonis Lempesis c8309fe18e addded command line params to allow hive actions to run 2022-12-21 12:41:33 +02:00
Antonis Lempesis 028873cc51 added new hive opts 2022-12-21 12:41:33 +02:00
Antonis Lempesis 1ddea4f442 removed 'stored as parquet' from views.. 2022-12-21 12:41:33 +02:00
Antonis Lempesis 2754c3dd62 moving data to impala cluster and creating shadow databases there 2022-12-21 12:41:29 +02:00
Antonis Lempesis 778a1a724f finished migration to hive only 2022-12-21 12:41:25 +02:00
Antonis Lempesis e84dd5fe26 first 2022-12-21 12:41:23 +02:00
dimitrispie 2a52a42169 Added 4 institutions:
-University of Modena and Reggio Emilia
-Bilkent University
-Saints Cyril and Methodius University of Skopje
-University of Milan
2022-12-06 10:10:21 +02:00
dimitrispie 992fc5b628 Added McMaster University Institution 2022-11-03 11:02:18 +02:00
dimitrispie 7fda05e380 Added Autonomous University of Barcelona 2022-11-01 13:59:40 +02:00
dimitrispie 7861c472e0 Hive memory parameters 2022-10-28 19:00:32 +03:00
dimitrispie 5df9c63963 Added fields: totalcost, fundedamount, currency, in project table 2022-10-27 16:44:26 +03:00
dimitrispie 2c0c3f1806 Cast amount to float for table result_apcs 2022-09-28 19:33:24 +03:00
dimitrispie bdc46e3eaa Remove denormalization of results to fix downloads numbers in monitor 2022-09-28 14:59:08 +03:00
dimitrispie 2ebb1459a9 Fixed type in no_downloads 2022-09-28 14:36:57 +03:00
dimitrispie dcd85f8cd7 - Synchronize indicators in stats-db with monitor-db
- added new openorg id for Nanyang Technological University
- changed openorg id for University of Helsinki #8088 ticket
2022-09-22 13:33:07 +03:00
dimitrispie 3bf3127251 Changes to monitor and indicator scripts 2022-09-14 16:36:19 +03:00
dimitrispie 71b069ca90 Changes to indicator and monitor scripts 2022-09-09 13:15:58 +03:00
dimitrispie 2b5f8c9c9a comment out duplicate table creation 2022-09-06 12:27:53 +03:00
Antonis Lempesis fcef5294e2 restored some collab indicators 2022-08-05 13:45:01 +03:00
Antonis Lempesis 227e10f4b3 commenting out the collab indicators because they still fail 2022-08-05 12:54:36 +03:00
Antonis Lempesis 8b0407d8ec fixed the datasourceOrganization relations 2022-08-03 12:26:59 +03:00
Antonis Lempesis 1778d40c40 latest version of indicators 2022-08-02 13:39:34 +03:00
Antonis Lempesis 6fc9ef53f6 addded command line params to allow hive actions to run 2022-07-29 16:36:20 +03:00
Antonis Lempesis 9886fe87ec - Added FOS classification
- Added extra orgs in monitor
- Fixed result-project and organization-project tables
2022-07-29 16:34:50 +03:00
Antonis Lempesis ab18c9daa9 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2022-06-09 15:48:21 +03:00
Antonis Lempesis 574492c659 removed double result_apc table creation from monitor 2022-06-09 15:48:13 +03:00
Antonis Lempesis db088cc69c fixed *_organization tables 2022-06-07 04:04:28 +03:00
Antonis Lempesis 3fc9efeab6 fixed typo, addded open citations and apcs in monitor 2022-05-13 14:28:13 +03:00
Antonis Lempesis 23334479bb removed yet another collab, added more orgs in monitor 2022-05-11 13:05:52 +03:00
Antonis Lempesis 61b4c19e65 restored indi_result_org_country_collab, removed indi_result_org_collab 2022-05-06 12:52:10 +03:00
Antonis Lempesis cfbbcaf7c4 commented out indi_result_org_country_collab 2022-05-06 12:49:36 +03:00
Antonis Lempesis 0353f93d54 added new hive opts 2022-04-29 12:49:27 +03:00
Antonis Lempesis b7cd2c6ca1 added open citations 2022-04-20 14:46:55 +03:00
Antonis Lempesis c442c91f89 computing stats in each step 2022-04-06 12:40:02 +03:00
Antonis Lempesis 7112806a73 views cannot be stored as parquet... 2022-03-29 16:37:29 +03:00
Antonis Lempesis fff0b3cc19 added apcs in monitor db 2022-03-29 14:15:31 +03:00
Antonis Lempesis ee24f3eb2c views cannot be stored as parquet... 2022-03-29 13:47:48 +03:00
Antonis Lempesis d8503cd191 added moooar organizations 2022-03-24 14:02:36 +02:00
Antonis Lempesis 62f91b0869 cleanup 2022-03-22 16:17:49 +02:00
Antonis Lempesis 2e8394ecf8 creating aaall tables as parquet 2022-03-22 16:16:08 +02:00
Antonis Lempesis dcfbeb8142 yet more typos 2022-03-21 12:36:03 +02:00
Antonis Lempesis ad78e505da yet another fix 2022-03-03 12:28:12 +02:00
Antonis Lempesis efeeebfee1 fixed query after the change in the indicator table 2022-03-02 13:29:25 +02:00
Antonis Lempesis 3b92a2ab9c added the rest of spring 6 in monitor db 2022-02-23 12:05:57 +02:00
Antonis Lempesis 87c91f70a2 added sprint 6 indicators to monitor db 2022-02-22 14:41:48 +02:00
dimitrispie 58c59f46eb Added Sprint 6 2022-02-17 10:21:09 +02:00
Antonis Lempesis 5772f92dba merged beta chnages in hive branch 2022-02-15 13:24:51 +02:00
Antonis Lempesis 393a4ee956 fixed yet another typo... 2022-02-15 12:56:50 +02:00
Antonis Lempesis 5f762cbd09 fixed yet another typo 2022-02-07 12:09:12 +02:00
Antonis Lempesis ae633c566b fixed the result_result table 2022-02-04 15:04:19 +02:00
Antonis Lempesis c2b44530a3 typo... 2022-02-03 13:44:07 +02:00
Antonis Lempesis dbd2646d59 fixed the result_result creation for monitor 2022-02-03 12:37:10 +02:00
Antonis Lempesis 81ee654271 added result_result relations 2021-12-23 15:46:17 +02:00
Antonis Lempesis 7551e52e95 fixed a typo 2021-12-23 15:33:53 +02:00
Antonis Lempesis 16539d7360 added usage stats 2021-12-22 02:54:42 +02:00
Antonis Lempesis 3edd661608 fixed column names 2021-12-21 22:55:04 +02:00
Antonis Lempesis a4c0cbb98c fixed typos in indicators. Added extra views in monitor 2021-12-21 15:54:38 +02:00
Antonis Lempesis 58996972d9 added first indicator of sprint 5 2021-12-21 03:35:04 +02:00
dimitrispie c1cdec09a9 Sprint 5 and other changes 2021-12-20 19:23:57 +02:00
Antonis Lempesis ddd34087c2 removed 'stored as parquet' from views.. 2021-12-13 23:05:00 +02:00
Antonis Lempesis 915f758c82 moving data to impala cluster and creating shadow databases there 2021-12-13 16:26:14 +02:00
Antonis Lempesis d05210ba99 finished migration to hive only 2021-11-30 19:01:48 +02:00
dimitrispie 09fc2afdca Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
2021-11-26 16:13:10 +02:00
Antonis Lempesis 0b4163ee0b added sprint3,4, removed 2, chaos 2021-11-26 15:58:01 +02:00
Antonis Lempesis 12749a0a77 first 2021-11-26 15:40:40 +02:00
dimitrispie 29f69f2f89 Sprint 4 2021-11-26 15:22:04 +02:00
Antonis Lempesis cb3adb90f4 Merge branch 'beta' into beta 2021-11-17 14:33:45 +01:00
Antonis Lempesis c283406829 added Universidad Polytecnica de Madrid 2021-11-17 15:33:00 +02:00
Claudio Atzori e0395719d7 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2021-11-17 14:17:27 +01:00
Claudio Atzori 82a4e4efae [cleaning wf] fixed methodology to rule out invalid result titles, based on https://support.openaire.eu/issues/7206 2021-11-17 14:17:22 +01:00
Miriam Baglioni 6d4a1c57ee [Resolve Entities] Change test dataset to mirror the modification in the creation of the map between the pids and the unresolved 2021-11-17 12:41:52 +01:00
Claudio Atzori 0a727d325d [dedup] increased number of partitions in the consistency phase 2021-11-16 08:43:41 +01:00
Claudio Atzori bafa2990f3 code formatting 2021-11-15 17:07:16 +01:00
Claudio Atzori 668ac25224 [graph resolution] using existing argument parser file name 2021-11-15 17:02:45 +01:00
Claudio Atzori 7d0a03f607 [graph resolution] minor 2021-11-15 14:45:54 +01:00
Claudio Atzori 941a50a2fc Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2021-11-15 14:42:49 +01:00
Claudio Atzori 7c804acda8 [graph resolution] minor 2021-11-15 14:42:43 +01:00
Sandro La Bruzzo efa09057db Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2021-11-15 14:32:09 +01:00
Sandro La Bruzzo 48923e46a1 added documentation to Pubmed Class and also added mvn site for dhp-aggregations 2021-11-15 14:32:01 +01:00
Claudio Atzori d2c787d416 [graph resolution] fixed sequence of the workflow steps 2021-11-15 14:31:15 +01:00
Claudio Atzori 975b10b711 [actionmanager] increased spark.sql.shuffle.partitions to 5000 2021-11-15 12:31:45 +01:00
Miriam Baglioni 4ec88c718c merge with beta - resolved conflict in pom 2021-11-15 10:52:16 +01:00
Miriam Baglioni 6f1a434e90 [Bypass Action Set] Fixed test to consider the new identifier utils 2021-11-15 09:59:23 +01:00
Miriam Baglioni 157d33ebf9 [Bypass Action Set] Refactoring 2021-11-15 09:58:48 +01:00
Miriam Baglioni 92d0e18b55 [Bypass Action Set] used constant DOI instead of "doi" 2021-11-12 10:56:58 +01:00
Miriam Baglioni 881113743f [Bypass Action Set] refactoring 2021-11-12 10:55:50 +01:00
Miriam Baglioni 47ccb53c4f [Bypass Action Set] modification for comment D-Net/dnet-hadoop#157 (comment) 2021-11-12 10:54:09 +01:00
Miriam Baglioni ffb0ce1d59 merge with beta - resolved conflict in pom 2021-11-12 10:19:59 +01:00
Miriam Baglioni 716021546e [Bypass Action Set] minor fix 2021-11-12 10:18:01 +01:00
Sandro La Bruzzo 3469cc2b1d Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2021-11-12 09:56:52 +01:00
Sandro La Bruzzo a7763d2492 removed alternate identifier in resolutionMap 2021-11-12 09:56:45 +01:00
Miriam Baglioni 935062edec [Bypass Action Set] creation of unresolved entities 2021-11-11 16:11:25 +01:00
Antonis Lempesis 26f086dd64 removed the too restrctive clause. will discuss again 2021-11-11 12:57:19 +02:00
Claudio Atzori 148289150f Merge branch 'beta' into doiboost_url 2021-11-11 10:40:19 +01:00