Commit Graph

342 Commits

Author SHA1 Message Date
dimitrispie dcd85f8cd7 - Synchronize indicators in stats-db with monitor-db
- added new openorg id for Nanyang Technological University
- changed openorg id for University of Helsinki #8088 ticket
2022-09-22 13:33:07 +03:00
Claudio Atzori 465e941214 Merge pull request '[stats wf] Changes to indicators tables' (#244) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #244
2022-09-16 10:13:58 +02:00
dimitrispie 3bf3127251 Changes to monitor and indicator scripts 2022-09-14 16:36:19 +03:00
dimitrispie 71b069ca90 Changes to indicator and monitor scripts 2022-09-09 13:15:58 +03:00
dimitrispie 2b5f8c9c9a comment out duplicate table creation 2022-09-06 12:27:53 +03:00
Claudio Atzori 84598c7535 Merge pull request 'restored some collab indicators' (#240) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #240
2022-08-05 15:50:39 +02:00
Antonis Lempesis fcef5294e2 restored some collab indicators 2022-08-05 13:45:01 +03:00
Claudio Atzori c1f2ffc53d Merge pull request 'commenting out the collab indicators because they still fail' (#237) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #237
2022-08-05 11:57:36 +02:00
Antonis Lempesis 227e10f4b3 commenting out the collab indicators because they still fail 2022-08-05 12:54:36 +03:00
Claudio Atzori efd96e7e66 Merge pull request 'fixed the datasourceOrganization relations' (#233) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #233
2022-08-03 12:25:05 +02:00
Antonis Lempesis 8b0407d8ec fixed the datasourceOrganization relations 2022-08-03 12:26:59 +03:00
Claudio Atzori 27681cf6bf Merge pull request '[stats wf] latest version of indicators + added FOS classification' (#232) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #232
2022-08-02 12:57:15 +02:00
Antonis Lempesis 1778d40c40 latest version of indicators 2022-08-02 13:39:34 +03:00
Antonis Lempesis 6fc9ef53f6 addded command line params to allow hive actions to run 2022-07-29 16:36:20 +03:00
Antonis Lempesis 9886fe87ec - Added FOS classification
- Added extra orgs in monitor
- Fixed result-project and organization-project tables
2022-07-29 16:34:50 +03:00
Miriam Baglioni b229c6e7af Merge pull request 'beta' (#218) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #218
2022-06-10 11:03:48 +02:00
Antonis Lempesis ab18c9daa9 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2022-06-09 15:48:21 +03:00
Antonis Lempesis 574492c659 removed double result_apc table creation from monitor 2022-06-09 15:48:13 +03:00
Antonis Lempesis db088cc69c fixed *_organization tables 2022-06-07 04:04:28 +03:00
Claudio Atzori 5c2949a864 Merge pull request '[stats wf] added open citations & more orgs in monitor, removed collab indicator' (#213) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #213
2022-05-20 11:38:43 +02:00
Antonis Lempesis 3fc9efeab6 fixed typo, addded open citations and apcs in monitor 2022-05-13 14:28:13 +03:00
Antonis Lempesis 23334479bb removed yet another collab, added more orgs in monitor 2022-05-11 13:05:52 +03:00
Antonis Lempesis 61b4c19e65 restored indi_result_org_country_collab, removed indi_result_org_collab 2022-05-06 12:52:10 +03:00
Antonis Lempesis cfbbcaf7c4 commented out indi_result_org_country_collab 2022-05-06 12:49:36 +03:00
Antonis Lempesis 0353f93d54 added new hive opts 2022-04-29 12:49:27 +03:00
Antonis Lempesis b7cd2c6ca1 added open citations 2022-04-20 14:46:55 +03:00
Claudio Atzori 4eff7856f5 Merge pull request '[stats-wf] computing stats in each step' (#210) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #210
2022-04-08 14:21:01 +02:00
Claudio Atzori c26222623f [maven-release-plugin] prepare for next development iteration 2022-04-07 13:32:22 +02:00
Claudio Atzori 86585a6b27 [maven-release-plugin] prepare release dhp-1.2.4 2022-04-07 13:32:19 +02:00
Claudio Atzori ad85d88eaf [maven-release-plugin] rollback the release of dhp-1.2.4 2022-04-07 13:28:35 +02:00
Claudio Atzori 598e11dfd7 [maven-release-plugin] prepare for next development iteration 2022-04-07 13:27:02 +02:00
Claudio Atzori db3d9877a5 [maven-release-plugin] prepare release dhp-1.2.4 2022-04-07 13:26:58 +02:00
Claudio Atzori 3bba6d6e38 [maven-release-plugin] rollback the release of dhp-1.2.4 2022-04-07 12:23:17 +02:00
Claudio Atzori 2ac2d928bd [maven-release-plugin] prepare for next development iteration 2022-04-07 12:18:47 +02:00
Claudio Atzori 85bc722ff4 [maven-release-plugin] prepare release dhp-1.2.4 2022-04-07 12:18:43 +02:00
Claudio Atzori bc05b6168a [maven-release-plugin] rollback the release of dhp-1.2.4 2022-04-07 11:49:06 +02:00
Claudio Atzori 505420fd61 [maven-release-plugin] prepare for next development iteration 2022-04-07 11:34:06 +02:00
Claudio Atzori 66e718981e [maven-release-plugin] prepare release dhp-1.2.4 2022-04-07 11:34:02 +02:00
Antonis Lempesis c442c91f89 computing stats in each step 2022-04-06 12:40:02 +03:00
Antonis Lempesis 7112806a73 views cannot be stored as parquet... 2022-03-29 16:37:29 +03:00
Antonis Lempesis fff0b3cc19 added apcs in monitor db 2022-03-29 14:15:31 +03:00
Antonis Lempesis ee24f3eb2c views cannot be stored as parquet... 2022-03-29 13:47:48 +03:00
Antonis Lempesis d8503cd191 added moooar organizations 2022-03-24 14:02:36 +02:00
Antonis Lempesis 62f91b0869 cleanup 2022-03-22 16:17:49 +02:00
Antonis Lempesis 2e8394ecf8 creating aaall tables as parquet 2022-03-22 16:16:08 +02:00
Antonis Lempesis dcfbeb8142 yet more typos 2022-03-21 12:36:03 +02:00
Antonis Lempesis ad78e505da yet another fix 2022-03-03 12:28:12 +02:00
Antonis Lempesis efeeebfee1 fixed query after the change in the indicator table 2022-03-02 13:29:25 +02:00
Antonis Lempesis 3b92a2ab9c added the rest of spring 6 in monitor db 2022-02-23 12:05:57 +02:00
Antonis Lempesis 87c91f70a2 added sprint 6 indicators to monitor db 2022-02-22 14:41:48 +02:00
dimitrispie 58c59f46eb Added Sprint 6 2022-02-17 10:21:09 +02:00
Antonis Lempesis 5772f92dba merged beta chnages in hive branch 2022-02-15 13:24:51 +02:00
Antonis Lempesis 393a4ee956 fixed yet another typo... 2022-02-15 12:56:50 +02:00
Antonis Lempesis 5f762cbd09 fixed yet another typo 2022-02-07 12:09:12 +02:00
Antonis Lempesis ae633c566b fixed the result_result table 2022-02-04 15:04:19 +02:00
Antonis Lempesis c2b44530a3 typo... 2022-02-03 13:44:07 +02:00
Antonis Lempesis dbd2646d59 fixed the result_result creation for monitor 2022-02-03 12:37:10 +02:00
Antonis Lempesis 81ee654271 added result_result relations 2021-12-23 15:46:17 +02:00
Antonis Lempesis 7551e52e95 fixed a typo 2021-12-23 15:33:53 +02:00
Antonis Lempesis 16539d7360 added usage stats 2021-12-22 02:54:42 +02:00
Antonis Lempesis 3edd661608 fixed column names 2021-12-21 22:55:04 +02:00
Antonis Lempesis a4c0cbb98c fixed typos in indicators. Added extra views in monitor 2021-12-21 15:54:38 +02:00
Antonis Lempesis 58996972d9 added first indicator of sprint 5 2021-12-21 03:35:04 +02:00
dimitrispie c1cdec09a9 Sprint 5 and other changes 2021-12-20 19:23:57 +02:00
Antonis Lempesis ddd34087c2 removed 'stored as parquet' from views.. 2021-12-13 23:05:00 +02:00
Antonis Lempesis 915f758c82 moving data to impala cluster and creating shadow databases there 2021-12-13 16:26:14 +02:00
Antonis Lempesis d05210ba99 finished migration to hive only 2021-11-30 19:01:48 +02:00
dimitrispie 09fc2afdca Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
2021-11-26 16:13:10 +02:00
Antonis Lempesis 0b4163ee0b added sprint3,4, removed 2, chaos 2021-11-26 15:58:01 +02:00
Antonis Lempesis 12749a0a77 first 2021-11-26 15:40:40 +02:00
dimitrispie 29f69f2f89 Sprint 4 2021-11-26 15:22:04 +02:00
Antonis Lempesis cb3adb90f4 Merge branch 'beta' into beta 2021-11-17 14:33:45 +01:00
Antonis Lempesis c283406829 added Universidad Polytecnica de Madrid 2021-11-17 15:33:00 +02:00
Antonis Lempesis 26f086dd64 removed the too restrctive clause. will discuss again 2021-11-11 12:57:19 +02:00
Antonis Lempesis 91354c6068 - fetching all context related results
- storing tables as parquet
2021-11-08 15:15:46 +02:00
Claudio Atzori 7fa49f6956 Merge pull request 'removed hardcoded reference' (#154) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #154
2021-11-02 09:11:30 +01:00
Antonis Lempesis f78afb5ef9 removed hardcoded reference 2021-11-01 15:42:29 +02:00
Claudio Atzori 4f8970f8ed [stats] reducing the step22 wait time 2021-10-20 14:14:53 +02:00
Antonis Lempesis 241dcf6df1 Merge branch 'beta' into beta 2021-10-19 23:54:21 +02:00
Antonis Lempesis 41ecb1eb61 invalidating medatadata before context thingies 2021-10-15 13:42:55 +03:00
Antonis Lempesis 4b7c8dff2d fetching affiliated results for 4 orgs in monitor. fixed affiliated orgs in stats db 2021-10-14 18:53:35 +03:00
Claudio Atzori b292e4a700 [stats wf] added extra logging in the context data retrieval phase 2021-10-13 17:31:53 +02:00
dimitrispie 3f25d2efb2 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2021-10-01 16:03:48 +03:00
dimitrispie 13687fd887 Sprint 3 indicators update 2021-10-01 16:02:02 +03:00
Antonis Lempesis a1e1cf32d7 fixed an impala error 2021-09-24 12:57:24 +03:00
Antonis Lempesis f358cabb2b fixed typo 2021-09-22 21:50:37 +03:00
Antonis Lempesis 421d55265d created hive action for observatory queries 2021-09-21 03:07:58 +03:00
Antonis Lempesis 8b681dcf1b attempt to make the observatory wf run in hive 2021-09-18 00:35:14 +03:00
Antonis Lempesis 2943287d10 fixed the definition of cc_licence, part II 2021-09-16 15:59:06 +03:00
Antonis Lempesis dd2329849f fixed the definition of cc_licence 2021-09-16 13:50:34 +03:00
Antonis Lempesis de9bf3a161 added cc_licences and abstracts in observatory db 2021-09-14 01:29:08 +03:00
Antonis Lempesis 9b1936701c fixed yet another typo 2021-09-13 21:07:44 +03:00
Antonis Lempesis 8fc89ae822 moved context table creation before indicators 2021-09-13 14:33:23 +03:00
Antonis Lempesis 461bf90ca6 fixed the gold_oa definition 2021-09-13 11:10:30 +03:00
Antonis Lempesis 43852bac0e creating other::other concept for all contexts 2021-09-13 01:36:41 +03:00
Antonis Lempesis f13cca7e83 moved dependencies of indicators before them... 2021-09-08 23:07:58 +03:00
Antonis Lempesis c6ada217a1 fixed typo 2021-09-08 22:34:59 +03:00
Antonis Lempesis 1250ae197f using new indicators for the definition of peerreviewed, gold, and green 2021-09-08 14:08:43 +03:00
Antonis Lempesis ccee451dde added indicators of sprint 2 in monitor db 2021-09-07 23:17:13 +03:00
Antonis Lempesis 117c3d5c67 fixed a typo 2021-08-02 12:15:58 +03:00
Antonis Lempesis 26af0320d0 added the sprint 2 indicators in monitor db 2021-07-30 00:31:33 +03:00
Antonis Lempesis 4afa5215a9 fixed a NPE? 2021-07-28 21:59:12 +03:00
Antonis Lempesis 3d1580fa9b fixed a typo 2021-07-28 18:50:31 +03:00
Antonis Lempesis 9b181ffa73 added the h2020 classification scheme for projects 2021-07-28 16:31:29 +03:00
Antonis Lempesis 4a9741825d added result_orcid, result_project provenance, issn in datasources 2021-07-28 12:28:04 +03:00
Antonis Lempesis 1a28a69cac changed the citeee in *_citations to cites 2021-07-27 15:14:09 +03:00
Antonis Lempesis ed185fd7ed added missing colons 2021-07-27 11:42:47 +03:00
Antonis Lempesis f3b9570354 properly invalidating metadata 2021-07-26 13:00:16 +03:00
Antonis Lempesis f9fbb0f261 added indicators second sprint 2021-07-24 16:40:28 +03:00
Antonis Lempesis 89e6f46682 using organization ids instead of names in monitor db creation 2021-07-05 12:00:00 +03:00
Antonis Lempesis 87f14a3899 added the missing indicators files 2021-06-29 16:31:51 +03:00
Antonis Lempesis 018c4eb52c copied latest changes from old fork: indicators+monitor institutions 2021-06-28 23:46:52 +03:00
Antonis Lempesis f7c0b80e35 storing result_instance as parquet 2021-06-15 14:45:48 +03:00
Antonis Lempesis d413b24611 added instances, orgs for monitor, totalcost for projects, apcs 2021-06-10 02:35:46 +03:00
Antonis Lempesis 168edcbde3 added the final steps for the observatory promote wf and some cleanup 2021-05-18 15:23:20 +03:00
Antonis Lempesis 625d993cd9 added step for observatory db 2021-04-20 02:31:06 +03:00
Antonis Lempesis 25d0512fbd code cleanup 2021-04-20 01:43:23 +03:00
Antonis Lempesis 03d36fadea properly invalidating impala metadata 2021-04-15 13:34:22 +03:00
Antonis Lempesis 236435b470 following redirects 2021-03-12 14:11:21 +02:00
Antonis Lempesis 3c75a05044 fixed a ton of typos 2021-03-12 13:47:04 +02:00
Antonis Lempesis fa1ec5b5e9 fixed typo... 2021-03-10 14:05:58 +02:00
Antonis Lempesis f40c150a0d fixed steps... 2021-03-06 00:35:57 +02:00
Antonis Lempesis 6147ee4950 assigning correctly hive contexts to concepts 2021-03-05 14:12:18 +02:00
Antonis Lempesis c5fbad8093 Contexts are now downloaded instead of using the stats_ext db 2021-03-04 00:42:21 +02:00
Antonis Lempesis 27796343ca crude sleep. hardcoded value 2021-03-03 01:37:47 +02:00
Antonis Lempesis d90767c733 correctly invalidating metadata 2021-02-19 03:18:47 +02:00
Antonis Lempesis 3681afbe04 typo 2021-02-19 03:04:27 +02:00
Antonis Lempesis c5502eba8f actually moved stats computation in impala instead of hive... 2021-02-19 02:54:39 +02:00
Antonis Lempesis 33c85d4e66 moved stats computation in impala instead of hive 2021-02-18 17:23:34 +02:00
Antonis Lempesis b8e96c8ae7 moved cache update to the end 2021-02-18 16:42:22 +02:00
Antonis Lempesis bcbfc052b1 fixed last errors in step 21 2021-02-18 16:32:54 +02:00
Antonis Lempesis 10a29a4b9a fixes in monitor step 2021-02-18 15:05:59 +02:00
Antonis Lempesis 8ef66452d5 fixed typo 2021-02-17 22:24:44 +02:00
Antonis Lempesis a8836e2f5f fixed typo 2021-02-17 19:27:07 +02:00
Antonis Lempesis a445c1ac3d fixed variable names in monitor script 2021-02-17 16:45:09 +02:00
Antonis Lempesis 00d516360f added missing ; 2021-02-17 16:41:10 +02:00
Antonis Lempesis cd1b794409 added the monitor db wf 2021-02-17 02:11:55 +02:00
Antonis Lempesis 1c029b9fc0 fixed formatting 2021-02-14 03:14:24 +02:00
Antonis Lempesis 2c4dcc90ba analyzing tables to produce stats 2021-02-14 02:54:55 +02:00
Antonis Lempesis be5969a8c2 Changed typo in script names 2020-12-22 13:33:32 +02:00
Antonis Lempesis 2a074c3b2b Changed typo in script names 2020-12-18 18:40:48 +02:00
Antonis Lempesis 7cb113e088 added the new parameter (stats_tool_api_url) in the workflow parameters 2020-12-04 13:04:25 +02:00
Antonis Lempesis d23ccae0d5 ignoring deletedbyinference relations 2020-12-04 12:42:17 +02:00
Antonis Lempesis 413afcfed5 finished first implementation of wf 2020-12-02 15:57:17 +02:00
Antonis Lempesis 815d6b25d9 added last step to update cache 2020-11-30 00:48:10 +02:00
Antonis Lempesis 01a6e03989 starting from first step... 2020-11-17 23:26:47 +02:00
Antonis Lempesis 99ebaee347 fixed #5913 2020-11-11 16:56:46 +02:00
Antonis Lempesis f14e65f6a3 reverted wrong change 2020-11-10 17:23:04 +02:00
Antonis Lempesis c02c7741c9 fixes in db creation 2020-11-10 17:11:30 +02:00
Antonis Lempesis e603fa5847 fixes in db creation 2020-11-10 17:11:12 +02:00
Claudio Atzori ee832f358e Merge pull request 'stats_wf_extensions_and_corrections' (#28) from spyros/dnet-hadoop:stats_wf_extensions_and_corrections into master
Thank you Guys! The update workflow will be made available to the beta & production orchestration systems under the HDFS path

```/lib/dnet/oa/graph/stats/oozie_app```
2020-07-27 16:02:03 +02:00
Antonis Lempesis 4ac8ebe427 correctly calculating the project duration 2020-07-24 19:50:40 +03:00
Antonis Lempesis 18d9464b52 creating shadow db only if it not exists... 2020-07-24 19:50:40 +03:00
Antonis Lempesis e217d496ab added the dest db... 2020-07-24 19:50:40 +03:00
Antonis Lempesis b16bb68b9f added the target db name... 2020-07-24 19:50:40 +03:00
Antonis Lempesis 1ee7eeedf3 added the source db name... 2020-07-24 19:50:40 +03:00
Antonis Lempesis cecbbfa0fc added missing tables and views: contexts, creation_date, funder 2020-07-24 19:50:40 +03:00
Antonis Lempesis 25b7a615f5 moved datasource_sources table creating in the datasource section 2020-07-24 19:50:40 +03:00
Antonis Lempesis a8da4ab9c0 years in projects are now integers 2020-07-24 19:50:40 +03:00
Antonis Lempesis c9cfc165d9 not using impala since the resulting tables are not visible 2020-07-24 19:50:40 +03:00
Antonis Lempesis dd3d6a6e15 compute stats for the used and new impala tables 2020-07-24 19:50:40 +03:00
Antonis Lempesis e6f50de6ef Separated impala from hive steps 2020-07-24 19:50:40 +03:00
Antonis Lempesis de49173420 fixed a typo in queries 2020-07-24 19:50:40 +03:00
antleb 391cf80fb8 Added peer-reviewed, green, gold tables and fields in result. Added shortcuts from result-country 2020-07-24 19:50:40 +03:00
antleb 68389d0125 Corrected the script used by the last step of the wf 2020-07-24 19:50:40 +03:00
antleb ec52141f1a changed refereed type from value to clssname 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 63cd797aba Comment out step 15 to make it work with the new schema of Claudio 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 138c6ddffa Insert statement to datasource table that takes into account the piwik_id of the openAIRE graph 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 3630794cef Fix to consider the relationships that have been 'virtually deleted' for project_results - defect #5607 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 5546f29e63 Corrections on the shadow schema and the impala table stats calculation 2020-07-24 19:50:40 +03:00
Spyros Zoupanos adf8a025d2 Adding more relations (Sources, Licences, Additional) and shadow schema as provided and discussed with Antonis Lempesis 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 657a40536b Corrections by Spyros: Scipt cleanup, corrections and re-arrangement 2020-07-24 19:50:40 +03:00
Giorgos Alexiou 477fa6234d Script re-organisation and adding table invalidations needed for impala 2020-07-24 19:50:40 +03:00
Claudio Atzori 9cd27183b6 [maven-release-plugin] prepare for next development iteration 2020-06-22 11:27:44 +02:00
Claudio Atzori 1e3dab0631 [maven-release-plugin] prepare release dhp-1.2.3 2020-06-22 11:27:39 +02:00
Claudio Atzori c4d9f1837f [maven-release-plugin] prepare for next development iteration 2020-06-12 12:21:08 +02:00
Claudio Atzori f0746a7605 [maven-release-plugin] prepare release dhp-1.2.2 2020-06-12 12:21:03 +02:00
Spyros Zoupanos 3576dd186b Adding hive timeout as workflow parameter 2020-06-05 22:29:54 +03:00
Claudio Atzori 7582532e73 [maven-release-plugin] prepare for next development iteration 2020-05-25 19:48:18 +02:00
Claudio Atzori 01c2e93395 [maven-release-plugin] prepare release dhp-1.2.1 2020-05-25 19:48:14 +02:00
Claudio Atzori 60c40618d3 [maven-release-plugin] prepare for next development iteration 2020-05-11 10:17:14 +02:00
Claudio Atzori c267d958d5 [maven-release-plugin] prepare release dhp-1.2.0 2020-05-11 10:17:10 +02:00
Claudio Atzori 42f1a2bf94 bumped project version to 1.2.0-SNAPSHOT 2020-05-11 10:05:57 +02:00
Spyros Zoupanos ae0f535c73 Fixing hardcoded reference to main openAIRE graph db 2020-05-09 22:34:48 +03:00
Claudio Atzori 0ccc864ad9 [maven-release-plugin] prepare for next development iteration 2020-05-08 17:01:31 +02:00
Claudio Atzori 6e47c724c6 [maven-release-plugin] prepare release dhp-1.1.7 2020-05-08 17:01:27 +02:00
Claudio Atzori 077ccd8743 stats wf properties cleanup 2020-05-04 11:41:46 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Claudio Atzori 8fd81e863d added default value for the external_stats_db_name 2020-04-29 15:36:24 +02:00
Claudio Atzori c6f3ff4462 stats workflow content relocated into common package; added <global> property definitions in stats workflow.xml 2020-04-29 14:29:27 +02:00
Michele Artini c43b4c8962 formatting 2020-04-29 12:56:58 +02:00
Spyros Zoupanos 1ab97bbe00 Adding the full stats workflow to the dnet-hadoop hierarchy 2020-04-01 22:22:05 +03:00