Commit Graph

277 Commits

Author SHA1 Message Date
Antonis Lempesis 915f758c82 moving data to impala cluster and creating shadow databases there 2021-12-13 16:26:14 +02:00
Antonis Lempesis d05210ba99 finished migration to hive only 2021-11-30 19:01:48 +02:00
dimitrispie 09fc2afdca Added indi_funder_country_collab; kept only indi_pub_has_cc_licence 2021-11-26 16:13:10 +02:00
Antonis Lempesis 0b4163ee0b added sprint3,4, removed 2, chaos 2021-11-26 15:58:01 +02:00
Antonis Lempesis 12749a0a77 first 2021-11-26 15:40:40 +02:00
dimitrispie 29f69f2f89 Sprint 4 2021-11-26 15:22:04 +02:00
Antonis Lempesis cb3adb90f4 Merge branch 'beta' into beta 2021-11-17 14:33:45 +01:00
Antonis Lempesis c283406829 added Universidad Polytecnica de Madrid 2021-11-17 15:33:00 +02:00
Antonis Lempesis 26f086dd64 removed the too restrictive clause. will discuss again 2021-11-11 12:57:19 +02:00
Antonis Lempesis 91354c6068 fetching all context related results; storing tables as parquet 2021-11-08 15:15:46 +02:00
Claudio Atzori 7fa49f6956 Merge pull request 'removed hardcoded reference' (#154) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#154) 2021-11-02 09:11:30 +01:00
Antonis Lempesis f78afb5ef9 removed hardcoded reference 2021-11-01 15:42:29 +02:00
Claudio Atzori 4f8970f8ed [stats] reducing the step22 wait time 2021-10-20 14:14:53 +02:00
Antonis Lempesis 241dcf6df1 Merge branch 'beta' into beta 2021-10-19 23:54:21 +02:00
Antonis Lempesis 41ecb1eb61 invalidating metadata before context thingies 2021-10-15 13:42:55 +03:00
Antonis Lempesis 4b7c8dff2d fetching affiliated results for 4 orgs in monitor. fixed affiliated orgs in stats db 2021-10-14 18:53:35 +03:00
Claudio Atzori b292e4a700 [stats wf] added extra logging in the context data retrieval phase 2021-10-13 17:31:53 +02:00
dimitrispie 3f25d2efb2 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2021-10-01 16:03:48 +03:00
dimitrispie 13687fd887 Sprint 3 indicators update 2021-10-01 16:02:02 +03:00
Antonis Lempesis a1e1cf32d7 fixed an impala error 2021-09-24 12:57:24 +03:00
Antonis Lempesis f358cabb2b fixed typo 2021-09-22 21:50:37 +03:00
Antonis Lempesis 421d55265d created hive action for observatory queries 2021-09-21 03:07:58 +03:00
Antonis Lempesis 8b681dcf1b attempt to make the observatory wf run in hive 2021-09-18 00:35:14 +03:00
Antonis Lempesis 2943287d10 fixed the definition of cc_licence, part II 2021-09-16 15:59:06 +03:00
Antonis Lempesis dd2329849f fixed the definition of cc_licence 2021-09-16 13:50:34 +03:00
Antonis Lempesis de9bf3a161 added cc_licences and abstracts in observatory db 2021-09-14 01:29:08 +03:00
Antonis Lempesis 9b1936701c fixed yet another typo 2021-09-13 21:07:44 +03:00
Antonis Lempesis 8fc89ae822 moved context table creation before indicators 2021-09-13 14:33:23 +03:00
Antonis Lempesis 461bf90ca6 fixed the gold_oa definition 2021-09-13 11:10:30 +03:00
Antonis Lempesis 43852bac0e creating other::other concept for all contexts 2021-09-13 01:36:41 +03:00
Antonis Lempesis f13cca7e83 moved dependencies of indicators before them... 2021-09-08 23:07:58 +03:00
Antonis Lempesis c6ada217a1 fixed typo 2021-09-08 22:34:59 +03:00
Antonis Lempesis 1250ae197f using new indicators for the definition of peerreviewed, gold, and green 2021-09-08 14:08:43 +03:00
Antonis Lempesis ccee451dde added indicators of sprint 2 in monitor db 2021-09-07 23:17:13 +03:00
Antonis Lempesis 117c3d5c67 fixed a typo 2021-08-02 12:15:58 +03:00
Antonis Lempesis 26af0320d0 added the sprint 2 indicators in monitor db 2021-07-30 00:31:33 +03:00
Antonis Lempesis 4afa5215a9 fixed a NPE? 2021-07-28 21:59:12 +03:00
Antonis Lempesis 3d1580fa9b fixed a typo 2021-07-28 18:50:31 +03:00
Antonis Lempesis 9b181ffa73 added the h2020 classification scheme for projects 2021-07-28 16:31:29 +03:00
Antonis Lempesis 4a9741825d added result_orcid, result_project provenance, issn in datasources 2021-07-28 12:28:04 +03:00
Antonis Lempesis 1a28a69cac changed the citeee in *_citations to cites 2021-07-27 15:14:09 +03:00
Antonis Lempesis ed185fd7ed added missing colons 2021-07-27 11:42:47 +03:00
Antonis Lempesis f3b9570354 properly invalidating metadata 2021-07-26 13:00:16 +03:00
Antonis Lempesis f9fbb0f261 added indicators second sprint 2021-07-24 16:40:28 +03:00
Antonis Lempesis 89e6f46682 using organization ids instead of names in monitor db creation 2021-07-05 12:00:00 +03:00
Antonis Lempesis 87f14a3899 added the missing indicators files 2021-06-29 16:31:51 +03:00
Antonis Lempesis 018c4eb52c copied latest changes from old fork: indicators+monitor institutions 2021-06-28 23:46:52 +03:00
Antonis Lempesis f7c0b80e35 storing result_instance as parquet 2021-06-15 14:45:48 +03:00
Antonis Lempesis d413b24611 added instances, orgs for monitor, totalcost for projects, apcs 2021-06-10 02:35:46 +03:00
Antonis Lempesis 168edcbde3 added the final steps for the observatory promote wf and some cleanup 2021-05-18 15:23:20 +03:00
Antonis Lempesis 625d993cd9 added step for observatory db 2021-04-20 02:31:06 +03:00
Antonis Lempesis 25d0512fbd code cleanup 2021-04-20 01:43:23 +03:00
Antonis Lempesis 03d36fadea properly invalidating impala metadata 2021-04-15 13:34:22 +03:00
Antonis Lempesis 236435b470 following redirects 2021-03-12 14:11:21 +02:00
Antonis Lempesis 3c75a05044 fixed a ton of typos 2021-03-12 13:47:04 +02:00
Antonis Lempesis fa1ec5b5e9 fixed typo... 2021-03-10 14:05:58 +02:00
Antonis Lempesis f40c150a0d fixed steps... 2021-03-06 00:35:57 +02:00
Antonis Lempesis 6147ee4950 assigning correctly hive contexts to concepts 2021-03-05 14:12:18 +02:00
Antonis Lempesis c5fbad8093 Contexts are now downloaded instead of using the stats_ext db 2021-03-04 00:42:21 +02:00
Antonis Lempesis 27796343ca crude sleep. hardcoded value 2021-03-03 01:37:47 +02:00
Antonis Lempesis d90767c733 correctly invalidating metadata 2021-02-19 03:18:47 +02:00
Antonis Lempesis 3681afbe04 typo 2021-02-19 03:04:27 +02:00
Antonis Lempesis c5502eba8f actually moved stats computation in impala instead of hive... 2021-02-19 02:54:39 +02:00
Antonis Lempesis 33c85d4e66 moved stats computation in impala instead of hive 2021-02-18 17:23:34 +02:00
Antonis Lempesis b8e96c8ae7 moved cache update to the end 2021-02-18 16:42:22 +02:00
Antonis Lempesis bcbfc052b1 fixed last errors in step 21 2021-02-18 16:32:54 +02:00
Antonis Lempesis 10a29a4b9a fixes in monitor step 2021-02-18 15:05:59 +02:00
Antonis Lempesis 8ef66452d5 fixed typo 2021-02-17 22:24:44 +02:00
Antonis Lempesis a8836e2f5f fixed typo 2021-02-17 19:27:07 +02:00
Antonis Lempesis a445c1ac3d fixed variable names in monitor script 2021-02-17 16:45:09 +02:00
Antonis Lempesis 00d516360f added missing ; 2021-02-17 16:41:10 +02:00
Antonis Lempesis cd1b794409 added the monitor db wf 2021-02-17 02:11:55 +02:00
Antonis Lempesis 1c029b9fc0 fixed formatting 2021-02-14 03:14:24 +02:00
Antonis Lempesis 2c4dcc90ba analyzing tables to produce stats 2021-02-14 02:54:55 +02:00
Antonis Lempesis be5969a8c2 Changed typo in script names 2020-12-22 13:33:32 +02:00
Antonis Lempesis 2a074c3b2b Changed typo in script names 2020-12-18 18:40:48 +02:00
Antonis Lempesis 7cb113e088 added the new parameter (stats_tool_api_url) in the workflow parameters 2020-12-04 13:04:25 +02:00
Antonis Lempesis d23ccae0d5 ignoring deletedbyinference relations 2020-12-04 12:42:17 +02:00
Antonis Lempesis 413afcfed5 finished first implementation of wf 2020-12-02 15:57:17 +02:00
Antonis Lempesis 815d6b25d9 added last step to update cache 2020-11-30 00:48:10 +02:00
Antonis Lempesis 01a6e03989 starting from first step... 2020-11-17 23:26:47 +02:00
Antonis Lempesis 99ebaee347 fixed #5913 2020-11-11 16:56:46 +02:00
Antonis Lempesis f14e65f6a3 reverted wrong change 2020-11-10 17:23:04 +02:00
Antonis Lempesis c02c7741c9 fixes in db creation 2020-11-10 17:11:30 +02:00
Antonis Lempesis e603fa5847 fixes in db creation 2020-11-10 17:11:12 +02:00
Claudio Atzori ee832f358e Merge pull request 'stats_wf_extensions_and_corrections' (#28) from spyros/dnet-hadoop:stats_wf_extensions_and_corrections into master. Thank you Guys! The update workflow will be made available to the beta & production orchestration systems under the HDFS path `/lib/dnet/oa/graph/stats/oozie_app` 2020-07-27 16:02:03 +02:00
Antonis Lempesis 4ac8ebe427 correctly calculating the project duration 2020-07-24 19:50:40 +03:00
Antonis Lempesis 18d9464b52 creating shadow db only if it not exists... 2020-07-24 19:50:40 +03:00
Antonis Lempesis e217d496ab added the dest db... 2020-07-24 19:50:40 +03:00
Antonis Lempesis b16bb68b9f added the target db name... 2020-07-24 19:50:40 +03:00
Antonis Lempesis 1ee7eeedf3 added the source db name... 2020-07-24 19:50:40 +03:00
Antonis Lempesis cecbbfa0fc added missing tables and views: contexts, creation_date, funder 2020-07-24 19:50:40 +03:00
Antonis Lempesis 25b7a615f5 moved datasource_sources table creating in the datasource section 2020-07-24 19:50:40 +03:00
Antonis Lempesis a8da4ab9c0 years in projects are now integers 2020-07-24 19:50:40 +03:00
Antonis Lempesis c9cfc165d9 not using impala since the resulting tables are not visible 2020-07-24 19:50:40 +03:00
Antonis Lempesis dd3d6a6e15 compute stats for the used and new impala tables 2020-07-24 19:50:40 +03:00
Antonis Lempesis e6f50de6ef Separated impala from hive steps 2020-07-24 19:50:40 +03:00
Antonis Lempesis de49173420 fixed a typo in queries 2020-07-24 19:50:40 +03:00
antleb 391cf80fb8 Added peer-reviewed, green, gold tables and fields in result. Added shortcuts from result-country 2020-07-24 19:50:40 +03:00
antleb 68389d0125 Corrected the script used by the last step of the wf 2020-07-24 19:50:40 +03:00
antleb ec52141f1a changed refereed type from value to clssname 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 63cd797aba Comment out step 15 to make it work with the new schema of Claudio 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 138c6ddffa Insert statement to datasource table that takes into account the piwik_id of the openAIRE graph 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 3630794cef Fix to consider the relationships that have been 'virtually deleted' for project_results - defect #5607 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 5546f29e63 Corrections on the shadow schema and the impala table stats calculation 2020-07-24 19:50:40 +03:00
Spyros Zoupanos adf8a025d2 Adding more relations (Sources, Licences, Additional) and shadow schema as provided and discussed with Antonis Lempesis 2020-07-24 19:50:40 +03:00
Spyros Zoupanos 657a40536b Corrections by Spyros: Script cleanup, corrections and re-arrangement 2020-07-24 19:50:40 +03:00
Giorgos Alexiou 477fa6234d Script re-organisation and adding table invalidations needed for impala 2020-07-24 19:50:40 +03:00
Claudio Atzori 9cd27183b6 [maven-release-plugin] prepare for next development iteration 2020-06-22 11:27:44 +02:00
Claudio Atzori 1e3dab0631 [maven-release-plugin] prepare release dhp-1.2.3 2020-06-22 11:27:39 +02:00
Claudio Atzori c4d9f1837f [maven-release-plugin] prepare for next development iteration 2020-06-12 12:21:08 +02:00
Claudio Atzori f0746a7605 [maven-release-plugin] prepare release dhp-1.2.2 2020-06-12 12:21:03 +02:00
Spyros Zoupanos 3576dd186b Adding hive timeout as workflow parameter 2020-06-05 22:29:54 +03:00
Claudio Atzori 7582532e73 [maven-release-plugin] prepare for next development iteration 2020-05-25 19:48:18 +02:00
Claudio Atzori 01c2e93395 [maven-release-plugin] prepare release dhp-1.2.1 2020-05-25 19:48:14 +02:00
Claudio Atzori 60c40618d3 [maven-release-plugin] prepare for next development iteration 2020-05-11 10:17:14 +02:00
Claudio Atzori c267d958d5 [maven-release-plugin] prepare release dhp-1.2.0 2020-05-11 10:17:10 +02:00
Claudio Atzori 42f1a2bf94 bumped project version to 1.2.0-SNAPSHOT 2020-05-11 10:05:57 +02:00
Spyros Zoupanos ae0f535c73 Fixing hardcoded reference to main openAIRE graph db 2020-05-09 22:34:48 +03:00
Claudio Atzori 0ccc864ad9 [maven-release-plugin] prepare for next development iteration 2020-05-08 17:01:31 +02:00
Claudio Atzori 6e47c724c6 [maven-release-plugin] prepare release dhp-1.1.7 2020-05-08 17:01:27 +02:00
Claudio Atzori 077ccd8743 stats wf properties cleanup 2020-05-04 11:41:46 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Claudio Atzori 8fd81e863d added default value for the external_stats_db_name 2020-04-29 15:36:24 +02:00
Claudio Atzori c6f3ff4462 stats workflow content relocated into common package; added <global> property definitions in stats workflow.xml 2020-04-29 14:29:27 +02:00
Michele Artini c43b4c8962 formatting 2020-04-29 12:56:58 +02:00
Spyros Zoupanos 1ab97bbe00 Adding the full stats workflow to the dnet-hadoop hierarchy 2020-04-01 22:22:05 +03:00