Claudio Atzori
465e941214
Merge pull request '[stats wf] Changes to indicators tables' ( #244 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #244
2022-09-16 10:13:58 +02:00
dimitrispie
3bf3127251
Changes to monitor and indicator scripts
2022-09-14 16:36:19 +03:00
dimitrispie
71b069ca90
Changes to indicator and monitor scripts
2022-09-09 13:15:58 +03:00
dimitrispie
2b5f8c9c9a
comment out duplicate table creation
2022-09-06 12:27:53 +03:00
Claudio Atzori
84598c7535
Merge pull request 'restored some collab indicators' ( #240 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #240
2022-08-05 15:50:39 +02:00
Antonis Lempesis
fcef5294e2
restored some collab indicators
2022-08-05 13:45:01 +03:00
Claudio Atzori
c1f2ffc53d
Merge pull request 'commenting out the collab indicators because they still fail' ( #237 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #237
2022-08-05 11:57:36 +02:00
Antonis Lempesis
227e10f4b3
commenting out the collab indicators because they still fail
2022-08-05 12:54:36 +03:00
Claudio Atzori
efd96e7e66
Merge pull request 'fixed the datasourceOrganization relations' ( #233 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #233
2022-08-03 12:25:05 +02:00
Antonis Lempesis
8b0407d8ec
fixed the datasourceOrganization relations
2022-08-03 12:26:59 +03:00
Claudio Atzori
27681cf6bf
Merge pull request '[stats wf] latest version of indicators + added FOS classification' ( #232 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #232
2022-08-02 12:57:15 +02:00
Antonis Lempesis
1778d40c40
latest version of indicators
2022-08-02 13:39:34 +03:00
Antonis Lempesis
6fc9ef53f6
added command-line params to allow hive actions to run
2022-07-29 16:36:20 +03:00
Antonis Lempesis
9886fe87ec
- Added FOS classification
- Added extra orgs in monitor
- Fixed result-project and organization-project tables
2022-07-29 16:34:50 +03:00
Miriam Baglioni
b229c6e7af
Merge pull request 'beta' ( #218 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #218
2022-06-10 11:03:48 +02:00
Antonis Lempesis
ab18c9daa9
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
2022-06-09 15:48:21 +03:00
Antonis Lempesis
574492c659
removed double result_apc table creation from monitor
2022-06-09 15:48:13 +03:00
Antonis Lempesis
db088cc69c
fixed *_organization tables
2022-06-07 04:04:28 +03:00
Claudio Atzori
5c2949a864
Merge pull request '[stats wf] added open citations & more orgs in monitor, removed collab indicator' ( #213 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #213
2022-05-20 11:38:43 +02:00
Antonis Lempesis
3fc9efeab6
fixed typo, added open citations and apcs in monitor
2022-05-13 14:28:13 +03:00
Antonis Lempesis
23334479bb
removed yet another collab, added more orgs in monitor
2022-05-11 13:05:52 +03:00
Antonis Lempesis
61b4c19e65
restored indi_result_org_country_collab, removed indi_result_org_collab
2022-05-06 12:52:10 +03:00
Antonis Lempesis
cfbbcaf7c4
commented out indi_result_org_country_collab
2022-05-06 12:49:36 +03:00
Antonis Lempesis
0353f93d54
added new hive opts
2022-04-29 12:49:27 +03:00
Antonis Lempesis
b7cd2c6ca1
added open citations
2022-04-20 14:46:55 +03:00
Claudio Atzori
4eff7856f5
Merge pull request '[stats-wf] computing stats in each step' ( #210 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #210
2022-04-08 14:21:01 +02:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Antonis Lempesis
c442c91f89
computing stats in each step
2022-04-06 12:40:02 +03:00
Antonis Lempesis
7112806a73
views cannot be stored as parquet...
2022-03-29 16:37:29 +03:00
Antonis Lempesis
fff0b3cc19
added apcs in monitor db
2022-03-29 14:15:31 +03:00
Antonis Lempesis
ee24f3eb2c
views cannot be stored as parquet...
2022-03-29 13:47:48 +03:00
Antonis Lempesis
d8503cd191
added moooar organizations
2022-03-24 14:02:36 +02:00
Antonis Lempesis
62f91b0869
cleanup
2022-03-22 16:17:49 +02:00
Antonis Lempesis
2e8394ecf8
creating aaall tables as parquet
2022-03-22 16:16:08 +02:00
Antonis Lempesis
dcfbeb8142
yet more typos
2022-03-21 12:36:03 +02:00
Antonis Lempesis
ad78e505da
yet another fix
2022-03-03 12:28:12 +02:00
Antonis Lempesis
efeeebfee1
fixed query after the change in the indicator table
2022-03-02 13:29:25 +02:00
Antonis Lempesis
3b92a2ab9c
added the rest of sprint 6 in monitor db
2022-02-23 12:05:57 +02:00
Antonis Lempesis
87c91f70a2
added sprint 6 indicators to monitor db
2022-02-22 14:41:48 +02:00
dimitrispie
58c59f46eb
Added Sprint 6
2022-02-17 10:21:09 +02:00
Antonis Lempesis
5772f92dba
merged beta changes in hive branch
2022-02-15 13:24:51 +02:00
Antonis Lempesis
393a4ee956
fixed yet another typo...
2022-02-15 12:56:50 +02:00
Antonis Lempesis
5f762cbd09
fixed yet another typo
2022-02-07 12:09:12 +02:00
Antonis Lempesis
ae633c566b
fixed the result_result table
2022-02-04 15:04:19 +02:00
Antonis Lempesis
c2b44530a3
typo...
2022-02-03 13:44:07 +02:00
Antonis Lempesis
dbd2646d59
fixed the result_result creation for monitor
2022-02-03 12:37:10 +02:00
Antonis Lempesis
81ee654271
added result_result relations
2021-12-23 15:46:17 +02:00
Antonis Lempesis
7551e52e95
fixed a typo
2021-12-23 15:33:53 +02:00
Antonis Lempesis
16539d7360
added usage stats
2021-12-22 02:54:42 +02:00
Antonis Lempesis
3edd661608
fixed column names
2021-12-21 22:55:04 +02:00
Antonis Lempesis
a4c0cbb98c
fixed typos in indicators. Added extra views in monitor
2021-12-21 15:54:38 +02:00
Antonis Lempesis
58996972d9
added first indicator of sprint 5
2021-12-21 03:35:04 +02:00
dimitrispie
c1cdec09a9
Sprint 5 and other changes
2021-12-20 19:23:57 +02:00
Antonis Lempesis
ddd34087c2
removed 'stored as parquet' from views..
2021-12-13 23:05:00 +02:00
Antonis Lempesis
915f758c82
moving data to impala cluster and creating shadow databases there
2021-12-13 16:26:14 +02:00
Antonis Lempesis
d05210ba99
finished migration to hive only
2021-11-30 19:01:48 +02:00
dimitrispie
09fc2afdca
Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
2021-11-26 16:13:10 +02:00
Antonis Lempesis
0b4163ee0b
added sprint3,4, removed 2, chaos
2021-11-26 15:58:01 +02:00
Antonis Lempesis
12749a0a77
first
2021-11-26 15:40:40 +02:00
dimitrispie
29f69f2f89
Sprint 4
2021-11-26 15:22:04 +02:00
Antonis Lempesis
cb3adb90f4
Merge branch 'beta' into beta
2021-11-17 14:33:45 +01:00
Antonis Lempesis
c283406829
added Universidad Politécnica de Madrid
2021-11-17 15:33:00 +02:00
Antonis Lempesis
26f086dd64
removed the too restrictive clause. will discuss again
2021-11-11 12:57:19 +02:00
Antonis Lempesis
91354c6068
- fetching all context related results
- storing tables as parquet
2021-11-08 15:15:46 +02:00
Claudio Atzori
7fa49f6956
Merge pull request 'removed hardcoded reference' ( #154 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #154
2021-11-02 09:11:30 +01:00
Antonis Lempesis
f78afb5ef9
removed hardcoded reference
2021-11-01 15:42:29 +02:00
Claudio Atzori
4f8970f8ed
[stats] reducing the step22 wait time
2021-10-20 14:14:53 +02:00
Antonis Lempesis
241dcf6df1
Merge branch 'beta' into beta
2021-10-19 23:54:21 +02:00
Antonis Lempesis
41ecb1eb61
invalidating metadata before context thingies
2021-10-15 13:42:55 +03:00
Antonis Lempesis
4b7c8dff2d
fetching affiliated results for 4 orgs in monitor. fixed affiliated orgs in stats db
2021-10-14 18:53:35 +03:00
Claudio Atzori
b292e4a700
[stats wf] added extra logging in the context data retrieval phase
2021-10-13 17:31:53 +02:00
dimitrispie
3f25d2efb2
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
2021-10-01 16:03:48 +03:00
dimitrispie
13687fd887
Sprint 3 indicators update
2021-10-01 16:02:02 +03:00
Antonis Lempesis
a1e1cf32d7
fixed an impala error
2021-09-24 12:57:24 +03:00
Antonis Lempesis
f358cabb2b
fixed typo
2021-09-22 21:50:37 +03:00
Antonis Lempesis
421d55265d
created hive action for observatory queries
2021-09-21 03:07:58 +03:00
Antonis Lempesis
8b681dcf1b
attempt to make the observatory wf run in hive
2021-09-18 00:35:14 +03:00
Antonis Lempesis
2943287d10
fixed the definition of cc_licence, part II
2021-09-16 15:59:06 +03:00
Antonis Lempesis
dd2329849f
fixed the definition of cc_licence
2021-09-16 13:50:34 +03:00
Antonis Lempesis
de9bf3a161
added cc_licences and abstracts in observatory db
2021-09-14 01:29:08 +03:00
Antonis Lempesis
9b1936701c
fixed yet another typo
2021-09-13 21:07:44 +03:00
Antonis Lempesis
8fc89ae822
moved context table creation before indicators
2021-09-13 14:33:23 +03:00
Antonis Lempesis
461bf90ca6
fixed the gold_oa definition
2021-09-13 11:10:30 +03:00
Antonis Lempesis
43852bac0e
creating other::other concept for all contexts
2021-09-13 01:36:41 +03:00
Antonis Lempesis
f13cca7e83
moved dependencies of indicators before them...
2021-09-08 23:07:58 +03:00
Antonis Lempesis
c6ada217a1
fixed typo
2021-09-08 22:34:59 +03:00
Antonis Lempesis
1250ae197f
using new indicators for the definition of peerreviewed, gold, and green
2021-09-08 14:08:43 +03:00
Antonis Lempesis
ccee451dde
added indicators of sprint 2 in monitor db
2021-09-07 23:17:13 +03:00
Antonis Lempesis
117c3d5c67
fixed a typo
2021-08-02 12:15:58 +03:00
Antonis Lempesis
26af0320d0
added the sprint 2 indicators in monitor db
2021-07-30 00:31:33 +03:00
Antonis Lempesis
4afa5215a9
fixed an NPE?
2021-07-28 21:59:12 +03:00
Antonis Lempesis
3d1580fa9b
fixed a typo
2021-07-28 18:50:31 +03:00
Antonis Lempesis
9b181ffa73
added the h2020 classification scheme for projects
2021-07-28 16:31:29 +03:00
Antonis Lempesis
4a9741825d
added result_orcid, result_project provenance, issn in datasources
2021-07-28 12:28:04 +03:00
Antonis Lempesis
1a28a69cac
changed the citeee in *_citations to cites
2021-07-27 15:14:09 +03:00
Antonis Lempesis
ed185fd7ed
added missing colons
2021-07-27 11:42:47 +03:00
Antonis Lempesis
f3b9570354
properly invalidating metadata
2021-07-26 13:00:16 +03:00
Antonis Lempesis
f9fbb0f261
added indicators second sprint
2021-07-24 16:40:28 +03:00
Antonis Lempesis
89e6f46682
using organization ids instead of names in monitor db creation
2021-07-05 12:00:00 +03:00
Antonis Lempesis
87f14a3899
added the missing indicators files
2021-06-29 16:31:51 +03:00
Antonis Lempesis
018c4eb52c
copied latest changes from old fork: indicators+monitor institutions
2021-06-28 23:46:52 +03:00
Antonis Lempesis
f7c0b80e35
storing result_instance as parquet
2021-06-15 14:45:48 +03:00
Antonis Lempesis
d413b24611
added instances, orgs for monitor, totalcost for projects, apcs
2021-06-10 02:35:46 +03:00
Antonis Lempesis
168edcbde3
added the final steps for the observatory promote wf and some cleanup
2021-05-18 15:23:20 +03:00
Antonis Lempesis
625d993cd9
added step for observatory db
2021-04-20 02:31:06 +03:00
Antonis Lempesis
25d0512fbd
code cleanup
2021-04-20 01:43:23 +03:00
Antonis Lempesis
03d36fadea
properly invalidating impala metadata
2021-04-15 13:34:22 +03:00
Antonis Lempesis
236435b470
following redirects
2021-03-12 14:11:21 +02:00
Antonis Lempesis
3c75a05044
fixed a ton of typos
2021-03-12 13:47:04 +02:00
Antonis Lempesis
fa1ec5b5e9
fixed typo...
2021-03-10 14:05:58 +02:00
Antonis Lempesis
f40c150a0d
fixed steps...
2021-03-06 00:35:57 +02:00
Antonis Lempesis
6147ee4950
assigning correctly hive contexts to concepts
2021-03-05 14:12:18 +02:00
Antonis Lempesis
c5fbad8093
Contexts are now downloaded instead of using the stats_ext db
2021-03-04 00:42:21 +02:00
Antonis Lempesis
27796343ca
crude sleep. hardcoded value
2021-03-03 01:37:47 +02:00
Antonis Lempesis
d90767c733
correctly invalidating metadata
2021-02-19 03:18:47 +02:00
Antonis Lempesis
3681afbe04
typo
2021-02-19 03:04:27 +02:00
Antonis Lempesis
c5502eba8f
actually moved stats computation in impala instead of hive...
2021-02-19 02:54:39 +02:00
Antonis Lempesis
33c85d4e66
moved stats computation in impala instead of hive
2021-02-18 17:23:34 +02:00
Antonis Lempesis
b8e96c8ae7
moved cache update to the end
2021-02-18 16:42:22 +02:00
Antonis Lempesis
bcbfc052b1
fixed last errors in step 21
2021-02-18 16:32:54 +02:00
Antonis Lempesis
10a29a4b9a
fixes in monitor step
2021-02-18 15:05:59 +02:00
Antonis Lempesis
8ef66452d5
fixed typo
2021-02-17 22:24:44 +02:00
Antonis Lempesis
a8836e2f5f
fixed typo
2021-02-17 19:27:07 +02:00
Antonis Lempesis
a445c1ac3d
fixed variable names in monitor script
2021-02-17 16:45:09 +02:00
Antonis Lempesis
00d516360f
added missing ;
2021-02-17 16:41:10 +02:00
Antonis Lempesis
cd1b794409
added the monitor db wf
2021-02-17 02:11:55 +02:00
Antonis Lempesis
1c029b9fc0
fixed formatting
2021-02-14 03:14:24 +02:00
Antonis Lempesis
2c4dcc90ba
analyzing tables to produce stats
2021-02-14 02:54:55 +02:00
Antonis Lempesis
be5969a8c2
Changed typo in script names
2020-12-22 13:33:32 +02:00
Antonis Lempesis
2a074c3b2b
Changed typo in script names
2020-12-18 18:40:48 +02:00
Antonis Lempesis
7cb113e088
added the new parameter (stats_tool_api_url) in the workflow parameters
2020-12-04 13:04:25 +02:00
Antonis Lempesis
d23ccae0d5
ignoring deletedbyinference relations
2020-12-04 12:42:17 +02:00
Antonis Lempesis
413afcfed5
finished first implementation of wf
2020-12-02 15:57:17 +02:00
Antonis Lempesis
815d6b25d9
added last step to update cache
2020-11-30 00:48:10 +02:00
Antonis Lempesis
01a6e03989
starting from first step...
2020-11-17 23:26:47 +02:00
Antonis Lempesis
99ebaee347
fixed #5913
2020-11-11 16:56:46 +02:00
Antonis Lempesis
f14e65f6a3
reverted wrong change
2020-11-10 17:23:04 +02:00
Antonis Lempesis
c02c7741c9
fixes in db creation
2020-11-10 17:11:30 +02:00
Antonis Lempesis
e603fa5847
fixes in db creation
2020-11-10 17:11:12 +02:00
Claudio Atzori
ee832f358e
Merge pull request 'stats_wf_extensions_and_corrections' ( #28 ) from spyros/dnet-hadoop:stats_wf_extensions_and_corrections into master
Thank you Guys! The update workflow will be made available to the beta & production orchestration systems under the HDFS path
```/lib/dnet/oa/graph/stats/oozie_app```
2020-07-27 16:02:03 +02:00
Antonis Lempesis
4ac8ebe427
correctly calculating the project duration
2020-07-24 19:50:40 +03:00
Antonis Lempesis
18d9464b52
creating shadow db only if it does not exist...
2020-07-24 19:50:40 +03:00
Antonis Lempesis
e217d496ab
added the dest db...
2020-07-24 19:50:40 +03:00
Antonis Lempesis
b16bb68b9f
added the target db name...
2020-07-24 19:50:40 +03:00
Antonis Lempesis
1ee7eeedf3
added the source db name...
2020-07-24 19:50:40 +03:00
Antonis Lempesis
cecbbfa0fc
added missing tables and views: contexts, creation_date, funder
2020-07-24 19:50:40 +03:00
Antonis Lempesis
25b7a615f5
moved datasource_sources table creation to the datasource section
2020-07-24 19:50:40 +03:00
Antonis Lempesis
a8da4ab9c0
years in projects are now integers
2020-07-24 19:50:40 +03:00
Antonis Lempesis
c9cfc165d9
not using impala since the resulting tables are not visible
2020-07-24 19:50:40 +03:00
Antonis Lempesis
dd3d6a6e15
compute stats for the used and new impala tables
2020-07-24 19:50:40 +03:00
Antonis Lempesis
e6f50de6ef
Separated impala from hive steps
2020-07-24 19:50:40 +03:00
Antonis Lempesis
de49173420
fixed a typo in queries
2020-07-24 19:50:40 +03:00
antleb
391cf80fb8
Added peer-reviewed, green, gold tables and fields in result. Added shortcuts from result-country
2020-07-24 19:50:40 +03:00
antleb
68389d0125
Corrected the script used by the last step of the wf
2020-07-24 19:50:40 +03:00
antleb
ec52141f1a
changed refereed type from value to classname
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
63cd797aba
Comment out step 15 to make it work with the new schema of Claudio
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
138c6ddffa
Insert statement to datasource table that takes into account the piwik_id of the openAIRE graph
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
3630794cef
Fix to consider the relationships that have been 'virtually deleted' for project_results - defect #5607
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
5546f29e63
Corrections on the shadow schema and the impala table stats calculation
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
adf8a025d2
Adding more relations (Sources, Licences, Additional) and shadow schema as provided and discussed with Antonis Lempesis
2020-07-24 19:50:40 +03:00
Spyros Zoupanos
657a40536b
Corrections by Spyros: Script cleanup, corrections and re-arrangement
2020-07-24 19:50:40 +03:00
Giorgos Alexiou
477fa6234d
Script re-organisation and adding table invalidations needed for impala
2020-07-24 19:50:40 +03:00
Claudio Atzori
9cd27183b6
[maven-release-plugin] prepare for next development iteration
2020-06-22 11:27:44 +02:00
Claudio Atzori
1e3dab0631
[maven-release-plugin] prepare release dhp-1.2.3
2020-06-22 11:27:39 +02:00
Claudio Atzori
c4d9f1837f
[maven-release-plugin] prepare for next development iteration
2020-06-12 12:21:08 +02:00
Claudio Atzori
f0746a7605
[maven-release-plugin] prepare release dhp-1.2.2
2020-06-12 12:21:03 +02:00
Spyros Zoupanos
3576dd186b
Adding hive timeout as workflow parameter
2020-06-05 22:29:54 +03:00
Claudio Atzori
7582532e73
[maven-release-plugin] prepare for next development iteration
2020-05-25 19:48:18 +02:00
Claudio Atzori
01c2e93395
[maven-release-plugin] prepare release dhp-1.2.1
2020-05-25 19:48:14 +02:00
Claudio Atzori
60c40618d3
[maven-release-plugin] prepare for next development iteration
2020-05-11 10:17:14 +02:00
Claudio Atzori
c267d958d5
[maven-release-plugin] prepare release dhp-1.2.0
2020-05-11 10:17:10 +02:00
Claudio Atzori
42f1a2bf94
bumped project version to 1.2.0-SNAPSHOT
2020-05-11 10:05:57 +02:00
Spyros Zoupanos
ae0f535c73
Fixing hardcoded reference to main openAIRE graph db
2020-05-09 22:34:48 +03:00
Claudio Atzori
0ccc864ad9
[maven-release-plugin] prepare for next development iteration
2020-05-08 17:01:31 +02:00
Claudio Atzori
6e47c724c6
[maven-release-plugin] prepare release dhp-1.1.7
2020-05-08 17:01:27 +02:00
Claudio Atzori
077ccd8743
stats wf properties cleanup
2020-05-04 11:41:46 +02:00
Claudio Atzori
77ac995770
cleaned up poms, added descriptions
2020-04-29 18:44:17 +02:00
Claudio Atzori
8fd81e863d
added default value for the external_stats_db_name
2020-04-29 15:36:24 +02:00
Claudio Atzori
c6f3ff4462
stats workflow content relocated into common package; added <global> property definitions in stats workflow.xml
2020-04-29 14:29:27 +02:00
Michele Artini
c43b4c8962
formatting
2020-04-29 12:56:58 +02:00
Spyros Zoupanos
1ab97bbe00
Adding the full stats workflow to the dnet-hadoop hierarchy
2020-04-01 22:22:05 +03:00