Miriam Baglioni
|
99ac5bab46
|
added check to avoid NPE when checking the organization country
|
2023-05-04 19:38:39 +02:00 |
Claudio Atzori
|
0704e186f6
|
Merge pull request 'Stats wf executed on hive only' (#283) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#283
|
2023-05-02 14:05:12 +02:00 |
Claudio Atzori
|
d8882c4481
|
extended mapping applied to datacite records to produce affiliations using the ROR ids. In case of APCs it includes the amount and the currency in the relation
|
2023-05-02 11:56:51 +02:00 |
dimitrispie
|
c3d58e58e1
|
Bug fixes
|
2023-05-02 11:54:07 +03:00 |
Claudio Atzori
|
abd7ca0c18
|
Merge branch 'beta' into bulkTagRefactor
|
2023-05-02 10:50:01 +02:00 |
Claudio Atzori
|
45f625d14f
|
Merge branch 'beta' into organizationToRepresentative
|
2023-05-02 10:46:55 +02:00 |
Claudio Atzori
|
de11edca98
|
Merge branch 'beta' into organizationToRepresentative
|
2023-05-02 09:59:41 +02:00 |
Claudio Atzori
|
851f664bd9
|
Merge branch 'beta' into graph_cleaning_refactoring
|
2023-05-02 09:55:40 +02:00 |
dimitrispie
|
e57ecdaf98
|
Update step20-createMonitorDB.sql
Add University of Manitoba
|
2023-04-30 17:52:23 +03:00 |
Ilias Kanellos
|
90332439ad
|
Remove deletion of synonym folder
|
2023-04-28 13:45:19 +03:00 |
Ilias Kanellos
|
a98da54896
|
Merge branch '8172_impact_indicators_workflow' of https://code-repo.d4science.org/D-Net/dnet-hadoop into 8172_impact_indicators_workflow
|
2023-04-28 13:23:49 +03:00 |
Ilias Kanellos
|
09485fbee3
|
Fixed unicode bug. Workflow ends after first script
|
2023-04-28 13:09:13 +03:00 |
Serafeim Chatzopoulos
|
614cc1089b
|
Add separate folder for results && project actionsets
|
2023-04-27 12:37:15 +03:00 |
Serafeim Chatzopoulos
|
815a4ddbba
|
Add actionset creation for project bip indicators in workflow
|
2023-04-26 20:40:06 +03:00 |
Serafeim Chatzopoulos
|
ee04cf92bf
|
Add actionsets for project impact indicators
|
2023-04-26 20:23:46 +03:00 |
dimitrispie
|
fdb5d2b39f
|
Bug fixes
|
2023-04-23 18:29:00 +03:00 |
dimitrispie
|
53ce023035
|
Bug fixes
|
2023-04-23 18:23:45 +03:00 |
dimitrispie
|
4fa750b719
|
Bug fixes on monitor-update
|
2023-04-19 17:39:53 +03:00 |
dimitrispie
|
5247cb7115
|
Bug fix
|
2023-04-19 11:11:19 +03:00 |
Miriam Baglioni
|
efc4f6a658
|
[bulkTag] refactor to enrich each result single step
|
2023-04-18 17:39:31 +02:00 |
Serafeim Chatzopoulos
|
23f58a86f1
|
Change jar param in project impact indicators action
|
2023-04-18 12:26:01 +03:00 |
Miriam Baglioni
|
697a134504
|
-
|
2023-04-18 10:21:12 +02:00 |
Miriam Baglioni
|
6cc95c96a2
|
-
|
2023-04-18 09:53:11 +02:00 |
dimitrispie
|
25dafccc24
|
Merge branch 'hive' into beta
|
2023-04-12 11:36:59 +03:00 |
Claudio Atzori
|
a2dcb06daf
|
added eoscifguidelines in the result view; removed compute statistics statements
|
2023-04-11 10:43:32 +02:00 |
Serafeim Chatzopoulos
|
7256c8d3c7
|
Add script for aggregating impact indicators at the project level
|
2023-04-07 16:30:12 +03:00 |
dimitrispie
|
c85de8fa1f
|
-Added Technological University Dublin
-Added project_organization_contribution table
-Add Delft University of Technology
|
2023-04-07 09:22:59 +03:00 |
dimitrispie
|
9b41dff33c
|
Update step20-createMonitorDB.sql
Added Delft University of Technology
|
2023-04-07 09:21:38 +03:00 |
Miriam Baglioni
|
932d07d2dd
|
[bulkTag] added filtering for datasources in eosctag
|
2023-04-06 15:08:27 +02:00 |
Miriam Baglioni
|
287753417d
|
better implementation for the fix
|
2023-04-06 12:22:38 +02:00 |
Miriam Baglioni
|
b42abc9904
|
fixed issue on bulktagging for the advanced constraints
|
2023-04-06 12:15:00 +02:00 |
dimitrispie
|
91e18ac7f4
|
Added project_organization_contribution table
|
2023-04-06 10:53:11 +03:00 |
Miriam Baglioni
|
b25b401065
|
added test to verify the advconstraints to dth community. inserted some additional logs.
|
2023-04-05 12:18:39 +02:00 |
Claudio Atzori
|
864f4051d3
|
[graph cleaning] added missing case
|
2023-04-05 11:35:47 +02:00 |
Claudio Atzori
|
dead87917f
|
[graph cleaning] cleanup
|
2023-04-04 13:13:43 +02:00 |
Claudio Atzori
|
2a6ba29b64
|
[graph cleaning] unit tests & cleanup
|
2023-04-04 12:34:51 +02:00 |
dimitrispie
|
9e1335df4c
|
-Added Technological University Dublin
-Added project_organization_contribution table
|
2023-04-04 13:22:40 +03:00 |
Claudio Atzori
|
63b8bbc015
|
[graph to Solr] using dedicated sparkExecutorCores, sparkExecutorMemory, sparkDriverMemory in convert_to_xml
|
2023-03-24 13:43:20 +01:00 |
Claudio Atzori
|
b502f86523
|
fixed input path supplied to GetDatasourceFromCountry; adjusted the various spark.sql.shuffle.partitions
|
2023-03-24 13:09:12 +01:00 |
Claudio Atzori
|
c07857fa37
|
[graph cleaning] unit tests & cleanup
|
2023-03-23 15:57:47 +01:00 |
Claudio Atzori
|
90e61a8aba
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 15:03:26 +01:00 |
Claudio Atzori
|
308e10d102
|
serialising: 1. measures for all the entity types and 2. result level fulltext
|
2023-03-23 11:23:22 +01:00 |
Claudio Atzori
|
488d9a5eaa
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 10:41:13 +01:00 |
dimitrispie
|
fad7fa4af8
|
Added Technological University Dublin
|
2023-03-22 09:44:00 +02:00 |
Serafeim Chatzopoulos
|
102aa5ab81
|
Add dependency to dhp-aggregation
|
2023-03-21 19:25:29 +02:00 |
Serafeim Chatzopoulos
|
3e8a4cf952
|
Rearrange resources folder structure
|
2023-03-21 18:25:55 +02:00 |
Serafeim Chatzopoulos
|
f992ecb657
|
Checkout BIP-Ranker during 'prepare-package' && add it in the oozie-package.tar.gz
|
2023-03-21 18:03:55 +02:00 |
Ilias Kanellos
|
9dc8f0f05f
|
Add ActionSet step
|
2023-03-21 16:14:15 +02:00 |
Claudio Atzori
|
4f5ba0ed52
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-21 14:41:20 +01:00 |
Ilias Kanellos
|
b5c252865c
|
Add filtering based on citation source
|
2023-03-20 15:38:36 +02:00 |
Claudio Atzori
|
6d3d18d8b5
|
[graph cleaning] WIP: refactoring of the cleaning stages
|
2023-03-16 17:23:36 +01:00 |
dimitrispie
|
43b23a9bf3
|
Update step20-createMonitorDB.sql
Added Technological University Dublin
|
2023-03-15 09:57:12 +02:00 |
Serafeim Chatzopoulos
|
720fd19b39
|
Add dhp-impact-indicators workflow files
|
2023-03-14 19:28:27 +02:00 |
Serafeim Chatzopoulos
|
c6e39b7f33
|
Add dhp-impact-indicators
|
2023-03-14 18:50:54 +02:00 |
Claudio Atzori
|
518618f1a9
|
[graph cleaning] avoid to overwrite the subject class to 'keyword' for those with provenance 'subject:fos'
|
2023-03-14 15:22:47 +01:00 |
Claudio Atzori
|
41e00bcd07
|
[graph provision] avoid parsing the XML records again; apparently the escaped XML characters get unescaped, invalidating the record
|
2023-03-13 15:19:49 +01:00 |
Claudio Atzori
|
24e2fd828b
|
code formatting
|
2023-03-08 21:17:08 +01:00 |
Claudio Atzori
|
e28d395e87
|
[aggregator graph] using dedicated path to sync claims, adjusted paths with wildcards
|
2023-03-08 21:16:52 +01:00 |
Claudio Atzori
|
5b8fd37314
|
[aggregator graph] using dedicated path to sync claims
|
2023-03-08 15:28:14 +01:00 |
Claudio Atzori
|
7fd89566c2
|
[aggregator graph] handle paths including wildcards
|
2023-03-08 12:43:00 +01:00 |
Miriam Baglioni
|
588aca5ce4
|
Merge pull request 'h2020classification' (#280) from h2020classification into beta
Reviewed-on: D-Net/dnet-hadoop#280
|
2023-03-03 09:29:10 +01:00 |
Claudio Atzori
|
8ec0d62d91
|
pre-group the records in each table before joining the contents from BETA and PROD together
|
2023-03-02 14:49:19 +01:00 |
Miriam Baglioni
|
0fff98a14c
|
[ECclassification] removed print
|
2023-03-02 11:46:57 +01:00 |
Miriam Baglioni
|
b0c2f7e526
|
[ECclassification] removed not needed resources
|
2023-03-02 11:44:48 +01:00 |
Miriam Baglioni
|
d4fc62c2f6
|
merging with branch beta
|
2023-03-02 11:14:54 +01:00 |
Miriam Baglioni
|
de8ad1caef
|
[ECclassification] new implementation for the H2020 classification
|
2023-03-02 11:14:03 +01:00 |
Claudio Atzori
|
db9dad4aa7
|
[actionmanager] increased spark.sql.shuffle.partitions for publication, dataset, relation records
|
2023-03-02 09:11:37 +01:00 |
Miriam Baglioni
|
c1f9848953
|
[ECclassification] added new classes
|
2023-03-01 15:29:11 +01:00 |
Claudio Atzori
|
6f488547a7
|
ignore non processable records
|
2023-03-01 14:49:51 +01:00 |
Claudio Atzori
|
7d263f265e
|
adjusted logs
|
2023-03-01 11:58:07 +01:00 |
Claudio Atzori
|
16ad42e8f3
|
code formatting
|
2023-03-01 10:22:13 +01:00 |
Claudio Atzori
|
9c59dac859
|
followup changes reorganising the mdstore synchronisation mechanism
|
2023-03-01 10:16:20 +01:00 |
Miriam Baglioni
|
ad745c0aa3
|
[CrossrefFunderMapping] fixed issues on funder name
|
2023-02-28 14:58:27 +01:00 |
Miriam Baglioni
|
4f2df876cd
|
[ECclassification] new implementation first try
|
2023-02-28 14:44:00 +01:00 |
Claudio Atzori
|
2f7346e9cf
|
WIP monodirectional citations, Datacite
|
2023-02-28 13:30:51 +01:00 |
Claudio Atzori
|
0559d8b412
|
WIP monodirectional citations
|
2023-02-28 10:57:32 +01:00 |
Sandro La Bruzzo
|
69fa616490
|
removed wrong content
|
2023-02-28 10:27:38 +01:00 |
Sandro La Bruzzo
|
832a75d012
|
added mapping for crossref funder
|
2023-02-28 10:16:34 +01:00 |
Sandro La Bruzzo
|
78e51c182a
|
Added missing parameter to raw all workflow
|
2023-02-28 10:16:01 +01:00 |
Claudio Atzori
|
7aebedb43c
|
code formatting
|
2023-02-27 11:51:27 +01:00 |
Miriam Baglioni
|
80987801d7
|
[FoS] added check for null on level1 subject
|
2023-02-27 11:40:22 +01:00 |
Claudio Atzori
|
31e97c2a6b
|
[unresolved entities] updated oozie wf node labels
|
2023-02-27 11:38:29 +01:00 |
Miriam Baglioni
|
23112929e9
|
[FoS] changed the default separator from comma to tab to solve the issue in subject value split
|
2023-02-27 10:18:39 +01:00 |
Serafeim Chatzopoulos
|
0b5bf53b45
|
Remove unnecessary indexed fields from Solr
|
2023-02-23 12:42:42 +02:00 |
dimitrispie
|
1547611246
|
Merge branch 'beta' into hive
|
2023-02-22 16:57:12 +02:00 |
Michele Artini
|
fddcf701e9
|
updated the order of the compatibilities
|
2023-02-22 12:07:09 +01:00 |
Claudio Atzori
|
0c1be41b30
|
code formatting
|
2023-02-22 10:15:25 +01:00 |
Claudio Atzori
|
99cd7761aa
|
cleanup of non necessary dhp-monitor-update workflow
|
2023-02-22 10:10:22 +01:00 |
Claudio Atzori
|
cd3a51a15f
|
Merge branch 'beta' into 8232-mdstore-synch-improve
|
2023-02-22 09:57:07 +01:00 |
Claudio Atzori
|
477a7c416f
|
Merge branch 'beta' into UsageCountOnProjectAndDatasource
|
2023-02-22 09:55:51 +01:00 |
Claudio Atzori
|
c20c1c9159
|
Merge pull request 'Added 4 institutions:' (#261) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#261
|
2023-02-22 09:53:45 +01:00 |
Miriam Baglioni
|
d617c3e812
|
[DOIBoost] extended mapping for funder #8407
|
2023-02-20 14:45:27 +01:00 |
dimitrispie
|
90807b60c7
|
Changes to monitor wf
|
2023-02-20 10:42:24 +02:00 |
dimitrispie
|
d2f9ccf934
|
Changes to separate monitor wf
|
2023-02-20 10:41:21 +02:00 |
dimitrispie
|
032a401cbf
|
Bug fixes
|
2023-02-20 09:29:20 +02:00 |
Miriam Baglioni
|
016337a0f9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-02-16 15:54:59 +01:00 |
Sandro La Bruzzo
|
118c1fc3b3
|
Merge remote-tracking branch 'origin/beta' into beta
|
2023-02-15 10:29:28 +01:00 |
Sandro La Bruzzo
|
a8ac79fa25
|
Added citation relation on crossref Mapping
|
2023-02-15 10:29:13 +01:00 |
dimitrispie
|
595192d510
|
Bug fix
|
2023-02-14 16:24:08 +02:00 |
dimitrispie
|
f3aaff3688
|
Remove duplicate orgs
|
2023-02-14 09:48:36 +02:00 |
Claudio Atzori
|
9a03f71db1
|
code formatting
|
2023-02-13 16:25:47 +01:00 |
Michele Artini
|
554df257ab
|
null values in date range conditions
|
2023-02-13 16:15:32 +01:00 |
dimitrispie
|
3400133c2f
|
Bug fix
|
2023-02-13 09:44:00 +02:00 |
dimitrispie
|
935db0ab25
|
Added organizations for Monitor
|
2023-02-13 09:29:09 +02:00 |
dimitrispie
|
7b78b15c81
|
Changes for copying to Impala Cluster
|
2023-02-13 09:27:00 +02:00 |
Miriam Baglioni
|
5cf902a2b0
|
[UsageCount] changed query to make the sum be computed via sql instead of grouping
|
2023-02-10 16:16:37 +01:00 |
Miriam Baglioni
|
f803530df6
|
[UsageCount] fixed query
|
2023-02-10 15:50:56 +01:00 |
Miriam Baglioni
|
bb5bba51b3
|
[UsageCount] extended test
|
2023-02-09 19:08:30 +01:00 |
Miriam Baglioni
|
85e53fad00
|
[UsageCount] addition of usagecount for Projects and datasources. Extension of the action set created for the results with new entities for projects and datasources. Extension of the resource set and modification of the testing class
|
2023-02-09 18:59:45 +01:00 |
dimitrispie
|
d71f5672d3
|
Add monitor post step
|
2023-02-09 13:44:14 +02:00 |
dimitrispie
|
35ba8bb328
|
Bug fixes
|
2023-02-09 12:57:57 +02:00 |
Sandro La Bruzzo
|
8920932dd8
|
Code formatted
|
2023-02-08 11:34:18 +01:00 |
Sandro La Bruzzo
|
0b9819f1ab
|
Code formatted
|
2023-02-08 10:32:33 +01:00 |
Sandro La Bruzzo
|
6c81a161d2
|
Merge remote-tracking branch 'origin/beta' into 8231-mdstore-synch-improve
|
2023-02-08 10:29:09 +01:00 |
dimitrispie
|
3ba11d64a1
|
Changes 07022023
|
2023-02-07 12:53:51 +02:00 |
dimitrispie
|
98c34263ed
|
Update step20-createMonitorDB.sql
Add University of Cape Town organization
|
2023-02-07 08:14:48 +02:00 |
dimitrispie
|
2dc6d47270
|
Changes 06022023
|
2023-02-06 13:18:53 +02:00 |
dimitrispie
|
973d78a4d6
|
Update step15_5.sql
Added Unpaywall's open access colors
|
2023-02-02 08:03:54 +02:00 |
Claudio Atzori
|
d05ca53a14
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-31 14:39:53 +01:00 |
Miriam Baglioni
|
e82e009b46
|
added missing close tag for XML produced by the xquery to get information for the community from the IS
|
2023-01-31 10:19:34 +01:00 |
Miriam Baglioni
|
b254a0375f
|
[Affiliation from institutionalrepo] changed the field to check to verify the datasource type. Now it is in the field jurisdiction
|
2023-01-26 16:51:20 +01:00 |
dimitrispie
|
cf58e4a5e4
|
Added Arts et Métiers ParisTech
|
2023-01-25 16:03:16 +02:00 |
dimitrispie
|
db7d625ba9
|
Added Arts et Métiers ParisTech organization
|
2023-01-25 12:22:21 +02:00 |
Claudio Atzori
|
505867bce9
|
[bulk tagging] better node naming
|
2023-01-20 16:13:16 +01:00 |
Miriam Baglioni
|
ecd398fe51
|
refactoring
|
2023-01-20 14:23:45 +01:00 |
Miriam Baglioni
|
0a5c6010b0
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-13 16:14:46 +01:00 |
dimitrispie
|
4d7553c9f1
|
Bug fixes
|
2023-01-12 17:19:19 +02:00 |
dimitrispie
|
dd70c32ad7
|
Bug fixes
|
2023-01-12 17:18:05 +02:00 |
dimitrispie
|
51f7ab5864
|
Bug fixes
|
2023-01-12 17:15:06 +02:00 |
dimitrispie
|
34d4bf727c
|
Bug fixes
|
2023-01-12 11:28:37 +02:00 |
dimitrispie
|
43f6d4f296
|
-Monitor DB workflow
|
2023-01-12 11:26:47 +02:00 |
dimitrispie
|
686580a220
|
- New Monitor DB workflow
- New Organization added
|
2023-01-12 11:18:03 +02:00 |
Claudio Atzori
|
0a58bc7ba7
|
[broker] prevent NPEs
|
2023-01-11 14:44:14 +01:00 |
Claudio Atzori
|
04cb96001c
|
[broker] d40e20f437 adapted to the beta graph model
|
2023-01-11 10:10:12 +01:00 |
Michele Artini
|
91b845f611
|
Considering instance pids and alternative identifiers
|
2023-01-11 09:58:54 +01:00 |
Miriam Baglioni
|
1f367122e4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-11 09:47:44 +01:00 |
Michele Artini
|
7b7520850b
|
fixed an invalid char
|
2023-01-11 09:22:18 +01:00 |
Miriam Baglioni
|
d6895f0387
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-09 17:28:38 +01:00 |
dimitrispie
|
becb242c17
|
Monitor DB only Workflow
|
2023-01-04 16:50:29 +02:00 |
dimitrispie
|
dcb958e146
|
Changes to execute the stats wf only in hive
|
2023-01-04 11:39:01 +02:00 |
dimitrispie
|
592013d5dd
|
Added more steps in decision node
|
2022-12-23 09:43:16 +02:00 |
dimitrispie
|
2a4bf32d4c
|
Merge branch 'hive' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into hive
# Conflicts:
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step10.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step13.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step14.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step16_1-definitions.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step7.sql
|
2022-12-22 10:22:46 +02:00 |
dimitrispie
|
6449ff4207
|
1. Added a decision node to enable the workflow to make a selection on the execution path to follow
2. Added new organization
3. Added 5 new tables from Eurostat
|
2022-12-22 10:18:21 +02:00 |
Miriam Baglioni
|
8893389895
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-12-21 12:42:27 +01:00 |
Antonis Lempesis
|
c8309fe18e
|
added command line params to allow hive actions to run
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
028873cc51
|
added new hive opts
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
1ddea4f442
|
removed 'stored as parquet' from views..
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
2754c3dd62
|
moving data to impala cluster and creating shadow databases there
|
2022-12-21 12:41:29 +02:00 |
Antonis Lempesis
|
778a1a724f
|
finished migration to hive only
|
2022-12-21 12:41:25 +02:00 |
Antonis Lempesis
|
e84dd5fe26
|
first
|
2022-12-21 12:41:23 +02:00 |
Sandro La Bruzzo
|
3c9826f186
|
updated lines function to its implementation linesWithSeparators.map(l => l.stripLineEnd); in this way we force the scala compiler plugin to consider this pipeline scala code and not a java.string.lines() pipeline
|
2022-12-21 11:21:17 +01:00 |
Claudio Atzori
|
6aa91204a5
|
[orcid propagation] skip empty directories
|
2022-12-20 14:15:46 +01:00 |
Miriam Baglioni
|
6674cccb94
|
[BulkTag] description of parameters more comprehensive for those who do not implement it
|
2022-12-16 15:33:20 +01:00 |
Miriam Baglioni
|
f37113a941
|
[BulkTag] moving xquery to get community configuration in dedicated file
|
2022-12-16 15:32:26 +01:00 |
Miriam Baglioni
|
8685eaa706
|
[Clean Country] added test to verify removal of country
|
2022-12-16 15:31:25 +01:00 |
Miriam Baglioni
|
dc0ec88a58
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-12-16 13:18:32 +01:00 |
Miriam Baglioni
|
d791840b82
|
[Clean Country] added test to verify removal of country
|
2022-12-16 13:18:29 +01:00 |
Claudio Atzori
|
7b80b24f82
|
[cleaning] country cleaning must use both PID and AlternateIdentifier fields
|
2022-12-15 14:49:04 +01:00 |
Claudio Atzori
|
b8bafab8a0
|
[cleaning] improved vocabulary based mapping, specialization for the strict vocab cleaning
|
2022-12-12 14:43:03 +01:00 |
Sandro La Bruzzo
|
5e4866d033
|
implemented synch for single mdstore
|
2022-12-12 11:29:46 +01:00 |
Claudio Atzori
|
c18b8048c3
|
[cleaning] avoid NPE
|
2022-12-10 11:41:38 +01:00 |
Claudio Atzori
|
8b44afe5e5
|
[cleaning] avoid NPE
|
2022-12-09 15:44:57 +01:00 |
Claudio Atzori
|
389dd25430
|
[cleaning] avoid NPE
|
2022-12-08 18:40:48 +01:00 |
Claudio Atzori
|
730228d73d
|
[cleaning] align wf parameter names in test
|
2022-12-08 18:40:22 +01:00 |
Claudio Atzori
|
2094fa6db0
|
[cleaning] align wf parameter names
|
2022-12-08 17:22:26 +01:00 |
Miriam Baglioni
|
a485a94956
|
[Cleaning] fixed parameter name in property file
|
2022-12-08 16:59:34 +01:00 |
Miriam Baglioni
|
3d99b78d94
|
[Cleaning] fixed error in parameter (workingPath to workingDir)
|
2022-12-08 10:25:02 +01:00 |
Claudio Atzori
|
1b8488976b
|
code formatting
|
2022-12-07 10:45:38 +01:00 |
Claudio Atzori
|
cd1b58483e
|
[bulk tag] fixed Community configuration parsing to avoid NPE
|
2022-12-07 10:39:00 +01:00 |
Claudio Atzori
|
062abfd669
|
fixed NPE, removed unused stuff
|
2022-12-06 12:04:00 +01:00 |
dimitrispie
|
2a52a42169
|
Added 4 institutions:
-University of Modena and Reggio Emilia
-Bilkent University
-Saints Cyril and Methodius University of Skopje
-University of Milan
|
2022-12-06 10:10:21 +02:00 |
Claudio Atzori
|
8248da40d9
|
Merge branch 'beta' into graph_cleaning
|
2022-12-02 14:49:00 +01:00 |
Claudio Atzori
|
ddf065756f
|
Merge pull request 'Two organizations are added for monitor' (#258) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#258
|
2022-12-02 14:45:27 +01:00 |
Sandro La Bruzzo
|
5a48a2fb18
|
implemented synch for single mdstore
|
2022-12-01 11:34:43 +01:00 |
Claudio Atzori
|
a38116546d
|
Merge branch 'beta' into deduptesting
|
2022-11-30 11:27:29 +01:00 |
Miriam Baglioni
|
ce020f2c83
|
[EOSC FUTURE] added resources and test for review
|
2022-11-30 09:57:30 +01:00 |
Miriam Baglioni
|
bb0ddc1c44
|
[BulkTag] adding verb starts_with
|
2022-11-30 09:56:24 +01:00 |
Claudio Atzori
|
8e3edba318
|
[graph cleaning] testing the collectedfrom and hostedby patch procedure
|
2022-11-29 16:07:09 +01:00 |
Claudio Atzori
|
58c05731f9
|
[graph cleaning] WIP: testing the collectedfrom and hostedby patch procedure
|
2022-11-29 11:21:51 +01:00 |
Miriam Baglioni
|
9c70c5dbd6
|
[Bulk Tag horizontal] added new path in definition of constraint (to recognize fos subjects) - changed test and resource class to test this new aspect
|
2022-11-28 14:51:20 +01:00 |
Miriam Baglioni
|
0628df7a3a
|
resolving conflicts
|
2022-11-28 10:44:56 +01:00 |
Claudio Atzori
|
11695ba649
|
[graph cleaning] patch also the result's collectedfrom and hostedby datasource name according to the datasource master-duplicate mapping
|
2022-11-28 10:18:43 +01:00 |
Claudio Atzori
|
6082d235d3
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning
|
2022-11-28 09:54:48 +01:00 |
Claudio Atzori
|
24ef301cc1
|
[graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping
|
2022-11-28 09:54:18 +01:00 |
Alessia Bardi
|
90c8f9cb61
|
tests for EOSC Future
|
2022-11-23 12:18:44 +01:00 |
Miriam Baglioni
|
0e3edc5018
|
[Bulk Tag] fixed issue in verb name
|
2022-11-23 11:26:36 +01:00 |
Claudio Atzori
|
a79c47522d
|
updated ORCID datasource identifier
|
2022-11-23 10:17:49 +01:00 |
Alessia Bardi
|
2832117f23
|
added eoscifguidelines in test
|
2022-11-22 18:01:12 +01:00 |
Alessia Bardi
|
3c08269a4d
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-11-22 17:31:00 +01:00 |
Alessia Bardi
|
2687fc9f73
|
tests for EOSC Future review - ROhub
|
2022-11-22 17:30:56 +01:00 |
Claudio Atzori
|
1d5143b0b6
|
Merge branch 'beta' into deduptesting
|
2022-11-22 10:21:30 +01:00 |
Claudio Atzori
|
0aa725083f
|
extended dedup testing
|
2022-11-17 16:13:43 +01:00 |
Claudio Atzori
|
3dbc637d3e
|
code formatting
|
2022-11-17 09:55:41 +01:00 |
Claudio Atzori
|
ddff0e8999
|
merging duplicates using IdentifierComparator
|
2022-11-11 16:10:25 +01:00 |
Claudio Atzori
|
5af5a8ae42
|
added IdentifierComparator
|
2022-11-09 14:20:59 +01:00 |
Claudio Atzori
|
7c3390ac10
|
Merge branch 'beta' into eoscifguidelines-from-mdstores
|
2022-11-07 12:18:40 +01:00 |
dimitrispie
|
992fc5b628
|
Added McMaster University Institution
|
2022-11-03 11:02:18 +02:00 |
dimitrispie
|
7fda05e380
|
Added Autonomous University of Barcelona
|
2022-11-01 13:59:40 +02:00 |
Claudio Atzori
|
22873c9172
|
Merge pull request 'Added fields: totalcost, fundedamount, currency, in project table' (#257) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#257
|
2022-10-31 13:49:27 +01:00 |
dimitrispie
|
7861c472e0
|
Hive memory parameters
|
2022-10-28 19:00:32 +03:00 |
dimitrispie
|
5df9c63963
|
Added fields: totalcost, fundedamount, currency, in project table
|
2022-10-27 16:44:26 +03:00 |
Sandro La Bruzzo
|
2b9a20a4a3
|
Changed the way Scholexplorer filters the relationships. I found that filtering out all relations coming from OpenCitation is wrong, because we lose a lot of relations that intersect OpenCitation but don't come only from there
|
2022-10-24 12:53:47 +02:00 |
Alessia Bardi
|
208ed32315
|
fixed xpath for semantic relation
|
2022-10-23 18:18:13 +02:00 |
Alessia Bardi
|
ee759ac92d
|
file format after mvn compile
|
2022-10-23 18:09:47 +02:00 |
Alessia Bardi
|
31a10f000b
|
Map the field oaf:eoscifguidelines from mdstores. Currently we can find it in ROHub metadata
|
2022-10-23 18:05:37 +02:00 |
Claudio Atzori
|
ec39b84898
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-19 15:21:02 +02:00 |
Claudio Atzori
|
bca4a61710
|
suppressing hyper verbose spark logs during unit test execution
|
2022-10-19 15:20:58 +02:00 |
Sandro La Bruzzo
|
72f0d88d6c
|
formatted code
|
2022-10-19 14:18:42 +02:00 |
Claudio Atzori
|
9b449110c6
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-14 15:48:04 +02:00 |
Claudio Atzori
|
ae7cd0735a
|
[graph2hive] more partitions
|
2022-10-14 15:47:58 +02:00 |
Sandro La Bruzzo
|
135cf81151
|
Merge remote-tracking branch 'origin/beta' into beta
|
2022-10-13 11:47:25 +02:00 |
Sandro La Bruzzo
|
a1f94530a3
|
added documentation
|
2022-10-13 11:47:11 +02:00 |
Claudio Atzori
|
b47aaf4dd1
|
[cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword
|
2022-10-13 11:23:43 +02:00 |
Claudio Atzori
|
6163ecbf63
|
[cleaning] renamed parameters in wf action
|
2022-10-11 11:20:03 +02:00 |
Claudio Atzori
|
b301e9fdff
|
[cleaning] renamed action name/description
|
2022-10-11 11:08:52 +02:00 |
Claudio Atzori
|
ece40adc09
|
[cleaning] fixing NPE in the country cleaning phase
|
2022-10-11 10:10:20 +02:00 |
Claudio Atzori
|
d51275a965
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-07 09:52:49 +02:00 |
Claudio Atzori
|
8d97949316
|
[cleaning] fixed loop in wf nodes
|
2022-10-07 09:52:45 +02:00 |
Miriam Baglioni
|
a653e1b3ea
|
[Enrichment - result to community through organization] reimplementation of the data preparation step using spark
|
2022-10-04 15:01:28 +02:00 |
Miriam Baglioni
|
4d8339614b
|
Revert "[BipFinder] Fixed issue for wrong escaped char in doi"
This reverts commit 188f25eefa.
|
2022-10-04 14:29:47 +02:00 |
Miriam Baglioni
|
7324853a17
|
Revert "[BipFinder] refactoring"
This reverts commit 28dc317350.
|
2022-10-04 14:29:39 +02:00 |
Miriam Baglioni
|
28dc317350
|
[BipFinder] refactoring
|
2022-10-04 09:47:27 +02:00 |
Miriam Baglioni
|
188f25eefa
|
[BipFinder] Fixed issue for wrong escaped char in doi
|
2022-10-03 12:42:52 +02:00 |
Claudio Atzori
|
89f7007080
|
Merge pull request '[stats wf] misc changes' (#254) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#254
|
2022-10-03 10:32:05 +02:00 |
dimitrispie
|
2c0c3f1806
|
Cast amount to float for table result_apcs
|
2022-09-28 19:33:24 +03:00 |
Alessia Bardi
|
49360770d7
|
map w3id as instance url
|
2022-09-28 14:16:39 +02:00 |
dimitrispie
|
bdc46e3eaa
|
Remove denormalization of results to fix downloads numbers in monitor
|
2022-09-28 14:59:08 +03:00 |
dimitrispie
|
2ebb1459a9
|
Fixed type in no_downloads
|
2022-09-28 14:36:57 +03:00 |
Miriam Baglioni
|
b5b5a4c192
|
[CleanCountry] fixed issue
|
2022-09-28 12:42:51 +02:00 |
Miriam Baglioni
|
f1d7d45cf7
|
[BulkTag] fixed issue
|
2022-09-28 12:01:43 +02:00 |
Miriam Baglioni
|
3ec044600d
|
[BulkTag] fixed conflicts
|
2022-09-28 11:58:28 +02:00 |
Miriam Baglioni
|
1cb79719a7
|
[BulkTag] fixed issues
|
2022-09-28 11:44:55 +02:00 |
Claudio Atzori
|
f3f7604e6c
|
trying to fix a test that fails only on Jenkins
|
2022-09-27 15:21:37 +02:00 |
Claudio Atzori
|
3f90d159e3
|
code formatting
|
2022-09-27 15:08:00 +02:00 |
Claudio Atzori
|
0b3e44e521
|
Merge branch 'beta' into relation-from-odf
|
2022-09-27 14:57:01 +02:00 |
Claudio Atzori
|
57dbeb08d2
|
code formatting
|
2022-09-27 14:55:10 +02:00 |
Claudio Atzori
|
b60985cf68
|
Merge branch 'beta' into horizontalConstraints
|
2022-09-27 14:39:31 +02:00 |
Claudio Atzori
|
3b60642ef9
|
Merge pull request 'Synchronize indicators in stats-db with monitor-db' (#249) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#249
|
2022-09-27 14:37:33 +02:00 |
Claudio Atzori
|
25e9d92aad
|
Merge branch 'beta' into clean_country
|
2022-09-27 14:27:49 +02:00 |
Alessia Bardi
|
fd63e9bfac
|
Mapping all relationships supported in ModelConstants and ModelSupport
|
2022-09-26 11:24:13 +02:00 |
Miriam Baglioni
|
ca216a92ad
|
[BulkTagging] changed the query to the IS to insert values for FOS and SDG as subject in the configuration used for the tagging
|
2022-09-23 17:06:07 +02:00 |
Miriam Baglioni
|
3e6b0f58bb
|
[BulkTagging] changed the query to the IS to get also the information for the advancedConstraint from the profile
|
2022-09-23 16:47:19 +02:00 |
Miriam Baglioni
|
4a3e119b73
|
merging with branch beta
|
2022-09-23 16:16:06 +02:00 |
Miriam Baglioni
|
f0e303abf9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-09-23 16:15:32 +02:00 |
Miriam Baglioni
|
55da4d8715
|
[BulkTagging] modifying code to represent constraints horizontally on all the results. Added subject to the set of fields used to express the constraint. Modified resources to test the new approach. Modified test class
|
2022-09-23 16:02:19 +02:00 |
Alessia Bardi
|
c5eb722170
|
relationships from relatedIdentifier whose target id type is one of the pid types with an authority
|
2022-09-23 15:47:05 +02:00 |
Claudio Atzori
|
c86cc53520
|
suppressing hyper verbose spark logs during unit test execution
|
2022-09-23 15:20:40 +02:00 |
Alessia Bardi
|
ba33ff71fd
|
refactoring for the generation of relationships from related identifier of type 'OPENAIRE'
|
2022-09-23 15:17:13 +02:00 |
Alessia Bardi
|
982bcc1e35
|
test wrid pid and record identifier
|
2022-09-23 12:06:06 +02:00 |
Miriam Baglioni
|
960cb861a0
|
refactoring
|
2022-09-23 11:14:04 +02:00 |
Claudio Atzori
|
c42850328e
|
fixed semantic (subreltype) for ServiceOrganization relations
|
2022-09-22 16:23:25 +02:00 |
Miriam Baglioni
|
33bb79459e
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-09-22 15:55:17 +02:00 |
dimitrispie
|
dcd85f8cd7
|
- Synchronize indicators in stats-db with monitor-db
- added new openorg id for Nanyang Technological University
- changed openorg id for University of Helsinki #8088 ticket
|
2022-09-22 13:33:07 +03:00 |
Claudio Atzori
|
e45ec15221
|
Merge branch 'beta' into clean_country
|
2022-09-19 11:34:02 +02:00 |
Claudio Atzori
|
26e1badded
|
added instance.url syntactical validation, avoid creating multiple duplicated URLs
|
2022-09-19 11:19:10 +02:00 |
Miriam Baglioni
|
5240ac3d7b
|
[EOSC Tag] remove addition of eosc context for result with eosc if guidelines set
|
2022-09-19 11:02:18 +02:00 |
Claudio Atzori
|
192215a18e
|
merged from branch discard-non-wellformed
|
2022-09-19 10:17:10 +02:00 |
Claudio Atzori
|
e370e940d8
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-16 14:06:28 +02:00 |
Claudio Atzori
|
465e941214
|
Merge pull request '[stats wf] Changes to indicators tables' (#244) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#244
|
2022-09-16 10:13:58 +02:00 |
Claudio Atzori
|
1e42d984e1
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-15 10:49:42 +02:00 |
Alessia Bardi
|
9e7ec4198f
|
fixed test
|
2022-09-14 18:08:56 +02:00 |
Claudio Atzori
|
c48f6e9c57
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-14 17:11:26 +02:00 |
dimitrispie
|
3bf3127251
|
Changes to monitor and indicator scripts
|
2022-09-14 16:36:19 +03:00 |
Claudio Atzori
|
a0919ed495
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-14 13:27:39 +02:00 |
Alessia Bardi
|
b99a011345
|
return empty Oaf list if record cannot be parsed
|
2022-09-13 11:51:55 +02:00 |
Alessia Bardi
|
27af5122d2
|
logs for non well formed XML files
|
2022-09-12 14:25:23 +02:00 |
Claudio Atzori
|
ff6f789b6d
|
code formatting
|
2022-09-09 15:16:31 +02:00 |
Claudio Atzori
|
b5d6966c01
|
Merge branch 'beta' into clean_country
|
2022-09-09 12:20:19 +02:00 |
Claudio Atzori
|
b5f7bd30be
|
Merge branch 'beta' into clean_subjects
|
2022-09-09 12:20:04 +02:00 |
Alessia Bardi
|
f14107ad77
|
Merge branch 'handle_as_instance_urls' of https://code-repo.d4science.org/D-Net/dnet-hadoop into handle_as_instance_urls
|
2022-09-09 12:17:19 +02:00 |
Alessia Bardi
|
a539c6ccaf
|
https for handle URLs
|
2022-09-09 12:16:28 +02:00 |
dimitrispie
|
71b069ca90
|
Changes to indicator and monitor scripts
|
2022-09-09 13:15:58 +03:00 |
Claudio Atzori
|
1203378441
|
Merge branch 'beta' into clean_subjects
|
2022-09-09 10:38:47 +02:00 |
Claudio Atzori
|
14dc909a14
|
Merge branch 'beta' into clean_country
|
2022-09-09 10:38:17 +02:00 |
Claudio Atzori
|
853c996fa2
|
Merge branch 'beta' into handle_as_instance_urls
|
2022-09-09 09:47:16 +02:00 |
Claudio Atzori
|
a431e01383
|
Merge pull request 'orcid_multipleworks_download' (#242) from enrico.ottonello/dnet-hadoop:orcid_multipleworks_download into beta
Reviewed-on: D-Net/dnet-hadoop#242
|
2022-09-09 08:45:02 +02:00 |
Alessia Bardi
|
9ef063d502
|
#7861#note-8 instance url from handle
|
2022-09-07 17:29:54 +03:00 |
Alessia Bardi
|
5c45d52af3
|
testing for RiuNet
|
2022-09-07 15:40:57 +03:00 |
dimitrispie
|
2b5f8c9c9a
|
comment out duplicate table creation
|
2022-09-06 12:27:53 +03:00 |
Alessia Bardi
|
a11eb38065
|
testing for RO-Hub
|
2022-09-02 16:07:36 +02:00 |
Enrico Ottonello
|
bfdf2dc390
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid_multipleworks_download
|
2022-08-25 12:07:54 +02:00 |
Enrico Ottonello
|
da1cf561e6
|
alignment with beta
|
2022-08-25 11:57:20 +02:00 |
Enrico Ottonello
|
27445ccdaa
|
cleaned log
|
2022-08-25 11:56:14 +02:00 |
Claudio Atzori
|
b7c387c21f
|
cleaning of subjects: avoid duplicated subjects, prioritise collected vs inferred or other sources
|
2022-08-12 15:09:16 +02:00 |
Claudio Atzori
|
adb526b0e1
|
Merge branch 'beta' into clean_subjects
|
2022-08-12 10:51:17 +02:00 |
Claudio Atzori
|
cb7c07c54e
|
[scholix] added step to create tar archive
|
2022-08-11 11:25:24 +02:00 |
Claudio Atzori
|
2aa16d0432
|
[scholix] fixed OpenCitation dump procedure
|
2022-08-10 17:39:29 +02:00 |
Miriam Baglioni
|
7dbdd4a0fe
|
[Clean Country] changes related to D-Net/dnet-hadoop#241 (comment)
|
2022-08-10 15:13:10 +02:00 |
Claudio Atzori
|
51ad93e545
|
[scholix] fixed OpenCitation dump procedure
|
2022-08-10 11:57:56 +02:00 |
Miriam Baglioni
|
62d2138806
|
[Clean Context] slightly changed the logic. Added a check that the result is not hosted by a datasource of type institutional repository from NL. Also added a check that the country must have been included in the result via propagation for it to be removed
|
2022-08-08 14:10:47 +02:00 |
Claudio Atzori
|
3418ce50ac
|
cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary
|
2022-08-08 12:48:47 +02:00 |
Claudio Atzori
|
a78028dabc
|
Merge branch 'beta' into clean_subjects
|
2022-08-08 12:34:33 +02:00 |
Miriam Baglioni
|
390013a4b2
|
merging with branch beta
|
2022-08-08 12:30:31 +02:00 |
Claudio Atzori
|
3937ff04de
|
Merge branch 'beta' into tagEosc
|
2022-08-08 09:57:23 +02:00 |
Claudio Atzori
|
a4815f6bec
|
Merge branch 'beta' into clean_subjects
|
2022-08-05 16:57:03 +02:00 |
Claudio Atzori
|
29c4cde42e
|
Merge branch 'clean_subjects' of https://code-repo.d4science.org/D-Net/dnet-hadoop into clean_subjects
|
2022-08-05 16:56:37 +02:00 |
Claudio Atzori
|
4eaa063b1f
|
cleaning of subjects
|
2022-08-05 16:56:09 +02:00 |
Claudio Atzori
|
84598c7535
|
Merge pull request 'restored some collab indicators' (#240) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#240
|
2022-08-05 15:50:39 +02:00 |
Antonis Lempesis
|
fcef5294e2
|
restored some collab indicators
|
2022-08-05 13:45:01 +03:00 |
Claudio Atzori
|
844f6eb465
|
Merge branch 'beta' into clean_subjects
|
2022-08-05 12:39:05 +02:00 |