Claudio Atzori
|
4a4ca634f0
|
Merge pull request 'advConstraintsInBeta' (#288) from advConstraintsInBeta into master
Reviewed-on: D-Net/dnet-hadoop#288
|
2023-04-06 15:24:23 +02:00 |
Miriam Baglioni
|
932d07d2dd
|
[bulkTag] added filtering for datasources in eosctag
|
2023-04-06 15:08:27 +02:00 |
Miriam Baglioni
|
c6a7602b3e
|
refactoring after compilation
|
2023-04-06 14:45:01 +02:00 |
Miriam Baglioni
|
831055a1fc
|
change of the property for test purposes, addition of two new verbs, and fix of issue for advanced constraints
|
2023-04-06 14:41:32 +02:00 |
Miriam Baglioni
|
287753417d
|
better implementation for the fix
|
2023-04-06 12:22:38 +02:00 |
Miriam Baglioni
|
cf3d0f4f83
|
fixed issue on bulktagging for the advanced constraints
|
2023-04-06 12:17:35 +02:00 |
Miriam Baglioni
|
b42abc9904
|
fixed issue on bulktagging for the advanced constraints
|
2023-04-06 12:15:00 +02:00 |
dimitrispie
|
91e18ac7f4
|
Added project_organization_contribution table
|
2023-04-06 10:53:11 +03:00 |
Claudio Atzori
|
4f67225fbc
|
Merge pull request 'doiboostMappingExtention' (#286) from doiboostMappingExtention into master
Reviewed-on: D-Net/dnet-hadoop#286
|
2023-04-06 09:25:08 +02:00 |
Claudio Atzori
|
e093f04874
|
Merge pull request 'AdvancedConstraint' (#285) from advConstraintsInBeta into master
Reviewed-on: D-Net/dnet-hadoop#285
|
2023-04-06 09:24:54 +02:00 |
Miriam Baglioni
|
c5a9f39141
|
Extended the association project - result in the mapping from CrossRef
|
2023-04-05 16:48:36 +02:00 |
Miriam Baglioni
|
ecc05fe0f3
|
Added the code for the advancedConstraint implementation during the bulkTagging
|
2023-04-05 16:40:29 +02:00 |
Miriam Baglioni
|
b25b401065
|
added test to verify the advconstraints to dth community. inserted some additional logs.
|
2023-04-05 12:18:39 +02:00 |
Claudio Atzori
|
864f4051d3
|
[graph cleaning] added missing case
|
2023-04-05 11:35:47 +02:00 |
Claudio Atzori
|
dead87917f
|
[graph cleaning] cleanup
|
2023-04-04 13:13:43 +02:00 |
Claudio Atzori
|
2a6ba29b64
|
[graph cleaning] unit tests & cleanup
|
2023-04-04 12:34:51 +02:00 |
dimitrispie
|
9e1335df4c
|
-Added Technological University Dublin
-Added project_organization_contribution table
|
2023-04-04 13:22:40 +03:00 |
Claudio Atzori
|
63b8bbc015
|
[graph to Solr] using dedicated sparkExecutorCores, sparkExecutorMemory, sparkDriverMemory in convert_to_xml
|
2023-03-24 13:43:20 +01:00 |
Claudio Atzori
|
b502f86523
|
fixed input path supplemented to GetDatasourceFromCountry; adjusted the various spark.sql.shuffle.partitions
|
2023-03-24 13:09:12 +01:00 |
Claudio Atzori
|
c07857fa37
|
[graph cleaning] unit tests & cleanup
|
2023-03-23 15:57:47 +01:00 |
Claudio Atzori
|
90e61a8aba
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 15:03:26 +01:00 |
Claudio Atzori
|
308e10d102
|
serialising: 1. measures for all the entity types and 2. result level fulltext
|
2023-03-23 11:23:22 +01:00 |
Claudio Atzori
|
488d9a5eaa
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 10:41:13 +01:00 |
dimitrispie
|
fad7fa4af8
|
Added Technological University Dublin
|
2023-03-22 09:44:00 +02:00 |
Serafeim Chatzopoulos
|
102aa5ab81
|
Add dependency to dhp-aggregation
|
2023-03-21 19:25:29 +02:00 |
Serafeim Chatzopoulos
|
3e8a4cf952
|
Rearrange resources folder structure
|
2023-03-21 18:25:55 +02:00 |
Serafeim Chatzopoulos
|
f992ecb657
|
Checkout BIP-Ranker during 'prepare-package' && add it in the oozie-package.tar.gz
|
2023-03-21 18:03:55 +02:00 |
Ilias Kanellos
|
9dc8f0f05f
|
Add ActionSet step
|
2023-03-21 16:14:15 +02:00 |
Claudio Atzori
|
4f5ba0ed52
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-21 14:41:20 +01:00 |
Ilias Kanellos
|
b5c252865c
|
Add filtering based on citation source
|
2023-03-20 15:38:36 +02:00 |
Claudio Atzori
|
6d3d18d8b5
|
[graph cleaning] WIP: refactoring of the cleaning stages
|
2023-03-16 17:23:36 +01:00 |
dimitrispie
|
43b23a9bf3
|
Update step20-createMonitorDB.sql
Added Technological University Dublin
|
2023-03-15 09:57:12 +02:00 |
Serafeim Chatzopoulos
|
720fd19b39
|
Add dhp-impact-indicators workflow files
|
2023-03-14 19:28:27 +02:00 |
Serafeim Chatzopoulos
|
c6e39b7f33
|
Add dhp-impact-indicators
|
2023-03-14 18:50:54 +02:00 |
Claudio Atzori
|
518618f1a9
|
[graph cleaning] avoid overwriting the subject class to 'keyword' for those with provenance 'subject:fos'
|
2023-03-14 15:22:47 +01:00 |
Claudio Atzori
|
41e00bcd07
|
[graph provision] avoid parsing the XML records again; apparently the escaped XML characters get unescaped, invalidating the record
|
2023-03-13 15:19:49 +01:00 |
Claudio Atzori
|
24e2fd828b
|
code formatting
|
2023-03-08 21:17:08 +01:00 |
Claudio Atzori
|
e28d395e87
|
[aggregator graph] using dedicated path to sync claims, adjusted paths with wildcards
|
2023-03-08 21:16:52 +01:00 |
Claudio Atzori
|
5b8fd37314
|
[aggregator graph] using dedicated path to sync claims
|
2023-03-08 15:28:14 +01:00 |
Claudio Atzori
|
7fd89566c2
|
[aggregator graph] handle paths including wildcards
|
2023-03-08 12:43:00 +01:00 |
Miriam Baglioni
|
588aca5ce4
|
Merge pull request 'h2020classification' (#280) from h2020classification into beta
Reviewed-on: D-Net/dnet-hadoop#280
|
2023-03-03 09:29:10 +01:00 |
Claudio Atzori
|
8ec0d62d91
|
pre-group the records in each table before joining the contents from BETA and PROD together
|
2023-03-02 14:49:19 +01:00 |
Miriam Baglioni
|
0fff98a14c
|
[ECclassification] removed print
|
2023-03-02 11:46:57 +01:00 |
Miriam Baglioni
|
b0c2f7e526
|
[ECclassification] removed not needed resources
|
2023-03-02 11:44:48 +01:00 |
Miriam Baglioni
|
d4fc62c2f6
|
merging with branch beta
|
2023-03-02 11:14:54 +01:00 |
Miriam Baglioni
|
de8ad1caef
|
[ECclassification] new implementation for the H2020 classification
|
2023-03-02 11:14:03 +01:00 |
Claudio Atzori
|
db9dad4aa7
|
[actionmanager] increased spark.sql.shuffle.partitions for publication, dataset, relation records
|
2023-03-02 09:11:37 +01:00 |
Miriam Baglioni
|
c1f9848953
|
[ECclassification] added new classes
|
2023-03-01 15:29:11 +01:00 |
Claudio Atzori
|
6f488547a7
|
ignore non processable records
|
2023-03-01 14:49:51 +01:00 |
Claudio Atzori
|
7d263f265e
|
adjusted logs
|
2023-03-01 11:58:07 +01:00 |
Claudio Atzori
|
16ad42e8f3
|
code formatting
|
2023-03-01 10:22:13 +01:00 |
Claudio Atzori
|
9c59dac859
|
followup changes reorganising the mdstore synchronisation mechanism
|
2023-03-01 10:16:20 +01:00 |
Miriam Baglioni
|
ad745c0aa3
|
[CrossrefFunderMapping] fixed issues on funder name
|
2023-02-28 14:58:27 +01:00 |
Miriam Baglioni
|
4f2df876cd
|
[ECclassification] new implementation first try
|
2023-02-28 14:44:00 +01:00 |
Claudio Atzori
|
2f7346e9cf
|
WIP monodirectional citations, Datacite
|
2023-02-28 13:30:51 +01:00 |
Claudio Atzori
|
0559d8b412
|
WIP monodirectional citations
|
2023-02-28 10:57:32 +01:00 |
Sandro La Bruzzo
|
69fa616490
|
removed wrong content
|
2023-02-28 10:27:38 +01:00 |
Sandro La Bruzzo
|
832a75d012
|
added mapping for crossref funder
|
2023-02-28 10:16:34 +01:00 |
Sandro La Bruzzo
|
78e51c182a
|
Added missing parameter to raw all workflow
|
2023-02-28 10:16:01 +01:00 |
Claudio Atzori
|
7aebedb43c
|
code formatting
|
2023-02-27 11:51:27 +01:00 |
Miriam Baglioni
|
80987801d7
|
[FoS] added check for null on level1 subject
|
2023-02-27 11:40:22 +01:00 |
Claudio Atzori
|
31e97c2a6b
|
[unresolved entities] updated oozie wf node labels
|
2023-02-27 11:38:29 +01:00 |
Miriam Baglioni
|
23112929e9
|
[FoS] changed the default separator from comma to tab to solve the issue in subject value split
|
2023-02-27 10:18:39 +01:00 |
Serafeim Chatzopoulos
|
0b5bf53b45
|
Remove unnecessary indexed fields from Solr
|
2023-02-23 12:42:42 +02:00 |
dimitrispie
|
1547611246
|
Merge branch 'beta' into hive
|
2023-02-22 16:57:12 +02:00 |
Michele Artini
|
fddcf701e9
|
updated the order of the compatibilities
|
2023-02-22 12:07:09 +01:00 |
Michele Artini
|
200098b683
|
updated the order of the compatibilities
|
2023-02-22 11:52:59 +01:00 |
Claudio Atzori
|
0c1be41b30
|
code formatting
|
2023-02-22 10:15:25 +01:00 |
Claudio Atzori
|
99cd7761aa
|
cleanup of non necessary dhp-monitor-update workflow
|
2023-02-22 10:10:22 +01:00 |
Claudio Atzori
|
cd3a51a15f
|
Merge branch 'beta' into 8232-mdstore-synch-improve
|
2023-02-22 09:57:07 +01:00 |
Claudio Atzori
|
477a7c416f
|
Merge branch 'beta' into UsageCountOnProjectAndDatasource
|
2023-02-22 09:55:51 +01:00 |
Claudio Atzori
|
c20c1c9159
|
Merge pull request 'Added 4 institutions:' (#261) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#261
|
2023-02-22 09:53:45 +01:00 |
Miriam Baglioni
|
d617c3e812
|
[DOIBoost] extended mapping for funder #8407
|
2023-02-20 14:45:27 +01:00 |
dimitrispie
|
90807b60c7
|
Changes to monitor wf
|
2023-02-20 10:42:24 +02:00 |
dimitrispie
|
d2f9ccf934
|
Changes to separate monitor wf
|
2023-02-20 10:41:21 +02:00 |
dimitrispie
|
032a401cbf
|
Bug fixes
|
2023-02-20 09:29:20 +02:00 |
Miriam Baglioni
|
016337a0f9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-02-16 15:54:59 +01:00 |
Sandro La Bruzzo
|
118c1fc3b3
|
Merge remote-tracking branch 'origin/beta' into beta
|
2023-02-15 10:29:28 +01:00 |
Sandro La Bruzzo
|
a8ac79fa25
|
Added citation relation on crossref Mapping
|
2023-02-15 10:29:13 +01:00 |
dimitrispie
|
595192d510
|
Bug fix
|
2023-02-14 16:24:08 +02:00 |
dimitrispie
|
f3aaff3688
|
Remove duplicate orgs
|
2023-02-14 09:48:36 +02:00 |
Claudio Atzori
|
9a03f71db1
|
code formatting
|
2023-02-13 16:25:47 +01:00 |
Michele Artini
|
554df257ab
|
null values in date range conditions
|
2023-02-13 16:15:32 +01:00 |
Michele Artini
|
9c1df15071
|
null values in date range conditions
|
2023-02-13 16:05:58 +01:00 |
Miriam Baglioni
|
32870339f5
|
refactoring after compile
|
2023-02-13 13:06:48 +01:00 |
Miriam Baglioni
|
7184cc0804
|
[FoS] added check for null on level1 subject
|
2023-02-13 13:03:49 +01:00 |
dimitrispie
|
3400133c2f
|
Bug fix
|
2023-02-13 09:44:00 +02:00 |
dimitrispie
|
935db0ab25
|
Added organizations for Monitor
|
2023-02-13 09:29:09 +02:00 |
dimitrispie
|
7b78b15c81
|
Changes for copying to Impala Cluster
|
2023-02-13 09:27:00 +02:00 |
Miriam Baglioni
|
5cf902a2b0
|
[UsageCount] changed query to make the sum be computed via sql instead of grouping
|
2023-02-10 16:16:37 +01:00 |
Miriam Baglioni
|
f803530df6
|
[UsageCount] fixed query
|
2023-02-10 15:50:56 +01:00 |
Miriam Baglioni
|
7473093c84
|
[FoS] changed the default separator from comma to tab to solve the issue in subject value split
|
2023-02-10 15:34:52 +01:00 |
Miriam Baglioni
|
bb5bba51b3
|
[UsageCount] extended test
|
2023-02-09 19:08:30 +01:00 |
Miriam Baglioni
|
85e53fad00
|
[UsageCount] addition of usagecount for Projects and datasources. Extension of the action set created for the results with new entities for projects and datasources. Extension of the resource set and modification of the testing class
|
2023-02-09 18:59:45 +01:00 |
dimitrispie
|
d71f5672d3
|
Add monitor post step
|
2023-02-09 13:44:14 +02:00 |
dimitrispie
|
35ba8bb328
|
Bug fixes
|
2023-02-09 12:57:57 +02:00 |
Sandro La Bruzzo
|
8920932dd8
|
Code formatted
|
2023-02-08 11:34:18 +01:00 |
Sandro La Bruzzo
|
0b9819f1ab
|
Code formatted
|
2023-02-08 10:32:33 +01:00 |
Sandro La Bruzzo
|
6c81a161d2
|
Merge remote-tracking branch 'origin/beta' into 8231-mdstore-synch-improve
|
2023-02-08 10:29:09 +01:00 |
dimitrispie
|
3ba11d64a1
|
Changes 07022023
|
2023-02-07 12:53:51 +02:00 |
dimitrispie
|
98c34263ed
|
Update step20-createMonitorDB.sql
Add University of Cape Town organization
|
2023-02-07 08:14:48 +02:00 |
dimitrispie
|
2dc6d47270
|
Changes 06022023
|
2023-02-06 13:18:53 +02:00 |
Miriam Baglioni
|
5f0906be60
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2023-02-02 17:13:14 +01:00 |
dimitrispie
|
973d78a4d6
|
Update step15_5.sql
Added unpaywalls open access colors
|
2023-02-02 08:03:54 +02:00 |
Claudio Atzori
|
d05ca53a14
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-31 14:39:53 +01:00 |
Miriam Baglioni
|
e82e009b46
|
added missing close tag for XML produced by the xquery to get information for the community from the IS
|
2023-01-31 10:19:34 +01:00 |
Miriam Baglioni
|
b254a0375f
|
[Affiliation from institutionalrepo] changed the field to check to verify the datasource type. Now it is in the field jurisdiction
|
2023-01-26 16:51:20 +01:00 |
dimitrispie
|
cf58e4a5e4
|
Added Arts et Métiers ParisTech
|
2023-01-25 16:03:16 +02:00 |
dimitrispie
|
db7d625ba9
|
Added Arts et Métiers ParisTech organization
|
2023-01-25 12:22:21 +02:00 |
Claudio Atzori
|
505867bce9
|
[bulk tagging] better node naming
|
2023-01-20 16:13:16 +01:00 |
Claudio Atzori
|
1b37516578
|
[bulk tagging] better node naming
|
2023-01-20 16:11:26 +01:00 |
Miriam Baglioni
|
ecd398fe51
|
refactoring
|
2023-01-20 14:23:45 +01:00 |
Claudio Atzori
|
c1e2460293
|
[cleaning] the datasource master-duplicate fixup should not be brought to production yet
|
2023-01-20 09:20:26 +01:00 |
Claudio Atzori
|
3800361033
|
[country propagation] fixes error 'cannot resolve countrySet given input columns: []' when there is no prepared information driving the propagation process for a given result type
|
2023-01-19 15:57:43 +01:00 |
Miriam Baglioni
|
0a5c6010b0
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-13 16:14:46 +01:00 |
dimitrispie
|
4d7553c9f1
|
Bug fixes
|
2023-01-12 17:19:19 +02:00 |
dimitrispie
|
dd70c32ad7
|
Bug fixes
|
2023-01-12 17:18:05 +02:00 |
dimitrispie
|
51f7ab5864
|
Bug fixes
|
2023-01-12 17:15:06 +02:00 |
dimitrispie
|
34d4bf727c
|
Bug fixes
|
2023-01-12 11:28:37 +02:00 |
dimitrispie
|
43f6d4f296
|
-Monitor DB workflow
|
2023-01-12 11:26:47 +02:00 |
dimitrispie
|
686580a220
|
- New Monitor DB workflow
- New Organization added
|
2023-01-12 11:18:03 +02:00 |
Claudio Atzori
|
0a58bc7ba7
|
[broker] prevent NPEs
|
2023-01-11 14:44:14 +01:00 |
Michele Artini
|
699736addc
|
NPE prevention
|
2023-01-11 13:14:44 +01:00 |
Claudio Atzori
|
04cb96001c
|
[broker] d40e20f437 adapted to the beta graph model
|
2023-01-11 10:10:12 +01:00 |
Michele Artini
|
91b845f611
|
Considering instance pids and alternative identifiers
|
2023-01-11 09:58:54 +01:00 |
Claudio Atzori
|
f86e19b282
|
code formatting
|
2023-01-11 09:53:19 +01:00 |
Miriam Baglioni
|
1f367122e4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-11 09:47:44 +01:00 |
Michele Artini
|
d40e20f437
|
Considering instance pids and alternative identifiers
|
2023-01-11 09:37:34 +01:00 |
Michele Artini
|
7b7520850b
|
fixed an invalid char
|
2023-01-11 09:22:18 +01:00 |
Michele Artini
|
4953ae5649
|
fixed an invalid char
|
2023-01-11 08:35:53 +01:00 |
Miriam Baglioni
|
d6895f0387
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-01-09 17:28:38 +01:00 |
Miriam Baglioni
|
c60d3a2b46
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2023-01-09 17:28:27 +01:00 |
dimitrispie
|
becb242c17
|
Monitor DB only Workflow
|
2023-01-04 16:50:29 +02:00 |
dimitrispie
|
dcb958e146
|
Changes to execute the stats wf only in hive
|
2023-01-04 11:39:01 +02:00 |
dimitrispie
|
592013d5dd
|
Added more steps in decision node
|
2022-12-23 09:43:16 +02:00 |
dimitrispie
|
2a4bf32d4c
|
Merge branch 'hive' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into hive
# Conflicts:
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step10.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step13.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step14.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step16_1-definitions.sql
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step7.sql
|
2022-12-22 10:22:46 +02:00 |
dimitrispie
|
6449ff4207
|
1. Added a decision node that enables the workflow to select the execution path to follow
2. Added new organization
3. Added 5 new tables from Eurostat
|
2022-12-22 10:18:21 +02:00 |
Miriam Baglioni
|
b713132db7
|
[Cleaning] adding missing classes
|
2022-12-21 12:49:08 +01:00 |
Miriam Baglioni
|
8893389895
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-12-21 12:42:27 +01:00 |
Miriam Baglioni
|
11f2b470d3
|
[Cleaning] adding missing classes
|
2022-12-21 12:42:19 +01:00 |
Antonis Lempesis
|
c8309fe18e
|
added command line params to allow hive actions to run
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
028873cc51
|
added new hive opts
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
1ddea4f442
|
removed 'stored as parquet' from views..
|
2022-12-21 12:41:33 +02:00 |
Antonis Lempesis
|
2754c3dd62
|
moving data to impala cluster and creating shadow databases there
|
2022-12-21 12:41:29 +02:00 |
Antonis Lempesis
|
778a1a724f
|
finished migration to hive only
|
2022-12-21 12:41:25 +02:00 |
Antonis Lempesis
|
e84dd5fe26
|
first
|
2022-12-21 12:41:23 +02:00 |
Sandro La Bruzzo
|
3c9826f186
|
updated lines function to its implementation linesWithSeparators.map(l => l.stripLineEnd); in this way we force the Scala plugin compiler to treat this pipeline as Scala code and not as a java.string.lines() pipeline
|
2022-12-21 11:21:17 +01:00 |
Sandro La Bruzzo
|
91c70b15a5
|
updated lines function to its implementation linesWithSeparators.map(l => l.stripLineEnd); in this way we force the Scala plugin compiler to treat this pipeline as Scala code and not as a java.string.lines() pipeline
|
2022-12-21 11:14:42 +01:00 |
Claudio Atzori
|
f910b7379d
|
[cleaning] recovering missing resources from D-Net/dnet-hadoop#265
|
2022-12-21 09:26:34 +01:00 |
Claudio Atzori
|
33bdad104e
|
[cleaning] align parameter names
|
2022-12-20 21:43:59 +01:00 |
Claudio Atzori
|
6aa91204a5
|
[orcid propagation] skip empty directories
|
2022-12-20 14:15:46 +01:00 |
Claudio Atzori
|
5816ded93f
|
code formatting
|
2022-12-20 10:41:40 +01:00 |
Claudio Atzori
|
46972f8393
|
[orcid propagation] skip empty directory
|
2022-12-20 10:28:22 +01:00 |
Miriam Baglioni
|
059e100ec7
|
[Clean Country] moving other resources for testing purposes
|
2022-12-16 15:48:21 +01:00 |
Miriam Baglioni
|
fc95a550c3
|
[Clean Country] moving other resources for testing purposes
|
2022-12-16 15:46:32 +01:00 |
Miriam Baglioni
|
6901ac91b1
|
[Clean Country] moving source and resources to master
|
2022-12-16 15:42:49 +01:00 |
Miriam Baglioni
|
6674cccb94
|
[BulkTag] description of parameters more comprehensive for those who do not implement it
|
2022-12-16 15:33:20 +01:00 |
Miriam Baglioni
|
f37113a941
|
[BulkTag] moving xquery to get community configuration in dedicated file
|
2022-12-16 15:32:26 +01:00 |
Miriam Baglioni
|
8685eaa706
|
[Clean Country] added test to verify remove of country
|
2022-12-16 15:31:25 +01:00 |
Miriam Baglioni
|
dc0ec88a58
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-12-16 13:18:32 +01:00 |
Miriam Baglioni
|
d791840b82
|
[Clean Country] added test to verify remove of country:
|
2022-12-16 13:18:29 +01:00 |
Claudio Atzori
|
7b80b24f82
|
[cleaning] country cleaning must use both PID and AlternateIdentifier fields
|
2022-12-15 14:49:04 +01:00 |
Claudio Atzori
|
b8bafab8a0
|
[cleaning] improved vocabulary based mapping, specialization for the strict vocab cleaning
|
2022-12-12 14:43:03 +01:00 |
Sandro La Bruzzo
|
5e4866d033
|
implemented synch for single mdstore
|
2022-12-12 11:29:46 +01:00 |
Claudio Atzori
|
c18b8048c3
|
[cleaning] avoid NPE
|
2022-12-10 11:41:38 +01:00 |
Claudio Atzori
|
8b44afe5e5
|
[cleaning] avoid NPE
|
2022-12-09 15:44:57 +01:00 |
Claudio Atzori
|
389dd25430
|
[cleaning] avoid NPE
|
2022-12-08 18:40:48 +01:00 |
Claudio Atzori
|
730228d73d
|
[cleaning] align wf parameter names in test
|
2022-12-08 18:40:22 +01:00 |
Claudio Atzori
|
2094fa6db0
|
[cleaning] align wf parameter names
|
2022-12-08 17:22:26 +01:00 |
Miriam Baglioni
|
a485a94956
|
[Cleaning] fixed parameter name in property file
|
2022-12-08 16:59:34 +01:00 |
Miriam Baglioni
|
3d99b78d94
|
[Cleaning] fixed error in parameter (workingPath to workingDir)
|
2022-12-08 10:25:02 +01:00 |
Claudio Atzori
|
08c4588d47
|
Merge pull request 'Changes from beta stats wf to prod' (#264) from antonis.lempesis/dnet-hadoop:beta into master
Reviewed-on: D-Net/dnet-hadoop#264
|
2022-12-07 15:56:22 +01:00 |
Claudio Atzori
|
1b8488976b
|
code formatting
|
2022-12-07 10:45:38 +01:00 |
Claudio Atzori
|
cd1b58483e
|
[bulk tag] fixed Community configuration parsing to avoid NPE
|
2022-12-07 10:39:00 +01:00 |
Claudio Atzori
|
062abfd669
|
fixed NPE, removed unused stuff
|
2022-12-06 12:04:00 +01:00 |
dimitrispie
|
2a52a42169
|
Added 4 institutions:
-University of Modena and Reggio Emilia
-Bilkent University
-Saints Cyril and Methodius University of Skopje
-University of Milan
|
2022-12-06 10:10:21 +02:00 |
Claudio Atzori
|
8248da40d9
|
Merge branch 'beta' into graph_cleaning
|
2022-12-02 14:49:00 +01:00 |
Claudio Atzori
|
ddf065756f
|
Merge pull request 'Two organizations are added for monitor' (#258) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#258
|
2022-12-02 14:45:27 +01:00 |
Sandro La Bruzzo
|
5a48a2fb18
|
implemented synch for single mdstore
|
2022-12-01 11:34:43 +01:00 |
Claudio Atzori
|
a38116546d
|
Merge branch 'beta' into deduptesting
|
2022-11-30 11:27:29 +01:00 |
Miriam Baglioni
|
ce020f2c83
|
[EOSC FUTURE] added resources and test for review
|
2022-11-30 09:57:30 +01:00 |
Miriam Baglioni
|
bb0ddc1c44
|
[BulkTag] adding verb starts_with
|
2022-11-30 09:56:24 +01:00 |
Claudio Atzori
|
8e3edba318
|
[graph cleaning] testing the collectedfrom and hostedby patch procedure
|
2022-11-29 16:07:09 +01:00 |
Claudio Atzori
|
58c05731f9
|
[graph cleaning] WIP: testing the collectedfrom and hostedby patch procedure
|
2022-11-29 11:21:51 +01:00 |
Miriam Baglioni
|
9c70c5dbd6
|
[Bulk Tag horizontal] added new path in definition of constraint (to recognize fos subjects) - changed test and resource class to test this new aspect
|
2022-11-28 14:51:20 +01:00 |
Miriam Baglioni
|
0628df7a3a
|
resolving conflicts
|
2022-11-28 10:44:56 +01:00 |
Claudio Atzori
|
11695ba649
|
[graph cleaning] patch also the result's collectedfrom and hostedby datasource name according to the datasource master-duplicate mapping
|
2022-11-28 10:18:43 +01:00 |
Claudio Atzori
|
6082d235d3
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning
|
2022-11-28 09:54:48 +01:00 |
Claudio Atzori
|
24ef301cc1
|
[graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping
|
2022-11-28 09:54:18 +01:00 |
Miriam Baglioni
|
29d3da85f1
|
[EOSC DUMP] added resources needed for the review as test
|
2022-11-25 17:16:20 +01:00 |
Alessia Bardi
|
90c8f9cb61
|
tests for EOSC Future
|
2022-11-23 12:18:44 +01:00 |
Miriam Baglioni
|
33a2b1b5dc
|
[Bulk Tag] fixed typo in test configuration
|
2022-11-23 11:31:17 +01:00 |
Miriam Baglioni
|
c6df8327b3
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2022-11-23 11:26:57 +01:00 |
Miriam Baglioni
|
0e3edc5018
|
[Bulk Tag] fixed issue in verb name
|
2022-11-23 11:26:36 +01:00 |
Miriam Baglioni
|
935aa367d8
|
[BulkTag] removed commented code
|
2022-11-23 11:16:39 +01:00 |
Miriam Baglioni
|
43aedbdfe5
|
[BulkTag] changed verb name in configuration
|
2022-11-23 11:14:23 +01:00 |
Miriam Baglioni
|
b6da9b67ff
|
[BulkTag] fixed typo in annotation for verb name
|
2022-11-23 11:13:58 +01:00 |
Claudio Atzori
|
a79c47522d
|
updated ORCID datasource identifier
|
2022-11-23 10:17:49 +01:00 |
Alessia Bardi
|
2832117f23
|
added eoscifguidelines in test
|
2022-11-22 18:01:12 +01:00 |
Alessia Bardi
|
3c08269a4d
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-11-22 17:31:00 +01:00 |
Alessia Bardi
|
2687fc9f73
|
tests for EOSC Future review - ROhub
|
2022-11-22 17:30:56 +01:00 |
Claudio Atzori
|
a34c8b6f81
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2022-11-22 10:22:31 +01:00 |
Claudio Atzori
|
1d5143b0b6
|
Merge branch 'beta' into deduptesting
|
2022-11-22 10:21:30 +01:00 |
Miriam Baglioni
|
122e75aa17
|
fixed conflicts
|
2022-11-21 18:13:12 +01:00 |
Miriam Baglioni
|
cee7a45b1d
|
[Bulk Tag Datasource] fixed issue with verb name and add new test for neanias selection for orcid
|
2022-11-21 18:10:20 +01:00 |
Claudio Atzori
|
ed64618235
|
increased spark.sql.shuffle.partitions in the last join phase of the result (publication) to community through semantic relation propagation
|
2022-11-18 16:06:51 +01:00 |
Claudio Atzori
|
8742934843
|
added spark.sql.shuffle.partitions in the last join phase of the result to community through semantic relation propagation
|
2022-11-18 11:32:22 +01:00 |
Claudio Atzori
|
0aa725083f
|
extended dedup testing
|
2022-11-17 16:13:43 +01:00 |
Claudio Atzori
|
3dbc637d3e
|
code formatting
|
2022-11-17 09:55:41 +01:00 |
Claudio Atzori
|
13cc592f39
|
code formatting
|
2022-11-15 09:37:57 +01:00 |
Claudio Atzori
|
af15b1e48d
|
[eosc tag] extending criteria for Jupyter Notebook (adding to ORP the same constraint)
|
2022-11-14 18:30:43 +01:00 |
Claudio Atzori
|
eb45ba7af0
|
extended mapping from ODF relations (PR#251)
|
2022-11-14 18:26:13 +01:00 |
Claudio Atzori
|
a929dc5fee
|
integrated changes for mapping ROHub contents in the Graph
|
2022-11-14 18:15:35 +01:00 |
Claudio Atzori
|
ddff0e8999
|
merging duplicates using IdentifierComparator
|
2022-11-11 16:10:25 +01:00 |
Miriam Baglioni
|
5f9383b2d9
|
[EOSC TAG] remove redundant check for jupyter notebook
|
2022-11-11 14:06:19 +01:00 |
Miriam Baglioni
|
b18bbca8af
|
[EOSC TAG] adding search in orp for jupyter notebook criteria
|
2022-11-11 12:42:58 +01:00 |
Claudio Atzori
|
5af5a8ae42
|
added IdentifierComparator
|
2022-11-09 14:20:59 +01:00 |
Claudio Atzori
|
7c3390ac10
|
Merge branch 'beta' into eoscifguidelines-from-mdstores
|
2022-11-07 12:18:40 +01:00 |
dimitrispie
|
55fa3b2a17
|
Hive memory parameters
|
2022-11-03 15:21:04 +01:00 |
dimitrispie
|
992fc5b628
|
Added McMaster University Institution
|
2022-11-03 11:02:18 +02:00 |
dimitrispie
|
7fda05e380
|
Added Autonomous University of Barcelona
|
2022-11-01 13:59:40 +02:00 |
Claudio Atzori
|
22873c9172
|
Merge pull request 'Added fields: totalcost, fundedamount, currency, in project table' (#257) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#257
|
2022-10-31 13:49:27 +01:00 |
dimitrispie
|
7861c472e0
|
Hive memory parameters
|
2022-10-28 19:00:32 +03:00 |
dimitrispie
|
5df9c63963
|
Added fields: totalcost, fundedamount, currency, in project table
|
2022-10-27 16:44:26 +03:00 |
Sandro La Bruzzo
|
2b9a20a4a3
|
Changed the way Scholexplorer filters the relationships: filtering out all relations coming from OpenCitation is wrong, because we lose many relations that intersect OpenCitation but do not come only from there
|
2022-10-24 12:53:47 +02:00 |
Alessia Bardi
|
208ed32315
|
fixed xpath for semantic relation
|
2022-10-23 18:18:13 +02:00 |
Alessia Bardi
|
ee759ac92d
|
file format after mvn compile
|
2022-10-23 18:09:47 +02:00 |
Alessia Bardi
|
31a10f000b
|
Map the field oaf:eoscifguidelines from mdstores. Currently we can find it in ROHub metadata
|
2022-10-23 18:05:37 +02:00 |
Claudio Atzori
|
ec39b84898
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-19 15:21:02 +02:00 |
Claudio Atzori
|
bca4a61710
|
suppressing hyper verbose spark logs during unit test execution
|
2022-10-19 15:20:58 +02:00 |
Sandro La Bruzzo
|
72f0d88d6c
|
formatted code
|
2022-10-19 14:18:42 +02:00 |
Claudio Atzori
|
9b449110c6
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-14 15:48:04 +02:00 |
Claudio Atzori
|
ae7cd0735a
|
[graph2hive] more partitions
|
2022-10-14 15:47:58 +02:00 |
Sandro La Bruzzo
|
135cf81151
|
Merge remote-tracking branch 'origin/beta' into beta
|
2022-10-13 11:47:25 +02:00 |
Sandro La Bruzzo
|
a1f94530a3
|
added documentation
|
2022-10-13 11:47:11 +02:00 |
Claudio Atzori
|
b47aaf4dd1
|
[cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword
|
2022-10-13 11:23:43 +02:00 |
Claudio Atzori
|
6163ecbf63
|
[cleaning] renamed parameters in wf action
|
2022-10-11 11:20:03 +02:00 |
Claudio Atzori
|
b301e9fdff
|
[cleaning] renamed action name/description
|
2022-10-11 11:08:52 +02:00 |
Claudio Atzori
|
ece40adc09
|
[cleaning] fixing NPE in the country cleaning phase
|
2022-10-11 10:10:20 +02:00 |
Claudio Atzori
|
d51275a965
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-10-07 09:52:49 +02:00 |
Claudio Atzori
|
8d97949316
|
[cleaning] fixed loop in wf nodes
|
2022-10-07 09:52:45 +02:00 |
Miriam Baglioni
|
a653e1b3ea
|
[Enrichment - result to community through organization] reimplementation of the data preparation step using spark
|
2022-10-04 15:01:28 +02:00 |
Miriam Baglioni
|
4d8339614b
|
Revert "[BipFinder] Fixed issue for wrong escaped char in doi"
This reverts commit 188f25eefa.
|
2022-10-04 14:29:47 +02:00 |
Miriam Baglioni
|
7324853a17
|
Revert "[BipFinder] refactoring"
This reverts commit 28dc317350.
|
2022-10-04 14:29:39 +02:00 |
Miriam Baglioni
|
28dc317350
|
[BipFinder] refactoring
|
2022-10-04 09:47:27 +02:00 |
Miriam Baglioni
|
188f25eefa
|
[BipFinder] Fixed issue for wrong escaped char in doi
|
2022-10-03 12:42:52 +02:00 |
Claudio Atzori
|
89f7007080
|
Merge pull request '[stats wf] misc changes' (#254) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#254
|
2022-10-03 10:32:05 +02:00 |
dimitrispie
|
2c0c3f1806
|
Cast amount to float for table result_apcs
|
2022-09-28 19:33:24 +03:00 |
Alessia Bardi
|
49360770d7
|
map w3id as instance url
|
2022-09-28 14:16:39 +02:00 |
dimitrispie
|
bdc46e3eaa
|
Remove denormalization of results to fix downloads numbers in monitor
|
2022-09-28 14:59:08 +03:00 |
dimitrispie
|
2ebb1459a9
|
Fixed type in no_downloads
|
2022-09-28 14:36:57 +03:00 |
Miriam Baglioni
|
b5b5a4c192
|
[CleanCountry] fixed issue
|
2022-09-28 12:42:51 +02:00 |
Miriam Baglioni
|
f1d7d45cf7
|
[BulkTag] fixed issue
|
2022-09-28 12:01:43 +02:00 |
Miriam Baglioni
|
3ec044600d
|
[BulkTag] fixed conflicts
|
2022-09-28 11:58:28 +02:00 |
Miriam Baglioni
|
1cb79719a7
|
[BulkTag] fixed issues
|
2022-09-28 11:44:55 +02:00 |
Claudio Atzori
|
f3f7604e6c
|
trying to fix a test that fails only on Jenkins
|
2022-09-27 15:21:37 +02:00 |
Claudio Atzori
|
3f90d159e3
|
code formatting
|
2022-09-27 15:08:00 +02:00 |
Claudio Atzori
|
0b3e44e521
|
Merge branch 'beta' into relation-from-odf
|
2022-09-27 14:57:01 +02:00 |
Claudio Atzori
|
57dbeb08d2
|
code formatting
|
2022-09-27 14:55:10 +02:00 |
Claudio Atzori
|
b60985cf68
|
Merge branch 'beta' into horizontalConstraints
|
2022-09-27 14:39:31 +02:00 |
Claudio Atzori
|
3b60642ef9
|
Merge pull request 'Synchronize indicators in stats-db with monitor-db' (#249) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#249
|
2022-09-27 14:37:33 +02:00 |
Claudio Atzori
|
25e9d92aad
|
Merge branch 'beta' into clean_country
|
2022-09-27 14:27:49 +02:00 |
Claudio Atzori
|
80c5e0f637
|
code formatting
|
2022-09-27 12:51:51 +02:00 |
Alessia Bardi
|
fd63e9bfac
|
Mapping all relationships supported in ModelConstants and ModelSupport
|
2022-09-26 11:24:13 +02:00 |
Miriam Baglioni
|
ca216a92ad
|
[BulkTagging] changed the query to the IS to insert values for FOS and SDG as subject in the configuration used for the tagging
|
2022-09-23 17:06:07 +02:00 |
Miriam Baglioni
|
3e6b0f58bb
|
[BulkTagging] changed the query to the IS to get also the information for the advancedConstraint from the profile
|
2022-09-23 16:47:19 +02:00 |
Miriam Baglioni
|
4a3e119b73
|
merging with branch beta
|
2022-09-23 16:16:06 +02:00 |
Miriam Baglioni
|
f0e303abf9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-09-23 16:15:32 +02:00 |
Miriam Baglioni
|
55da4d8715
|
[BulkTagging] modifying code to represent constraints horizontally on all the results. Added subject to the set of fields used to express the constraint. Modified resources to test the new approach. Modified test class
|
2022-09-23 16:02:19 +02:00 |
Alessia Bardi
|
c5eb722170
|
relationships from relatedIdentifier whose target id type is one of the pid type with an authority
|
2022-09-23 15:47:05 +02:00 |
Claudio Atzori
|
c86cc53520
|
suppressing hyper verbose spark logs during unit test execution
|
2022-09-23 15:20:40 +02:00 |
Claudio Atzori
|
c01d528ab2
|
suppressing hyper verbose spark logs during unit test execution
|
2022-09-23 15:19:50 +02:00 |
Alessia Bardi
|
ba33ff71fd
|
refactoring for the generation of relationships from related identifier of type 'OPENAIRE'
|
2022-09-23 15:17:13 +02:00 |
Claudio Atzori
|
e6d788d27a
|
[stats wf] adding missing changes lost in PR#248
|
2022-09-23 14:38:42 +02:00 |
Alessia Bardi
|
982bcc1e35
|
test wrid pid and record identifier
|
2022-09-23 12:06:06 +02:00 |
Miriam Baglioni
|
960cb861a0
|
refactoring
|
2022-09-23 11:14:04 +02:00 |
Claudio Atzori
|
930f118673
|
fixed semantic (subreltype) for ServiceOrganization relations
|
2022-09-22 16:24:44 +02:00 |
Claudio Atzori
|
c42850328e
|
fixed semantic (subreltype) for ServiceOrganization relations
|
2022-09-22 16:23:25 +02:00 |
Miriam Baglioni
|
33bb79459e
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-09-22 15:55:17 +02:00 |
Claudio Atzori
|
10ec074f79
|
Merge remote-tracking branch 'antonis.lempesis/beta' into beta2master_sept_2022
|
2022-09-22 14:12:19 +02:00 |
dimitrispie
|
dcd85f8cd7
|
- Synchronize indicators in stats-db with monitor-db
- added new openorg id for Nanyang Technological University
- changed openorg id for University of Helsinki #8088 ticket
|
2022-09-22 13:33:07 +03:00 |
Claudio Atzori
|
7225fe9cbe
|
integrated changes from discard-non-wellformed
|
2022-09-22 10:06:07 +02:00 |
Miriam Baglioni
|
869e129288
|
[EOSC BulkTag] refactoring
|
2022-09-20 16:13:18 +02:00 |
Miriam Baglioni
|
840465958b
|
[EOSC BulkTag] filtering out the datasources registered in the eosc with compatibility different from 3.0, 4.0 for literature, data and CRIS to add the context eosc to the results
|
2022-09-20 10:30:41 +02:00 |
Claudio Atzori
|
bdc8f993d0
|
[Patch Hosted By] check also the presence of datasource.officialname.value
|
2022-09-19 15:28:03 +02:00 |
Miriam Baglioni
|
ec87149cb3
|
[Patch Hosted By] added fix to avoid NPE error when datasource official name is not provided. Removing datasources if no officialname has been provided
|
2022-09-19 14:06:52 +02:00 |
Miriam Baglioni
|
b42e2c9df6
|
[Patch Hosted By] added fix to avoid NPE error when datasource official name is not provided
|
2022-09-19 12:30:32 +02:00 |
Miriam Baglioni
|
1329aa8479
|
[EOSC BulkTag] modified test to remove association of result to eosc when eoscifguidelines are set
|
2022-09-19 11:59:48 +02:00 |
Miriam Baglioni
|
a0ee1a8640
|
[EOSC BulkTag] remove addition of eosc context for result with eosc if guidelines set
|
2022-09-19 11:44:10 +02:00 |
Claudio Atzori
|
e45ec15221
|
Merge branch 'beta' into clean_country
|
2022-09-19 11:34:02 +02:00 |
Claudio Atzori
|
26e1badded
|
added instance.url syntactical validation, avoid creating multiple duplicated URLs
|
2022-09-19 11:19:10 +02:00 |
Miriam Baglioni
|
5240ac3d7b
|
[EOSC Tag] remove addition of eosc context for result with eosc if guidelines set
|
2022-09-19 11:02:18 +02:00 |
Claudio Atzori
|
192215a18e
|
merged from branch discard-non-wellformed
|
2022-09-19 10:17:10 +02:00 |
Claudio Atzori
|
fd87571506
|
code formatting
|
2022-09-16 16:05:03 +02:00 |
Claudio Atzori
|
d72a64ded3
|
Merge commit '690be4482fc84327dc7617acbc8d976d559df512' into beta2master_sept_2022
|
2022-09-16 15:57:44 +02:00 |
Claudio Atzori
|
dbb567251a
|
merged 853c996fa2
|
2022-09-16 15:56:28 +02:00 |
Claudio Atzori
|
0849ebfd80
|
merged a11eb38065
|
2022-09-16 15:54:32 +02:00 |
Claudio Atzori
|
45fc5e12be
|
Merge commit 'cb7c07c54e59675e8dffe42b7f2a13f16c956068' into beta2master_sept_2022
|
2022-09-16 15:48:55 +02:00 |
Claudio Atzori
|
01d5ad6361
|
Merge commit 'd85ba3c1a9d7f0e80565742161ff6c9ecffd52b7' into beta2master_sept_2022
|
2022-09-16 15:48:16 +02:00 |
Claudio Atzori
|
ab0efecab4
|
Merge commit '84598c75356cf580de6c81653a9351e9b8173639' into beta2master_sept_2022
|
2022-09-16 15:47:05 +02:00 |
Claudio Atzori
|
0ec2eaba35
|
Merge commit 'c1f2ffc53dc41f1fac3855b2d2df7d6a5ea15e3e' into beta2master_sept_2022
|
2022-09-16 15:45:27 +02:00 |
Claudio Atzori
|
2abe2bc137
|
Merge commit '08ce2cadc2d84aa982726e429c280a905536a715' into beta2master_sept_2022
|
2022-09-16 15:43:49 +02:00 |
Claudio Atzori
|
cbd48bc645
|
Merge commit 'efd96e7e664e4139321e35e8d172b884ba4b61a1' into beta2master_sept_2022
|
2022-09-16 15:38:56 +02:00 |
Claudio Atzori
|
e370e940d8
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-16 14:06:28 +02:00 |
Claudio Atzori
|
465e941214
|
Merge pull request '[stats wf] Changes to indicators tables' (#244) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#244
|
2022-09-16 10:13:58 +02:00 |
Claudio Atzori
|
1e42d984e1
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-15 10:49:42 +02:00 |
Alessia Bardi
|
9e7ec4198f
|
fixed test
|
2022-09-14 18:08:56 +02:00 |
Claudio Atzori
|
c48f6e9c57
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-14 17:11:26 +02:00 |
dimitrispie
|
3bf3127251
|
Changes to monitor and indicator scripts
|
2022-09-14 16:36:19 +03:00 |
Claudio Atzori
|
a0919ed495
|
[aggregator graph] save invalid records aside for further inspection
|
2022-09-14 13:27:39 +02:00 |
Alessia Bardi
|
b99a011345
|
return empty Oaf list if record cannot be parsed
|
2022-09-13 11:51:55 +02:00 |
Alessia Bardi
|
27af5122d2
|
logs for non well formed XML files
|
2022-09-12 14:25:23 +02:00 |
Claudio Atzori
|
ff6f789b6d
|
code formatting
|
2022-09-09 15:16:31 +02:00 |
Claudio Atzori
|
b5d6966c01
|
Merge branch 'beta' into clean_country
|
2022-09-09 12:20:19 +02:00 |
Claudio Atzori
|
b5f7bd30be
|
Merge branch 'beta' into clean_subjects
|
2022-09-09 12:20:04 +02:00 |
Alessia Bardi
|
f14107ad77
|
Merge branch 'handle_as_instance_urls' of https://code-repo.d4science.org/D-Net/dnet-hadoop into handle_as_instance_urls
|
2022-09-09 12:17:19 +02:00 |
Alessia Bardi
|
a539c6ccaf
|
https for handle URLs
|
2022-09-09 12:16:28 +02:00 |
dimitrispie
|
71b069ca90
|
Changes to indicator and monitor scripts
|
2022-09-09 13:15:58 +03:00 |
Claudio Atzori
|
1203378441
|
Merge branch 'beta' into clean_subjects
|
2022-09-09 10:38:47 +02:00 |
Claudio Atzori
|
14dc909a14
|
Merge branch 'beta' into clean_country
|
2022-09-09 10:38:17 +02:00 |
Claudio Atzori
|
853c996fa2
|
Merge branch 'beta' into handle_as_instance_urls
|
2022-09-09 09:47:16 +02:00 |
Claudio Atzori
|
a431e01383
|
Merge pull request 'orcid_multipleworks_download' (#242) from enrico.ottonello/dnet-hadoop:orcid_multipleworks_download into beta
Reviewed-on: D-Net/dnet-hadoop#242
|
2022-09-09 08:45:02 +02:00 |
Alessia Bardi
|
9ef063d502
|
#7861#note-8 instance url from handle
|
2022-09-07 17:29:54 +03:00 |
Alessia Bardi
|
5c45d52af3
|
testing for RiuNet
|
2022-09-07 15:40:57 +03:00 |
dimitrispie
|
2b5f8c9c9a
|
comment out duplicate table creation
|
2022-09-06 12:27:53 +03:00 |
Alessia Bardi
|
a11eb38065
|
testing for RO-Hub
|
2022-09-02 16:07:36 +02:00 |
Enrico Ottonello
|
bfdf2dc390
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid_multipleworks_download
|
2022-08-25 12:07:54 +02:00 |
Enrico Ottonello
|
da1cf561e6
|
alignment with beta
|
2022-08-25 11:57:20 +02:00 |
Enrico Ottonello
|
27445ccdaa
|
cleaned log
|
2022-08-25 11:56:14 +02:00 |
Claudio Atzori
|
b7c387c21f
|
cleaning of subjects: avoid duplicated subjects, prioritise collected vs inferred or other sources
|
2022-08-12 15:09:16 +02:00 |
Claudio Atzori
|
adb526b0e1
|
Merge branch 'beta' into clean_subjects
|
2022-08-12 10:51:17 +02:00 |
Claudio Atzori
|
cb7c07c54e
|
[scholix] added step to create tar archive
|
2022-08-11 11:25:24 +02:00 |
Claudio Atzori
|
2aa16d0432
|
[scholix] fixed OpenCitation dump procedure
|
2022-08-10 17:39:29 +02:00 |
Miriam Baglioni
|
7dbdd4a0fe
|
[Clean Country]changes related to D-Net/dnet-hadoop#241 (comment)
|
2022-08-10 15:13:10 +02:00 |
Claudio Atzori
|
51ad93e545
|
[scholix] fixed OpenCitation dump procedure
|
2022-08-10 11:57:56 +02:00 |
Miriam Baglioni
|
62d2138806
|
[Clean Context] changed the logic slightly. Added a check that the result is not hosted by a datasource of type institutional repository from NL. Also added a check that the country was included in the result via propagation before it is removed
|
2022-08-08 14:10:47 +02:00 |
Claudio Atzori
|
3418ce50ac
|
cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary
|
2022-08-08 12:48:47 +02:00 |
Claudio Atzori
|
a78028dabc
|
Merge branch 'beta' into clean_subjects
|
2022-08-08 12:34:33 +02:00 |
Miriam Baglioni
|
390013a4b2
|
merging with branch beta
|
2022-08-08 12:30:31 +02:00 |
Claudio Atzori
|
3937ff04de
|
Merge branch 'beta' into tagEosc
|
2022-08-08 09:57:23 +02:00 |
Claudio Atzori
|
a4815f6bec
|
Merge branch 'beta' into clean_subjects
|
2022-08-05 16:57:03 +02:00 |
Claudio Atzori
|
29c4cde42e
|
Merge branch 'clean_subjects' of https://code-repo.d4science.org/D-Net/dnet-hadoop into clean_subjects
|
2022-08-05 16:56:37 +02:00 |
Claudio Atzori
|
4eaa063b1f
|
cleaning of subjects
|
2022-08-05 16:56:09 +02:00 |
Claudio Atzori
|
84598c7535
|
Merge pull request 'restored some collab indicators' (#240) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#240
|
2022-08-05 15:50:39 +02:00 |
Antonis Lempesis
|
fcef5294e2
|
restored some collab indicators
|
2022-08-05 13:45:01 +03:00 |
Claudio Atzori
|
844f6eb465
|
Merge branch 'beta' into clean_subjects
|
2022-08-05 12:39:05 +02:00 |
Claudio Atzori
|
32cee1f619
|
WIP: cleaning of subjects
|
2022-08-05 12:32:08 +02:00 |
Claudio Atzori
|
c1f2ffc53d
|
Merge pull request 'commenting out the collab indicators because they still fail' (#237) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#237
|
2022-08-05 11:57:36 +02:00 |
Antonis Lempesis
|
227e10f4b3
|
commenting out the collab indicators because they still fail
|
2022-08-05 12:54:36 +03:00 |
Claudio Atzori
|
6c0fd9284b
|
merge from beta
|
2022-08-05 10:42:53 +02:00 |
Claudio Atzori
|
b78889a0ce
|
WIP: cleaning of subjects
|
2022-08-05 09:11:37 +02:00 |
Miriam Baglioni
|
a7a18d7630
|
[Graph Dump] removed code for the dump from the project. Fixed issues in tests when possible
|
2022-08-04 17:40:40 +02:00 |
Claudio Atzori
|
499826ead1
|
serialising field eoscifguidelines field in the Solr XML records
|
2022-08-04 12:40:48 +02:00 |
Claudio Atzori
|
27a91841e7
|
WIP: cleaning of subjects
|
2022-08-04 11:39:39 +02:00 |
Antonis Lempesis
|
b09d7ddc74
|
fixed the datasourceOrganization relations
|
2022-08-03 12:26:50 +02:00 |
Claudio Atzori
|
e62018e95d
|
[aggregator graph] added more assertions in test
|
2022-08-03 12:26:05 +02:00 |
Claudio Atzori
|
efd96e7e66
|
Merge pull request 'fixed the datasourceOrganization relations' (#233) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#233
|
2022-08-03 12:25:05 +02:00 |
Antonis Lempesis
|
8b0407d8ec
|
fixed the datasourceOrganization relations
|
2022-08-03 12:26:59 +03:00 |
Claudio Atzori
|
eb53b52f7c
|
code formatting
|
2022-08-02 13:24:47 +02:00 |
Claudio Atzori
|
27681cf6bf
|
Merge pull request '[stats wf] latest version of indicators + added FOS classification' (#232) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#232
|
2022-08-02 12:57:15 +02:00 |
Antonis Lempesis
|
1778d40c40
|
latest version of indicators
|
2022-08-02 13:39:34 +03:00 |
Claudio Atzori
|
209c7e9dab
|
[datacite] avoid UnsupportedOperationException
|
2022-08-01 09:05:35 +02:00 |
Enrico Ottonello
|
64311b8be4
|
removed unuseful accumulator
|
2022-07-31 01:03:29 +02:00 |
Antonis Lempesis
|
6fc9ef53f6
|
added command line params to allow hive actions to run
|
2022-07-29 16:36:20 +03:00 |
Antonis Lempesis
|
9886fe87ec
|
- Added FOS classification
- Added extra orgs in monitor
- Fixed result-project and organization-project tables
|
2022-07-29 16:34:50 +03:00 |
Claudio Atzori
|
92e48f12f7
|
[metadata collection] updated collector plugin name
|
2022-07-29 13:54:00 +02:00 |
Claudio Atzori
|
f62c4e05cd
|
code formatting
|
2022-07-29 11:56:01 +02:00 |
Claudio Atzori
|
0727f0ef48
|
[EOSC tag] avoid NPEs
|
2022-07-29 11:55:34 +02:00 |
Miriam Baglioni
|
3329b6ce6b
|
[EOSC TAG] added fix for NPE on subjects
|
2022-07-29 10:54:20 +02:00 |
Claudio Atzori
|
1dd1e4fe3a
|
extended test for mapping project_organization relations
|
2022-07-28 11:27:08 +02:00 |
Claudio Atzori
|
60e4fbd78b
|
Merge branch 'beta' into project_organization_contribution
|
2022-07-28 10:15:43 +02:00 |
Claudio Atzori
|
ed98a6d9d0
|
[Datacite mapping] include the older datacite prefixed OpenAIRE id among the originalId[]
|
2022-07-28 10:15:14 +02:00 |
Claudio Atzori
|
09ccc7b472
|
Merge branch 'beta' into project_organization_contribution
|
2022-07-28 09:49:59 +02:00 |
Sandro La Bruzzo
|
67525076ec
|
fixed test, now it compiles after commit a6977197b3
|
2022-07-26 15:35:17 +02:00 |
Claudio Atzori
|
26104826c4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-07-26 14:34:29 +02:00 |
Claudio Atzori
|
d43663d30f
|
adapted RorActionSet test, it should not create parent/child rels
|
2022-07-25 17:54:10 +02:00 |
Miriam Baglioni
|
35bcd9422d
|
[EOSC Context Tagging] removed not needed specification in path
|
2022-07-25 15:45:22 +02:00 |
Miriam Baglioni
|
1c82acb168
|
[EOSC Context Tagging] refactoring: moved EOSC IF tagging in package eosc under bulkTag
|
2022-07-25 14:26:39 +02:00 |
Miriam Baglioni
|
68cb637832
|
merge with branch beta
|
2022-07-25 14:24:25 +02:00 |
Miriam Baglioni
|
0172bab251
|
[EOSC Context Tagging] refactoring
|
2022-07-25 14:16:45 +02:00 |
Claudio Atzori
|
612b7a5530
|
Merge branch 'beta' into tagEosc
|
2022-07-25 14:12:59 +02:00 |
Claudio Atzori
|
c3ede1b379
|
Merge branch 'beta' into pubmed_update
|
2022-07-25 14:10:22 +02:00 |
Miriam Baglioni
|
144c103b67
|
[EOSC Context Tagging] add check to avoid the insertion of the context if already present
|
2022-07-25 13:52:45 +02:00 |
Enrico Ottonello
|
657b0208a2
|
multiple works download (<=100) for single request
|
2022-07-25 12:37:39 +02:00 |
Miriam Baglioni
|
d091866e48
|
[EOSC Context Tagging] refactoring
|
2022-07-25 11:12:22 +02:00 |
Miriam Baglioni
|
5968ec018d
|
[Clean Country] modified workflow and added param file
|
2022-07-22 16:48:38 +02:00 |
Miriam Baglioni
|
a12d28c644
|
[Clean Country] added logic not to remove country from result if a hosting datasource with that country exists. Moreover, the country will be removed only if it was added via propagation
|
2022-07-22 16:23:12 +02:00 |
Miriam Baglioni
|
2c933f1158
|
merging with branch beta
|
2022-07-22 14:57:41 +02:00 |
Miriam Baglioni
|
06a95daf60
|
[EOSC context TAG] refactoring after compilation
|
2022-07-22 14:57:06 +02:00 |
Miriam Baglioni
|
ffb0ce3fb9
|
merging with branch beta
|
2022-07-22 14:55:55 +02:00 |
Miriam Baglioni
|
627332526b
|
[EOSC context TAG] workflow start from reset_outputpath action
|
2022-07-22 14:55:11 +02:00 |
Miriam Baglioni
|
7a1c1b6f53
|
[EOSC context TAG] Add test class and resources
|
2022-07-22 14:36:02 +02:00 |
Sandro La Bruzzo
|
ddc414b258
|
fixed wrong json param
|
2022-07-22 09:43:15 +02:00 |
Miriam Baglioni
|
317a4a56ef
|
[EOSC context TAG] first implementation of the logic to tag results imported from datasources registered in the EOSC
|
2022-07-21 17:37:48 +02:00 |
Miriam Baglioni
|
3be036f290
|
[EOSC TAG] refactoring after compilation
|
2022-07-21 14:45:43 +02:00 |
Miriam Baglioni
|
e61b8e6b03
|
merging with branch beta
|
2022-07-21 14:43:23 +02:00 |
Miriam Baglioni
|
56d09e6348
|
[EOSC TAG] before adding the tag added a step to verify the same tag is not already present
|
2022-07-21 14:36:48 +02:00 |
Miriam Baglioni
|
5143a80232
|
[EOSC TAG] modification of test class to align with new element
|
2022-07-21 11:56:51 +02:00 |
Sandro La Bruzzo
|
5f651f2316
|
changed filter relation on SubRelType
|
2022-07-21 10:11:48 +02:00 |
Miriam Baglioni
|
438abdf96f
|
[EOSC TAG] adding eosc interoperability guidelines in the specific element in the result. Removed from subjects. Removed also the deletion of EOSC Jupyter Notebook from subject since now the criteria are searched for in a different place
|
2022-07-20 18:07:54 +02:00 |
Miriam Baglioni
|
65cc736e2f
|
[Clean Country] first implementation to remove country NL from results collected from NARCIS when doi starts with Mendeley prefix
|
2022-07-20 17:05:56 +02:00 |
Sandro La Bruzzo
|
5b76321d9c
|
implemented oozie workflow to generate scholix dump filtering relclass semantic
|
2022-07-20 16:34:32 +02:00 |
Claudio Atzori
|
1138b2ac8e
|
code formatting
|
2022-07-19 14:15:49 +02:00 |
Sandro La Bruzzo
|
00168303db
|
Added unit test to verify that the OriginalID includes the old openaire Identifier generated by OAI
|
2022-07-14 10:19:59 +02:00 |
Sandro La Bruzzo
|
0a4f4d98fa
|
added PMCId to PmArticle
|
2022-07-13 15:27:17 +02:00 |
Claudio Atzori
|
0c1cfee396
|
mapping oaf:fulltext elements in the result.fulltext field
|
2022-07-11 17:34:59 +02:00 |
Miriam Baglioni
|
fae681fea1
|
[Country Propagation] add check to avoid NPE on datasource.getDatasourceType().getClassis()
|
2022-07-03 17:39:58 +02:00 |
Miriam Baglioni
|
c09fcdb40b
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-07-01 12:38:03 +02:00 |
Claudio Atzori
|
0cb1c70788
|
code formatting
|
2022-07-01 10:44:08 +02:00 |
Claudio Atzori
|
4ec13e2b66
|
Merge branch 'master' into dump_new_funded_products
|
2022-07-01 10:30:28 +02:00 |
Claudio Atzori
|
072f192853
|
include the class information in the measure XML serialization
|
2022-07-01 09:54:56 +02:00 |
Claudio Atzori
|
a88103bcf9
|
[action manager] added more testing
|
2022-07-01 09:06:59 +02:00 |
Claudio Atzori
|
7da24c1dec
|
added more logging
|
2022-06-28 13:47:49 +02:00 |
Miriam Baglioni
|
ee1f1eeca2
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-06-28 11:06:32 +02:00 |
Miriam Baglioni
|
71744a1f52
|
[DUMP DELTA PROJECTS] refactoring
|
2022-06-27 18:07:58 +02:00 |
Miriam Baglioni
|
1d1fe3b151
|
[DUMP DELTA PROJECTS] refactoring
|
2022-06-27 18:04:59 +02:00 |
Claudio Atzori
|
a8773af0cb
|
Merge branch 'beta' into project_organization_contribution
|
2022-06-27 09:37:40 +02:00 |
Claudio Atzori
|
4829b96bb5
|
Merge branch 'beta' into author_name_particles
|
2022-06-27 09:37:03 +02:00 |
Claudio Atzori
|
5130eac247
|
mapping by participant project contribution
|
2022-06-24 17:16:42 +02:00 |
Claudio Atzori
|
929b145130
|
code formatting
|
2022-06-21 23:07:06 +02:00 |
Miriam Baglioni
|
edddfc6c63
|
[DUMP DELTA PROJECTS] adding test and resource
|
2022-06-21 18:28:53 +02:00 |
Miriam Baglioni
|
f561f13dd9
|
[Funder Products Dump] fixed names of parameters in workflow
|
2022-06-21 18:18:17 +02:00 |
Miriam Baglioni
|
ff74e73369
|
[DUMP NEW FUNDED PRODUCTS] change in resources
|
2022-06-21 18:02:51 +02:00 |
Miriam Baglioni
|
b98f904d48
|
[Funder Products Dump] new way to avoid using hive
|
2022-06-21 17:52:27 +02:00 |
Miriam Baglioni
|
7423577a08
|
[Graph DUMP] add code to produce the delta of new projects with respect to the previous delta/dump
|
2022-06-21 14:51:38 +02:00 |
Claudio Atzori
|
b295a40d9c
|
restored use of name_particles when parsing author names
|
2022-06-16 12:20:43 +02:00 |
Claudio Atzori
|
c7b09c6225
|
Merge branch 'beta' into 7096-fileGZip-collector-plugin
|
2022-06-16 09:28:50 +02:00 |
Claudio Atzori
|
e03c0c7794
|
Merge branch 'beta' into oaf_relation_mapping
|
2022-06-16 09:27:01 +02:00 |
Claudio Atzori
|
06b5533d4c
|
Merge branch 'beta' into 7096-fileGZip-collector-plugin
|
2022-06-16 09:22:16 +02:00 |
Claudio Atzori
|
4c8e820ff0
|
mapping relationship from transformed records based on oaf:relation
|
2022-06-14 08:49:02 +02:00 |
Alessia Bardi
|
88d531dc91
|
exclude FAIRsharing records from Datacite
|
2022-06-13 16:17:17 +02:00 |
Claudio Atzori
|
116902c028
|
mapping relationship from transformed records based on oaf:relation
|
2022-06-13 14:31:48 +02:00 |
Claudio Atzori
|
b8cda65487
|
code formatting
|
2022-06-13 09:20:03 +02:00 |
Michele Artini
|
634869ce95
|
deleted hierarchical rels from ror action set
|
2022-06-13 09:12:21 +02:00 |
Alessia Bardi
|
922c6d66ef
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-06-10 17:29:15 +02:00 |
Alessia Bardi
|
68bd58d6a4
|
tests for ROHub
|
2022-06-10 17:29:11 +02:00 |
Miriam Baglioni
|
b229c6e7af
|
Merge pull request 'beta' (#218) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#218
|
2022-06-10 11:03:48 +02:00 |
Antonis Lempesis
|
ab18c9daa9
|
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
|
2022-06-09 15:48:21 +03:00 |
Antonis Lempesis
|
574492c659
|
removed double result_apc table creation from monitor
|
2022-06-09 15:48:13 +03:00 |
Michele Artini
|
b94a791bc5
|
unit tests to transform cnr explora
|
2022-06-09 12:25:34 +02:00 |
Miriam Baglioni
|
4b6913787b
|
[DOI-BOOST] added one method in test of crossref mapping to oaf and one resource. Related to ticket 7807
|
2022-06-08 14:55:19 +02:00 |
Antonis Lempesis
|
db088cc69c
|
fixed *_organization tables
|
2022-06-07 04:04:28 +03:00 |
Miriam Baglioni
|
31d4557e8d
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2022-06-06 11:52:29 +02:00 |
Claudio Atzori
|
5c2949a864
|
Merge pull request '[stats wf] added open citations & more orgs in monitor, removed collab indicator' (#213) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#213
|
2022-05-20 11:38:43 +02:00 |
Miriam Baglioni
|
5e0b8f9b5f
|
[CountryPropagation] refactoring
|
2022-05-20 09:15:53 +02:00 |
Miriam Baglioni
|
c298c148cb
|
[CountryPropagation] fix NPE issue
|
2022-05-20 09:11:46 +02:00 |
Miriam Baglioni
|
eaf9385ae5
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-05-17 15:09:37 +02:00 |
Miriam Baglioni
|
f5207885e3
|
[EOSCTag] changed code to remove EOSC Jupyter Notebook and modified test to exclude galaxy + software from the tagging for Galaxy
|
2022-05-17 15:09:22 +02:00 |
Claudio Atzori
|
d098ad0d93
|
[hb patch] updated map
|
2022-05-16 15:54:04 +02:00 |
Claudio Atzori
|
1dda11e031
|
[hb patch] updated map
|
2022-05-16 15:53:27 +02:00 |
Claudio Atzori
|
8dd5517548
|
code formatting
|
2022-05-16 14:35:24 +02:00 |
Claudio Atzori
|
52cb086506
|
[graph grouping] drop relation target path before copying from source
|
2022-05-16 12:08:36 +02:00 |
Claudio Atzori
|
6442763f97
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-05-16 12:07:45 +02:00 |
Claudio Atzori
|
997c50078e
|
[graph grouping] drop relation target path before copying from source
|
2022-05-16 12:07:40 +02:00 |
Sandro La Bruzzo
|
c1971d52c4
|
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
|
2022-05-16 10:30:35 +02:00 |
Sandro La Bruzzo
|
4c50f35c8b
|
update publication Date format
|
2022-05-16 10:29:36 +02:00 |
Michele Artini
|
46c07e0724
|
deleted hierarchical rels from ror action set
|
2022-05-16 09:39:54 +02:00 |
Claudio Atzori
|
6031acb2e3
|
[openorgs] fixed parent/child query, using the correct semantic labels
|
2022-05-16 09:20:48 +02:00 |
Claudio Atzori
|
0dc33ea391
|
[openorgs] fixed parent/child query, using the correct semantic labels
|
2022-05-16 09:20:30 +02:00 |
Antonis Lempesis
|
3fc9efeab6
|
fixed typo, added open citations and apcs in monitor
|
2022-05-13 14:28:13 +03:00 |
Miriam Baglioni
|
e4eac1d20b
|
[EOSC TAG] added code to remove EOSC Jupyter Notebook from subjects and put EOSC as classid in the qualifier
|
2022-05-13 11:01:33 +02:00 |
Sandro La Bruzzo
|
22f65680b9
|
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
|
2022-05-11 15:30:12 +02:00 |
Sandro La Bruzzo
|
ca8d26bcb4
|
added better filter for openCitations
|
2022-05-11 15:29:57 +02:00 |
Claudio Atzori
|
5d3b4a9c25
|
[graph merge beta] merge datasource originalid, collectedfrom, and pid lists
|
2022-05-11 14:13:06 +02:00 |
Antonis Lempesis
|
23334479bb
|
removed yet another collab, added more orgs in monitor
|
2022-05-11 13:05:52 +03:00 |
Claudio Atzori
|
2a8e0fb72f
|
[openorgs] mapping parent/child relations without massaging the semantic labels
|
2022-05-10 08:45:53 +02:00 |
Claudio Atzori
|
77bc9863e9
|
[openorgs] mapping parent/child relations without massaging the semantic labels
|
2022-05-09 16:06:04 +02:00 |
Claudio Atzori
|
378020e30a
|
[eosc_services] unit test adaptation
|
2022-05-09 16:05:06 +02:00 |
Miriam Baglioni
|
89657a0b78
|
[UsageCount] refactoring
|
2022-05-09 14:43:27 +02:00 |
Miriam Baglioni
|
a056f59c6e
|
[UsageCount] make it as an action set as it should be, plus changed the test to make them work as well now
|
2022-05-09 12:51:35 +02:00 |
Antonis Lempesis
|
61b4c19e65
|
restored indi_result_org_country_collab, removed indi_result_org_collab
|
2022-05-06 12:52:10 +03:00 |
Antonis Lempesis
|
cfbbcaf7c4
|
commented out indi_result_org_country_collab
|
2022-05-06 12:49:36 +03:00 |
Claudio Atzori
|
658450d9a3
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-05-05 11:38:08 +02:00 |
Claudio Atzori
|
846975c886
|
[eosc_services] using the correct 'keyword' subject type, as declared in the dnet:subject_classification_typologies vocabulary
|
2022-05-05 11:37:58 +02:00 |
Miriam Baglioni
|
8a72de4011
|
[EOSCTag] modified workflow to execute all the steps and not only the last one
|
2022-05-04 10:10:56 +02:00 |
Miriam Baglioni
|
bd1108f98b
|
merging with branch beta
|
2022-05-04 10:06:56 +02:00 |
Miriam Baglioni
|
3aeedd931a
|
[EOSCTag] fixed issue in case description is null. Modified test resources and classes
|
2022-05-04 10:06:38 +02:00 |
Claudio Atzori
|
da611cfbbd
|
[eosc_services] resolved merge conflicts
|
2022-05-03 13:37:15 +02:00 |
Claudio Atzori
|
9e12cb3c92
|
EOSC Services - removed field knowledgegraph; depending on the released schema module
|
2022-05-03 11:55:45 +02:00 |
Miriam Baglioni
|
a21fe310e5
|
[EOSCTag] last test and change in the implementation to search in title and description
|
2022-05-02 17:43:20 +02:00 |
Claudio Atzori
|
2ade69dea6
|
EOSC Services - minor
|
2022-05-02 17:03:31 +02:00 |
Claudio Atzori
|
b6a7ff3a99
|
EOSC Services - removed fields from mapping, testing preparation
|
2022-05-02 15:52:33 +02:00 |
Miriam Baglioni
|
e37177e1ce
|
merging with branch beta
|
2022-05-02 12:31:50 +02:00 |
Claudio Atzori
|
a8c51f6f16
|
EOSC Services - fixed query and testing preparation
|
2022-05-02 11:09:03 +02:00 |
Claudio Atzori
|
05c1ea92e9
|
EOSC Services - added Service-specific fields in the XML record serialization
|
2022-04-29 15:56:55 +02:00 |
Claudio Atzori
|
f5f532d134
|
EOSC Services - ongoing update
|
2022-04-29 12:25:24 +02:00 |
Antonis Lempesis
|
0353f93d54
|
added new hive opts
|
2022-04-29 12:49:27 +03:00 |
Serafeim Chatzopoulos
|
623f7be26d
|
Fix reading files from HDFS in FileCollector & FileGZipCollector plugins
|
2022-04-28 16:31:11 +03:00 |
Claudio Atzori
|
5ffc24d1ba
|
EOSC Services - ongoing update
|
2022-04-26 16:18:41 +02:00 |
Sandro La Bruzzo
|
78015a5733
|
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
|
2022-04-26 09:56:34 +02:00 |
Sandro La Bruzzo
|
8c22e5c30a
|
added fix to include date array with only year or year and month
|
2022-04-26 09:56:27 +02:00 |
Claudio Atzori
|
81c4496d32
|
Merge branch 'beta' into 7096-fileGZip-collector-plugin
|
2022-04-26 09:02:15 +02:00 |
Miriam Baglioni
|
e342ec93f0
|
[EOSCTag] prepared resources for test
|
2022-04-22 18:35:37 +02:00 |
Miriam Baglioni
|
88562c0930
|
[EOSC TAG] added test for galaxy for title and description criteria
|
2022-04-22 18:35:03 +02:00 |
Miriam Baglioni
|
dfbd2bcbea
|
[EOSC TAG] added logic in case subject is null
|
2022-04-22 18:34:03 +02:00 |
Miriam Baglioni
|
27c85e901a
|
[EOSCTag] added resources and finalized test for Jupyter Notebook tagging
|
2022-04-22 17:38:10 +02:00 |
Miriam Baglioni
|
87bff36d9e
|
merging with branch beta
|
2022-04-22 15:52:34 +02:00 |
Miriam Baglioni
|
911ce0780a
|
Merge branch 'cleancontext' of https://code-repo.d4science.org/D-Net/dnet-hadoop into cleancontext
|
2022-04-22 15:41:42 +02:00 |
Miriam Baglioni
|
19d90658fc
|
[Clean Context] added description to parameters
|
2022-04-22 15:41:23 +02:00 |
Claudio Atzori
|
54162f5c4f
|
Merge branch 'beta' into cleancontext
|
2022-04-22 11:49:33 +02:00 |
Miriam Baglioni
|
bbb77052d3
|
[EOSCTag] first test
|
2022-04-22 11:32:57 +02:00 |
Claudio Atzori
|
30105f0722
|
Merge branch 'beta' into 7096-fileGZip-collector-plugin
|
2022-04-22 11:22:21 +02:00 |
Sandro La Bruzzo
|
a82ec3aaaf
|
code formatter
|
2022-04-22 11:08:13 +02:00 |
Sandro La Bruzzo
|
aa12429f50
|
Modified last intersection since we lost many titles.
|
2022-04-22 11:05:08 +02:00 |
Miriam Baglioni
|
7cb7066472
|
[EoscTag] first "rough" implementation
|
2022-04-22 10:44:17 +02:00 |
Sandro La Bruzzo
|
d660895b30
|
fixed wrong mapping type of dataset
|
2022-04-21 20:41:13 +02:00 |
Miriam Baglioni
|
e0915061c2
|
[Clean Context] fixed issue in param name
|
2022-04-21 16:32:40 +02:00 |
Miriam Baglioni
|
6dc68c48e0
|
[EOSCTag] -
|
2022-04-21 16:19:04 +02:00 |
Miriam Baglioni
|
9a961a0092
|
[Clean Context] fixed issue in param name
|
2022-04-21 15:12:24 +02:00 |
Claudio Atzori
|
29150a5d0c
|
code formatting
|
2022-04-21 13:31:56 +02:00 |
Miriam Baglioni
|
5b7d9e741c
|
[Clean Context] added logic to cleaning workflow to accommodate also context cleaning
|
2022-04-21 13:02:14 +02:00 |
Miriam Baglioni
|
ccba1a3db1
|
[Clean Context] added logic to cleaning workflow to accommodate also context cleaning
|
2022-04-21 13:00:06 +02:00 |
Miriam Baglioni
|
20de75ca64
|
[Measures] removed typo
|
2022-04-21 12:14:03 +02:00 |
Miriam Baglioni
|
bebb2a0560
|
Merge branch 'eosc_dimitris' of https://code-repo.d4science.org/D-Net/dnet-hadoop into eosc_dimitris
|
2022-04-21 12:10:19 +02:00 |
Miriam Baglioni
|
b61efd613b
|
[Measures] addressed comments in the PR
|
2022-04-21 12:09:37 +02:00 |
Miriam Baglioni
|
d012d125d7
|
[EOSCTag] -
|
2022-04-21 12:02:09 +02:00 |
Claudio Atzori
|
88acad76f9
|
Merge branch 'beta' into eosc_dimitris
|
2022-04-21 12:00:03 +02:00 |
Claudio Atzori
|
eabb40fccc
|
Merge branch 'beta' into 7096-fileGZip-collector-plugin
|
2022-04-21 11:42:43 +02:00 |
Miriam Baglioni
|
c304657d91
|
[Measures] put the logic in common, no need to change the schema
|
2022-04-21 11:27:26 +02:00 |
Sandro La Bruzzo
|
d580e15442
|
Modified last intersection since we lost many titles.
this is my last resource, after that, I've to change my job
|
2022-04-21 11:06:08 +02:00 |
Miriam Baglioni
|
5295effc96
|
[Measures] fixed issue
|
2022-04-20 16:20:40 +02:00 |
Miriam Baglioni
|
a38f0f5ea7
|
merging with branch beta
|
2022-04-20 15:44:18 +02:00 |
Miriam Baglioni
|
dbfbe8841a
|
[Clean Context] changed the description in input parameters
|
2022-04-20 15:41:03 +02:00 |
Miriam Baglioni
|
5feae77937
|
[Measures] last changes to accommodate tests
|
2022-04-20 15:13:09 +02:00 |
Miriam Baglioni
|
869407c6e2
|
[Measures] added new measure (usagecounts) as action set. Measure added at the level of the result. Ref #7587
|
2022-04-20 14:02:05 +02:00 |
Antonis Lempesis
|
b7cd2c6ca1
|
added open citations
|
2022-04-20 14:46:55 +03:00 |
Michele Artini
|
c96a8613f8
|
update SQL queries
|
2022-04-20 12:07:49 +02:00 |
Michele Artini
|
4314db55c8
|
migration to services: update sql queries
|
2022-04-19 15:05:02 +02:00 |
Miriam Baglioni
|
0012e57bf9
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2022-04-14 14:14:44 +02:00 |
Miriam Baglioni
|
c5a863132c
|
[BulkTagging] revert it
|
2022-04-14 14:14:13 +02:00 |
Sandro La Bruzzo
|
d5b29d96a7
|
fix merging in crossrefAggregator which creates dataInfo null
|
2022-04-14 11:07:04 +02:00 |
Miriam Baglioni
|
8e8933d41a
|
[BulkTagging] added fix if result.dataInfo is null
|
2022-04-14 09:04:24 +02:00 |
Claudio Atzori
|
b93a141d6c
|
[Doiboost] fixed fundingReference extraction from the Crossref records
|
2022-04-12 10:26:05 +02:00 |
Claudio Atzori
|
73c172926a
|
[Doiboost] fixed fundingReference extraction from the Crossref records
|
2022-04-12 10:25:42 +02:00 |
Claudio Atzori
|
48b580b45c
|
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
|
2022-04-11 08:52:36 +02:00 |
Claudio Atzori
|
21f32b83c6
|
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
|
2022-04-11 08:52:12 +02:00 |
Claudio Atzori
|
4eff7856f5
|
Merge pull request '[stats-wf] computing stats in each step' (#210) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#210
|
2022-04-08 14:21:01 +02:00 |
Serafeim Chatzopoulos
|
d0b84d3297
|
Add FileCollectorPlugin and respective test
|
2022-04-07 15:06:38 +03:00 |
Serafeim Chatzopoulos
|
bc1bf55507
|
Add AbstractSplittedRecordPlugin
|
2022-04-07 14:33:04 +03:00 |
Claudio Atzori
|
c26222623f
|
[maven-release-plugin] prepare for next development iteration
|
2022-04-07 13:32:22 +02:00 |
Claudio Atzori
|
86585a6b27
|
[maven-release-plugin] prepare release dhp-1.2.4
|
2022-04-07 13:32:19 +02:00 |
Claudio Atzori
|
ad85d88eaf
|
[maven-release-plugin] rollback the release of dhp-1.2.4
|
2022-04-07 13:28:35 +02:00 |
Claudio Atzori
|
598e11dfd7
|
[maven-release-plugin] prepare for next development iteration
|
2022-04-07 13:27:02 +02:00 |
Claudio Atzori
|
db3d9877a5
|
[maven-release-plugin] prepare release dhp-1.2.4
|
2022-04-07 13:26:58 +02:00 |
Claudio Atzori
|
3bba6d6e38
|
[maven-release-plugin] rollback the release of dhp-1.2.4
|
2022-04-07 12:23:17 +02:00 |
Claudio Atzori
|
2ac2d928bd
|
[maven-release-plugin] prepare for next development iteration
|
2022-04-07 12:18:47 +02:00 |
Claudio Atzori
|
85bc722ff4
|
[maven-release-plugin] prepare release dhp-1.2.4
|
2022-04-07 12:18:43 +02:00 |
Claudio Atzori
|
bc05b6168a
|
[maven-release-plugin] rollback the release of dhp-1.2.4
|
2022-04-07 11:49:06 +02:00 |
Claudio Atzori
|
505420fd61
|
[maven-release-plugin] prepare for next development iteration
|
2022-04-07 11:34:06 +02:00 |
Claudio Atzori
|
66e718981e
|
[maven-release-plugin] prepare release dhp-1.2.4
|
2022-04-07 11:34:02 +02:00 |
Serafeim Chatzopoulos
|
e612489670
|
Add fileGZip collector plugin and respective test
|
2022-04-06 19:12:44 +03:00 |
Claudio Atzori
|
4190c9f6bc
|
[graph raw] avoid NPEs importing datasource consent fields
|
2022-04-06 15:34:31 +02:00 |
Claudio Atzori
|
05fafa1408
|
[graph raw] avoid NPEs importing datasource consent fields
|
2022-04-06 15:23:50 +02:00 |
Antonis Lempesis
|
c442c91f89
|
computing stats in each step
|
2022-04-06 12:40:02 +03:00 |
Claudio Atzori
|
8c457f1b2c
|
conflicts resolved, merged from beta
|
2022-04-06 10:27:52 +02:00 |
Miriam Baglioni
|
e77d104951
|
[OC] added / to workflow path
|
2022-04-05 15:07:11 +02:00 |
Miriam Baglioni
|
79336d46c5
|
[Clean Context] first naive implementation of a functionality to clean unwanted contexts from a result. This implementation simply verifies that the main title of the result starts with a given string
|
2022-04-04 15:52:31 +02:00 |
Claudio Atzori
|
873369af1c
|
Merge pull request '[stats wf] added apcs in monitor db' (#207) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#207
|
2022-03-29 15:40:20 +02:00 |
Antonis Lempesis
|
7112806a73
|
views cannot be stored as parquet...
|
2022-03-29 16:37:29 +03:00 |
Antonis Lempesis
|
fff0b3cc19
|
added apcs in monitor db
|
2022-03-29 14:15:31 +03:00 |
Claudio Atzori
|
de85367695
|
Merge pull request '[stats wf] fix: views cannot be stored as parquet...' (#206) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#206
|
2022-03-29 12:51:02 +02:00 |
Antonis Lempesis
|
ee24f3eb2c
|
views cannot be stored as parquet...
|
2022-03-29 13:47:48 +03:00 |
Sandro La Bruzzo
|
1b11010169
|
minor fix
|
2022-03-29 10:59:14 +02:00 |
Claudio Atzori
|
0a0ae84c22
|
[graph raw] DOI based instance URLs on https
|
2022-03-29 10:52:58 +02:00 |
Claudio Atzori
|
9fa3dd78fe
|
Merge pull request '[stats wf] various fixes, organization ids for inst. dashboard' (#205) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#205
|
2022-03-28 22:03:49 +02:00 |
Claudio Atzori
|
96aa2a5d0d
|
Merge branch 'beta' into instance_group_by_url
|
2022-03-28 09:23:52 +02:00 |
Claudio Atzori
|
741bc99c47
|
Merge branch 'beta' into datasource_pdf_consent
|
2022-03-28 09:20:48 +02:00 |
Claudio Atzori
|
61319b2e83
|
updated dhp-schema version; set entity-level dataInfo before & after merging the fields from the group of duplicates
|
2022-03-25 16:38:33 +01:00 |
Antonis Lempesis
|
d8503cd191
|
added moooar organizations
|
2022-03-24 14:02:36 +02:00 |
Miriam Baglioni
|
7b8f85692e
|
[Enrichment country] fixed issues with parameters and workflow args
|
2022-03-23 17:20:23 +01:00 |
Claudio Atzori
|
48d32466e4
|
instances grouped by URL expose only one refereed
|
2022-03-23 14:52:03 +01:00 |
Claudio Atzori
|
f10066547b
|
increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation
|
2022-03-23 12:22:26 +01:00 |
Claudio Atzori
|
43733c1a18
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-03-23 12:14:27 +01:00 |
Antonis Lempesis
|
62f91b0869
|
cleanup
|
2022-03-22 16:17:49 +02:00 |
Antonis Lempesis
|
2e8394ecf8
|
creating aaall tables as parquet
|
2022-03-22 16:16:08 +02:00 |
Antonis Lempesis
|
dcfbeb8142
|
yet more typos
|
2022-03-21 12:36:03 +02:00 |
Miriam Baglioni
|
89fd275480
|
[HostedByMap] added left over from PR and fixed issue on workflow
|
2022-03-21 09:54:45 +01:00 |
miconis
|
c763aded70
|
dependency updated to the new pace-core version
|
2022-03-16 16:41:50 +01:00 |
miconis
|
c959639bd5
|
dependency updated to the new pace-core version
|
2022-03-15 16:33:03 +01:00 |
Miriam Baglioni
|
0f7d8ca2e0
|
[HostedByMap] change on master to align to PR 201 on beta merged as 9f3036c847
|
2022-03-11 15:16:02 +01:00 |
Claudio Atzori
|
f430029596
|
cleanup
|
2022-03-11 14:28:28 +01:00 |
Miriam Baglioni
|
12de9acb0d
|
[Country Propagation] left out from previous commit
|
2022-03-11 14:17:02 +01:00 |
Miriam Baglioni
|
2fbb35ade5
|
merging with branch beta
|
2022-03-11 13:58:10 +01:00 |
Miriam Baglioni
|
4437f9345d
|
[Country Propagation] left out from previous commit
|
2022-03-11 13:57:47 +01:00 |
Miriam Baglioni
|
2b643059fa
|
[Country Propagation] changed the logic to get the collectedfrom at the result level. To fix issue when no instance is created for a result that should have the country associated. Change the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself
|
2022-03-11 13:56:48 +01:00 |
Claudio Atzori
|
f25407bbe2
|
added mapping for datasource consent fields to integrate them in the graph
|
2022-03-11 09:32:42 +01:00 |
Miriam Baglioni
|
2c5087d55a
|
[HostedByMap] download of doaj from json, modification of test resources, deletion of class no more needed for the CSV download
|
2022-03-04 15:18:21 +01:00 |
Miriam Baglioni
|
5d608d6291
|
[HostedByMap] changed the model to include also oaStart date and review process that could be possibly used in the future
|
2022-03-04 11:06:09 +01:00 |
Miriam Baglioni
|
b7c2340952
|
[HostedByMap - DOIBoost] changed to use code moved to common since used also from hostedbymap now
|
2022-03-04 11:05:23 +01:00 |
Miriam Baglioni
|
8a41f63348
|
[HostedByMap] update to download the json instead of the csv
|
2022-03-04 10:38:43 +01:00 |
Miriam Baglioni
|
44b0c03080
|
[HostedByMap] update to download the json instead of the csv
|
2022-03-04 10:37:59 +01:00 |
Antonis Lempesis
|
ad78e505da
|
yet another fix
|
2022-03-03 12:28:12 +02:00 |
Miriam Baglioni
|
3be8737c32
|
[graph-stats] fixed query after the change in the indicator table related to PR#200
|
2022-03-02 14:09:05 +01:00 |
Miriam Baglioni
|
3970651ee1
|
Merge pull request 'fixed query after the change in the indicator table' (#200) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#200
|
2022-03-02 14:05:58 +01:00 |
Antonis Lempesis
|
efeeebfee1
|
fixed query after the change in the indicator table
|
2022-03-02 13:29:25 +02:00 |
Claudio Atzori
|
580d904aae
|
manually merging PR#199 D-Net/dnet-hadoop#199
|
2022-02-25 12:22:50 +01:00 |
Claudio Atzori
|
1932a65d1c
|
Merge pull request '[Stats wf] sprint 6 indicators' (#198) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#198
|
2022-02-25 12:09:18 +01:00 |
Miriam Baglioni
|
f5b0a6f89c
|
[master to beta] fixed issues in test files
|
2022-02-25 10:21:57 +01:00 |
miconis
|
8991d097b4
|
bug fix in the DedupRecordFactory, DataInfo set before merge
|
2022-02-24 17:13:12 +01:00 |
miconis
|
fe1c966cbf
|
Merge branch 'master_202203' of code-repo.d4science.org:D-Net/dnet-hadoop into master_202203
|
2022-02-24 17:08:38 +01:00 |
miconis
|
b0f369dc78
|
bug fix in the DedupRecordFactory, DataInfo set before merge
|
2022-02-24 17:08:24 +01:00 |
Miriam Baglioni
|
859cb7ac9d
|
[DoiBoost AR] changed test resource to be sure the result will always have EMBARGO as value for AccessRight
|
2022-02-24 16:55:32 +01:00 |
Miriam Baglioni
|
a40b59b7d5
|
[ResultToOrgFromInstRepoTest] fixed issue in model of the input resources
|
2022-02-24 16:05:57 +01:00 |
Claudio Atzori
|
66c09b1bc7
|
code formatting
|
2022-02-24 12:58:07 +01:00 |
Claudio Atzori
|
a87c070447
|
conflicts resolved, merged from beta
|
2022-02-24 12:51:31 +01:00 |
Claudio Atzori
|
86cdb7a38f
|
[provision] serialize measures defined on the result level
|
2022-02-23 15:54:18 +01:00 |
Alessia Bardi
|
9d6203f79b
|
test mapping datasource
|
2022-02-23 15:00:53 +01:00 |
Antonis Lempesis
|
3b92a2ab9c
|
added the rest of sprint 6 in monitor db
|
2022-02-23 12:05:57 +02:00 |
Antonis Lempesis
|
87c91f70a2
|
added sprint 6 indicators to monitor db
|
2022-02-22 14:41:48 +02:00 |
Claudio Atzori
|
5226d0a100
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2022-02-18 15:21:07 +01:00 |
Claudio Atzori
|
99f5b14469
|
[graph raw] invisible records stored among the raw graph rather than the claimed subgraph
|
2022-02-18 15:20:57 +01:00 |
Claudio Atzori
|
401dd38074
|
code formatting
|
2022-02-18 15:19:15 +01:00 |
Claudio Atzori
|
cf8443780e
|
added processingchargeamount to the result view
|
2022-02-18 15:17:48 +01:00 |
Sandro La Bruzzo
|
891781ee3f
|
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
|
2022-02-18 11:11:32 +01:00 |
Sandro La Bruzzo
|
d3f03abd51
|
fixed wrong json path
|
2022-02-18 11:11:17 +01:00 |
Claudio Atzori
|
89c7313fc5
|
Merge branch 'beta' into hierarchical_orgs_relations
|
2022-02-17 10:30:04 +01:00 |
dimitrispie
|
58c59f46eb
|
Added Sprint 6
|
2022-02-17 10:21:09 +02:00 |
Antonis Lempesis
|
5772f92dba
|
merged beta changes in hive branch
|
2022-02-15 13:24:51 +02:00 |
Antonis Lempesis
|
393a4ee956
|
fixed yet another typo...
|
2022-02-15 12:56:50 +02:00 |
Sandro La Bruzzo
|
3aa2020b24
|
added script to regenerate hostedBy Map following instruction defined on ticket #7539
updated hosted By Map
|
2022-02-15 11:05:27 +01:00 |
Miriam Baglioni
|
be64055cfe
|
[OpenCitation] changed the name of destination folders
|
2022-02-14 15:49:44 +01:00 |
Miriam Baglioni
|
1490867cc7
|
[OpenCitation] cleaning of the COCI model
|
2022-02-14 14:52:12 +01:00 |
Miriam Baglioni
|
c191080965
|
merging with branch beta
|
2022-02-14 14:49:39 +01:00 |
Alessia Bardi
|
600ede1798
|
serialisation of APCs in the XML records
|
2022-02-11 11:00:20 +01:00 |
Miriam Baglioni
|
5c4043dba8
|
[OpenCitation] refactoring
|
2022-02-08 16:23:05 +01:00 |
Miriam Baglioni
|
759ed519f2
|
[OpenCitation] added logic to avoid the generation of self-citation relations
|
2022-02-08 16:15:34 +01:00 |
Miriam Baglioni
|
b071f8e415
|
[OpenCitation] change to extract each folder in json format just once
|
2022-02-08 15:37:28 +01:00 |
Miriam Baglioni
|
fbc28ee8c3
|
[OpenCitation] change the integration logic to consider dois with commas inside
|
2022-02-07 18:32:08 +01:00 |
Miriam Baglioni
|
78be2975f0
|
[stats-wf]fixed another typo related to PR#193
|
2022-02-07 11:22:08 +01:00 |
Miriam Baglioni
|
1f8302dc37
|
Merge pull request '[stats-wf]fixed yet another typo' (#193) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#193
|
2022-02-07 11:19:26 +01:00 |
Antonis Lempesis
|
5f762cbd09
|
fixed yet another typo
|
2022-02-07 12:09:12 +02:00 |
Alessia Bardi
|
ac8b8f224f
|
Merge branch 'beta' into extendResult
|
2022-02-04 16:43:27 +01:00 |
Miriam Baglioni
|
493caef358
|
[stats-wf]fixed the result_result table related to PR#191
|
2022-02-04 14:51:25 +01:00 |
Miriam Baglioni
|
0547fd6ee7
|
Merge pull request '[stats-wf]fixed the result_result table' (#191) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#191
|
2022-02-04 14:47:31 +01:00 |
Antonis Lempesis
|
ae633c566b
|
fixed the result_result table
|
2022-02-04 15:04:19 +02:00 |
Miriam Baglioni
|
aae667e6b6
|
[APC at the result level] added the APC at the level of the result and modified test class
|
2022-02-04 12:34:25 +01:00 |
Sandro La Bruzzo
|
bcfdf9a0d7
|
iis repository with https
|
2022-02-03 16:49:31 +01:00 |
Miriam Baglioni
|
3c60e53a96
|
[stats-wf]fixed the result_result creation for monitor PR#190 on beta
|
2022-02-03 14:47:08 +01:00 |
Miriam Baglioni
|
89922156c9
|
Merge pull request '[stats-wf]fixed the result_result creation for monitor' (#190) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#190
|
2022-02-03 13:00:56 +01:00 |
Antonis Lempesis
|
c2b44530a3
|
typo...
|
2022-02-03 13:44:07 +02:00 |
Antonis Lempesis
|
dbd2646d59
|
fixed the result_result creation for monitor
|
2022-02-03 12:37:10 +02:00 |
Alessia Bardi
|
2e215abfa8
|
test for instances with URLs for OpenAPC
|
2022-02-02 17:27:44 +01:00 |
Miriam Baglioni
|
37784209c9
|
[dhp-schemas-] updated the version of dhp-schema to 2.10.27 for APC name and id modification
|
2022-02-02 12:46:31 +01:00 |
Miriam Baglioni
|
73eba34d42
|
[UnresolvedEntities] Changed the way to merge the unresolved because the new merge removed the dataInfo from the merged result. Added also data info for subjects
|
2022-02-01 08:38:41 +01:00 |
Miriam Baglioni
|
dce7f5fea8
|
[BULK TAGGING] changed to fix issue that should have been fixed already
|
2022-01-31 08:20:28 +01:00 |
Claudio Atzori
|
8eb75ca169
|
adapted GenerateEntitiesApplicationTest behaviour
|
2022-01-27 16:24:37 +01:00 |
Claudio Atzori
|
af61e44acc
|
ported changes to the GraphCleaningFunctionsTest from 8de9788308
|
2022-01-27 16:19:14 +01:00 |
Claudio Atzori
|
1322379741
|
Merge branch 'beta' into delegated_authorities
|
2022-01-25 14:28:25 +01:00 |
Claudio Atzori
|
59a250337c
|
[graph resolution] drop output path at the beginning
|
2022-01-24 18:02:39 +01:00 |
Claudio Atzori
|
97ad94d7d9
|
[graph resolution] drop output path at the beginning
|
2022-01-24 18:02:07 +01:00 |
Claudio Atzori
|
8de9788308
|
applied fix for avoiding ruling out the invisible (APC) records during the graph cleaning
|
2022-01-24 11:29:22 +01:00 |
Claudio Atzori
|
2f385b3ac6
|
updated dnet workflow profile definitions
|
2022-01-21 13:59:46 +01:00 |
Claudio Atzori
|
dd52bf1bb8
|
copy relations to the graphOutputPath
|
2022-01-21 13:59:29 +01:00 |
Claudio Atzori
|
4983d6536d
|
Merge branch 'beta' into delegated_authorities
|
2022-01-21 13:02:48 +01:00 |
Claudio Atzori
|
f0ea2410e5
|
improved mapping titles from datacite records to consider title types
|
2022-01-21 10:50:34 +01:00 |
Claudio Atzori
|
b37bc277c4
|
reintroduced the hostedby patching to the datacite records
|
2022-01-21 09:15:13 +01:00 |
Claudio Atzori
|
f2fde5566b
|
using helper method from ModelSupport to find the inverse relation descriptor
|
2022-01-20 09:19:07 +01:00 |
Claudio Atzori
|
3b9020c1b7
|
added unit test for the DispatchEntitiesJob
|
2022-01-19 18:15:55 +01:00 |
Claudio Atzori
|
abfa9c6045
|
code formatting
|
2022-01-19 17:17:11 +01:00 |
Claudio Atzori
|
391aa1373b
|
added unit test
|
2022-01-19 17:13:21 +01:00 |
Claudio Atzori
|
44a937f4ed
|
factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources
|
2022-01-19 12:24:52 +01:00 |
Miriam Baglioni
|
a7c4d0d16d
|
[DoiBoost Organizations] added parameter to specify the action in the wf raw_organizations to be able to load the openorgs organization as in the loading step for the construction of the graph
|
2022-01-13 13:52:00 +01:00 |
Miriam Baglioni
|
a75fb8c47a
|
[BipFinderInstanceLevel] change pom to align to the dhp-schema release 2.10.24 and refactoring
|
2022-01-12 18:06:26 +01:00 |
Miriam Baglioni
|
e7d5a39c03
|
[BipFinderInstanceLevel] added tests in test class
|
2022-01-12 17:25:04 +01:00 |
Miriam Baglioni
|
4993666d73
|
[BipFinderInstanceLevel] changed creation of the instance to allow to enrich existing instances with same pid
|
2022-01-12 16:53:47 +01:00 |
Claudio Atzori
|
9acc32faa6
|
[stats wf] final touches for the integration of PRs #166, #179 in the master branch
|
2022-01-12 12:04:31 +01:00 |
dimitrispie
|
b053b0178e
|
Sprint 5 and other changes
|
2022-01-12 11:23:37 +01:00 |
Antonis Lempesis
|
b6b4bc0df9
|
added first indicator of sprint 5
|
2022-01-12 11:20:28 +01:00 |
Antonis Lempesis
|
e91f06f39b
|
fixed typos in indicators. Added extra views in monitor
|
2022-01-12 11:18:28 +01:00 |
Antonis Lempesis
|
3ce1976627
|
fixed column names
|
2022-01-12 11:14:41 +01:00 |
Antonis Lempesis
|
4878d7485c
|
added usage stats
|
2022-01-12 11:13:25 +01:00 |
Antonis Lempesis
|
a4316bafed
|
fixed a typo
|
2022-01-12 11:12:53 +01:00 |
Antonis Lempesis
|
bb17e070d8
|
added result_result relations
|
2022-01-12 11:09:38 +01:00 |
Claudio Atzori
|
a30a98a716
|
Applying PR#166 in the master branch (Added sprint 3&4 of indicators). Merge commit '0df9574a6f5d9d75bc840decb023561ae941f9d6'
|
2022-01-12 10:57:19 +01:00 |
Sandro La Bruzzo
|
57e2c4b749
|
formatted code
|
2022-01-12 09:40:28 +01:00 |
Claudio Atzori
|
0f2144b5e0
|
scalafmt: code formatting
|
2022-01-11 17:03:44 +01:00 |
Claudio Atzori
|
dcd282977c
|
pulled from beta
|
2022-01-11 16:59:41 +01:00 |
Claudio Atzori
|
4f212652ca
|
scalafmt: code formatting
|
2022-01-11 16:57:48 +01:00 |
Sandro La Bruzzo
|
0163dadb7f
|
[doiboost]
- update MAG schema, new field added in version dec-2021
|
2022-01-11 11:05:44 +01:00 |
Miriam Baglioni
|
904e1c2667
|
Merge pull request 'Affiliation Propagation through semantic relation' (#183) from enrichment into beta
Reviewed-on: D-Net/dnet-hadoop#183
|
2022-01-07 19:18:16 +01:00 |
Miriam Baglioni
|
064f9bbd87
|
[AFFPropSR] added new parameter for the number of iterations and new code for just one iteration
|
2022-01-07 18:58:51 +01:00 |
Miriam Baglioni
|
b7e450070b
|
[SDG-FOS] to import SDG file not considering the header
|
2022-01-07 12:13:26 +01:00 |
Miriam Baglioni
|
639190370a
|
merging with branch beta
|
2022-01-07 11:29:25 +01:00 |
Miriam Baglioni
|
adccc2346a
|
[SDG-FOS] to lower case for the doi
|
2022-01-07 11:28:50 +01:00 |
Claudio Atzori
|
8ae46ca789
|
OAF-store-graph mdstores: further fix for PR#180
|
2022-01-05 15:52:15 +01:00 |
Claudio Atzori
|
908294d86e
|
OAF-store-graph mdstores: further fix for PR#180
|
2022-01-05 15:49:05 +01:00 |
Claudio Atzori
|
3bd3653be9
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:39 +01:00 |
Claudio Atzori
|
3dc48c7ab5
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:27 +01:00 |
Claudio Atzori
|
f82db765db
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:15 +01:00 |
Claudio Atzori
|
8d13effa31
|
test for the tolerant deserialisation utility method
|
2022-01-04 16:38:26 +01:00 |
Claudio Atzori
|
9458ee7938
|
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
|
2022-01-04 16:38:09 +01:00 |
Claudio Atzori
|
58f8998e3d
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 15:02:09 +01:00 |
Claudio Atzori
|
174c3037e1
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:40:16 +01:00 |
Claudio Atzori
|
045d767013
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:23:01 +01:00 |
Claudio Atzori
|
bd59b58efb
|
test for the tolerant deserialisation utility method
|
2022-01-04 11:26:56 +01:00 |
Claudio Atzori
|
a6977197b3
|
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
|
2022-01-03 17:25:26 +01:00 |
Miriam Baglioni
|
4c60ee1718
|
merging with branch beta
|
2022-01-03 15:24:02 +01:00 |
Miriam Baglioni
|
92fd69e25d
|
[SDG-FOS] alternative way to get input data to avoid OOM error while getting csv
|
2022-01-03 15:23:06 +01:00 |
Claudio Atzori
|
fe7e5f4748
|
Merge pull request '[stats wf] result_result relations, usage stats, monitor views, indicator for sprint 5' (#179) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#179
|
2022-01-03 14:52:11 +01:00 |
Claudio Atzori
|
bcea4e3a9b
|
added dnet workflow profile for the orchestration of the simplified and complete graph construction and processing pipeline, where the IIS works on the non-deduplicated graph
|
2022-01-03 14:33:00 +01:00 |
Miriam Baglioni
|
a706ba0c08
|
Merge pull request 'SDG Integration' (#178) from SDG into beta
Reviewed-on: D-Net/dnet-hadoop#178
|
2021-12-23 14:50:00 +01:00 |
Antonis Lempesis
|
81ee654271
|
added result_result relations
|
2021-12-23 15:46:17 +02:00 |
Antonis Lempesis
|
7551e52e95
|
fixed a typo
|
2021-12-23 15:33:53 +02:00 |
Miriam Baglioni
|
7a1b440413
|
[SDG] logic to create unresolved entities out of SDG input. This also changes some classes related to FOS to reuse the same code. The code under createunresolvedentities creates results with the merged update of the inputs provided (bip at the level of the instance, fos and sdg for subjects)
|
2021-12-23 13:24:28 +01:00 |
Claudio Atzori
|
cccb16900c
|
https://support.openaire.eu/issues/7330 normalising DOI urls
|
2021-12-23 12:33:53 +01:00 |
Miriam Baglioni
|
2a67ee13ec
|
[SDG] added model class
|
2021-12-23 10:37:52 +01:00 |
Miriam Baglioni
|
69e9ea9eeb
|
[Graph Dump] Test for extraction of rels from entities extended
|
2021-12-23 10:15:30 +01:00 |
Miriam Baglioni
|
31b26d48ac
|
[Graph Dump] fixed issue on extraction of relation between entities and contexts: the relationship name and type were swapped
|
2021-12-23 10:09:47 +01:00 |
Miriam Baglioni
|
10579c0dd0
|
[FOS]fixed doi value in test
|
2021-12-22 23:10:16 +01:00 |
Miriam Baglioni
|
6116fc5d40
|
[FOS]added logic to include only different subjects. Test refactoring and extension
|
2021-12-22 23:04:22 +01:00 |
Miriam Baglioni
|
b81efb6a9d
|
[FOS]changed the mapping between the csv and the model. Changed Test classes and resources
|
2021-12-22 21:40:35 +01:00 |
Miriam Baglioni
|
de6c4c8968
|
[FOS]creation of the unresolved entities: removed the split for the doi, no longer needed since each row is related to one doi
|
2021-12-22 16:44:44 +01:00 |
Miriam Baglioni
|
34ac56565d
|
refactoring
|
2021-12-22 16:28:11 +01:00 |
Miriam Baglioni
|
20ef1d657f
|
refactoring
|
2021-12-22 16:26:36 +01:00 |
Miriam Baglioni
|
813f856d3f
|
[BipFinder] removing left over parameter in wf
|
2021-12-22 16:11:12 +01:00 |
Miriam Baglioni
|
2c126ed014
|
[BipFinder] create unresolved entities with measures at the level of the instance
|
2021-12-22 16:03:41 +01:00 |
Miriam Baglioni
|
0807fdb65a
|
[BipFinder] remove not needed resources
|
2021-12-22 15:37:00 +01:00 |
Miriam Baglioni
|
b5e11a3a0a
|
[BipFinder] put in common package BipFinder model
|
2021-12-22 15:33:05 +01:00 |
Miriam Baglioni
|
c5739c4266
|
[BipFinder] create action set for the measures at the level of the result
|
2021-12-22 15:08:33 +01:00 |
Miriam Baglioni
|
da5f6260aa
|
merging with branch beta
|
2021-12-22 13:12:02 +01:00 |
Miriam Baglioni
|
be0acccf42
|
Merge branch 'beta' into dump
|
2021-12-22 12:39:57 +01:00 |
Antonis Lempesis
|
16539d7360
|
added usage stats
|
2021-12-22 02:54:42 +02:00 |
Antonis Lempesis
|
3edd661608
|
fixed column names
|
2021-12-21 22:55:04 +02:00 |
Antonis Lempesis
|
a4c0cbb98c
|
fixed typos in indicators. Added extra views in monitor
|
2021-12-21 15:54:38 +02:00 |
Miriam Baglioni
|
e24a7f3496
|
merging with branch beta
|
2021-12-21 13:57:19 +01:00 |
Miriam Baglioni
|
d1ae219cb4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-21 13:55:53 +01:00 |
Miriam Baglioni
|
460e6b95d6
|
[Graph Dump] -
|
2021-12-21 13:48:03 +01:00 |
Sandro La Bruzzo
|
3920d68992
|
Fixed workflow generation of delta in datacite
|
2021-12-21 11:41:49 +01:00 |
Antonis Lempesis
|
58996972d9
|
added first indicator of sprint 5
|
2021-12-21 03:35:04 +02:00 |
dimitrispie
|
c1cdec09a9
|
Sprint 5 and other changes
|
2021-12-20 19:23:57 +02:00 |
Miriam Baglioni
|
3cc1b7b153
|
merging with branch beta
|
2021-12-15 17:25:02 +01:00 |
Miriam Baglioni
|
63b648b0dd
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-15 12:41:15 +01:00 |
Antonis Lempesis
|
f0b523cfa7
|
removed the too restrictive clause. will discuss again
|
2021-12-15 12:32:15 +01:00 |
Sandro La Bruzzo
|
b881ee5ef8
|
[scholexplorer]
- implemented generation of scholix of delta update of datacite
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
63952018c0
|
[scholexplorer]
-moved SparkRetrieveDataciteDelta in scala folder
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
e5bff64f2e
|
[scholexplorer]
- Minor fix on SparkConvertRDDtoDataset
-first implementation of retrieve datacite dump
|
2021-12-15 11:25:32 +01:00 |
Claudio Atzori
|
1790fa2d44
|
Merge branch 'beta' into affiliationPropagation
|
2021-12-14 15:26:56 +01:00 |
Miriam Baglioni
|
56409d1281
|
[Dump] resolved conflicts with beta and merging
|
2021-12-14 15:03:45 +01:00 |
Miriam Baglioni
|
22d4b5619b
|
[BipFinder Result] last changes to test and resources files
|
2021-12-14 14:54:13 +01:00 |
Miriam Baglioni
|
6fb6236cd4
|
changed the way to produce the AS for bipFinder.
|
2021-12-14 14:51:14 +01:00 |
Miriam Baglioni
|
573bd17cbb
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-14 11:12:25 +01:00 |
Miriam Baglioni
|
4eb8276493
|
-
|
2021-12-14 11:12:17 +01:00 |
Antonis Lempesis
|
ddd34087c2
|
removed 'stored as parquet' from views..
|
2021-12-13 23:05:00 +02:00 |
Antonis Lempesis
|
915f758c82
|
moving data to impala cluster and creating shadow databases there
|
2021-12-13 16:26:14 +02:00 |
Miriam Baglioni
|
936578aaf1
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-13 15:01:47 +01:00 |
Miriam Baglioni
|
8d755cca80
|
-
|
2021-12-13 15:01:40 +01:00 |
Claudio Atzori
|
98eb292c59
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 13:27:20 +01:00 |
Claudio Atzori
|
5e17247bb6
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 11:48:40 +01:00 |
Claudio Atzori
|
b70ecccea0
|
avoid NPEs merging XMLInstance(s)
|
2021-12-12 12:37:38 +01:00 |
Claudio Atzori
|
c1b6ae47cd
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:47:41 +01:00 |