Claudio Atzori
|
b502f86523
|
fixed input path supplemented to GetDatasourceFromCountry; adjusted the various spark.sql.shuffle.partitions
|
2023-03-24 13:09:12 +01:00 |
Claudio Atzori
|
c07857fa37
|
[graph cleaning] unit tests & cleanup
|
2023-03-23 15:57:47 +01:00 |
Claudio Atzori
|
90e61a8aba
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 15:03:26 +01:00 |
Claudio Atzori
|
488d9a5eaa
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-23 10:41:13 +01:00 |
Claudio Atzori
|
4f5ba0ed52
|
[graph cleaning] WIP: refactoring of the cleaning stages, unit tests
|
2023-03-21 14:41:20 +01:00 |
Claudio Atzori
|
6d3d18d8b5
|
[graph cleaning] WIP: refactoring of the cleaning stages
|
2023-03-16 17:23:36 +01:00 |
Claudio Atzori
|
518618f1a9
|
[graph cleaning] avoid overwriting the subject class to 'keyword' for those with provenance 'subject:fos'
|
2023-03-14 15:22:47 +01:00 |
Claudio Atzori
|
41e00bcd07
|
[graph provision] avoid parsing the XML records again; apparently the escaped XML characters get unescaped, invalidating the record
|
2023-03-13 15:19:49 +01:00 |
Claudio Atzori
|
24e2fd828b
|
code formatting
|
2023-03-08 21:17:08 +01:00 |
Claudio Atzori
|
e28d395e87
|
[aggregator graph] using dedicated path to sync claims, adjusted paths with wildcards
|
2023-03-08 21:16:52 +01:00 |
Claudio Atzori
|
5b8fd37314
|
[aggregator graph] using dedicated path to sync claims
|
2023-03-08 15:28:14 +01:00 |
Claudio Atzori
|
7fd89566c2
|
[aggregator graph] handle paths including wildcards
|
2023-03-08 12:43:00 +01:00 |
Miriam Baglioni
|
588aca5ce4
|
Merge pull request 'h2020classification' (#280) from h2020classification into beta
Reviewed-on: D-Net/dnet-hadoop#280
|
2023-03-03 09:29:10 +01:00 |
Claudio Atzori
|
8ec0d62d91
|
pre-group the records in each table before joining the contents from BETA and PROD together
|
2023-03-02 14:49:19 +01:00 |
Miriam Baglioni
|
0fff98a14c
|
[ECclassification] removed print
|
2023-03-02 11:46:57 +01:00 |
Miriam Baglioni
|
b0c2f7e526
|
[ECclassification] removed unneeded resources
|
2023-03-02 11:44:48 +01:00 |
Miriam Baglioni
|
d4fc62c2f6
|
merging with branch beta
|
2023-03-02 11:14:54 +01:00 |
Miriam Baglioni
|
de8ad1caef
|
[ECclassification] new implementation for the H2020 classification
|
2023-03-02 11:14:03 +01:00 |
Claudio Atzori
|
db9dad4aa7
|
[actionmanager] increased spark.sql.shuffle.partitions for publication, dataset, relation records
|
2023-03-02 09:11:37 +01:00 |
Miriam Baglioni
|
c1f9848953
|
[ECclassification] added new classes
|
2023-03-01 15:29:11 +01:00 |
Claudio Atzori
|
6f488547a7
|
ignore non processable records
|
2023-03-01 14:49:51 +01:00 |
Claudio Atzori
|
7d263f265e
|
adjusted logs
|
2023-03-01 11:58:07 +01:00 |
Claudio Atzori
|
16ad42e8f3
|
code formatting
|
2023-03-01 10:22:13 +01:00 |
Claudio Atzori
|
9c59dac859
|
followup changes reorganising the mdstore synchronisation mechanism
|
2023-03-01 10:16:20 +01:00 |
Miriam Baglioni
|
ad745c0aa3
|
[CrossrefFunderMapping] fixed issues on funder name
|
2023-02-28 14:58:27 +01:00 |
Miriam Baglioni
|
4f2df876cd
|
[ECclassification] new implementation first try
|
2023-02-28 14:44:00 +01:00 |
Claudio Atzori
|
2f7346e9cf
|
WIP monodirectional citations, Datacite
|
2023-02-28 13:30:51 +01:00 |
Claudio Atzori
|
0559d8b412
|
WIP monodirectional citations
|
2023-02-28 10:57:32 +01:00 |
Sandro La Bruzzo
|
69fa616490
|
removed wrong content
|
2023-02-28 10:27:38 +01:00 |
Sandro La Bruzzo
|
832a75d012
|
added mapping for crossref funder
|
2023-02-28 10:16:34 +01:00 |
Sandro La Bruzzo
|
78e51c182a
|
Added missing parameter to raw all workflow
|
2023-02-28 10:16:01 +01:00 |
Claudio Atzori
|
7aebedb43c
|
code formatting
|
2023-02-27 11:51:27 +01:00 |
Miriam Baglioni
|
80987801d7
|
[FoS] added check for null on level1 subject
|
2023-02-27 11:40:22 +01:00 |
Claudio Atzori
|
31e97c2a6b
|
[unresolved entities] updated oozie wf node labels
|
2023-02-27 11:38:29 +01:00 |
Miriam Baglioni
|
23112929e9
|
[FoS] changed the default separator from comma to tab to solve the issue in subject value split
|
2023-02-27 10:18:39 +01:00 |
Serafeim Chatzopoulos
|
0b5bf53b45
|
Remove unnecessary indexed fields from Solr
|
2023-02-23 12:42:42 +02:00 |
Michele Artini
|
fddcf701e9
|
updated the order of the compatibilities
|
2023-02-22 12:07:09 +01:00 |
Claudio Atzori
|
0c1be41b30
|
code formatting
|
2023-02-22 10:15:25 +01:00 |
Claudio Atzori
|
99cd7761aa
|
cleanup of non necessary dhp-monitor-update workflow
|
2023-02-22 10:10:22 +01:00 |
Claudio Atzori
|
cd3a51a15f
|
Merge branch 'beta' into 8232-mdstore-synch-improve
|
2023-02-22 09:57:07 +01:00 |
Claudio Atzori
|
477a7c416f
|
Merge branch 'beta' into UsageCountOnProjectAndDatasource
|
2023-02-22 09:55:51 +01:00 |
Claudio Atzori
|
c20c1c9159
|
Merge pull request 'Added 4 institutions:' (#261) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#261
|
2023-02-22 09:53:45 +01:00 |
Miriam Baglioni
|
d617c3e812
|
[DOIBoost] extended mapping for funder #8407
|
2023-02-20 14:45:27 +01:00 |
Miriam Baglioni
|
016337a0f9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2023-02-16 15:54:59 +01:00 |
Sandro La Bruzzo
|
118c1fc3b3
|
Merge remote-tracking branch 'origin/beta' into beta
|
2023-02-15 10:29:28 +01:00 |
Sandro La Bruzzo
|
a8ac79fa25
|
Added citation relation on crossref Mapping
|
2023-02-15 10:29:13 +01:00 |
Claudio Atzori
|
9a03f71db1
|
code formatting
|
2023-02-13 16:25:47 +01:00 |
Michele Artini
|
554df257ab
|
null values in date range conditions
|
2023-02-13 16:15:32 +01:00 |
Miriam Baglioni
|
5cf902a2b0
|
[UsageCount] changed query to make the sum be computed via sql instead of grouping
|
2023-02-10 16:16:37 +01:00 |
Miriam Baglioni
|
f803530df6
|
[UsageCount] fixed query
|
2023-02-10 15:50:56 +01:00 |