Commit Graph

3661 Commits

Author SHA1 Message Date
Claudio Atzori 4f5ba0ed52 [graph cleaning] WIP: refactoring of the cleaning stages, unit tests 2023-03-21 14:41:20 +01:00
Claudio Atzori 6d3d18d8b5 [graph cleaning] WIP: refactoring of the cleaning stages 2023-03-16 17:23:36 +01:00
Claudio Atzori 518618f1a9 [graph cleaning] avoid overwriting the subject class to 'keyword' for those with provenance 'subject:fos' 2023-03-14 15:22:47 +01:00
Claudio Atzori 41e00bcd07 [graph provision] avoid parsing the XML records again; apparently the escaped XML characters get unescaped, invalidating the record 2023-03-13 15:19:49 +01:00
Claudio Atzori 24e2fd828b code formatting 2023-03-08 21:17:08 +01:00
Claudio Atzori e28d395e87 [aggregator graph] using dedicated path to sync claims, adjusted paths with wildcards 2023-03-08 21:16:52 +01:00
Claudio Atzori 5b8fd37314 [aggregator graph] using dedicated path to sync claims 2023-03-08 15:28:14 +01:00
Claudio Atzori 7fd89566c2 [aggregator graph] handle paths including wildcards 2023-03-08 12:43:00 +01:00
Miriam Baglioni 588aca5ce4 Merge pull request 'h2020classification' (#280) from h2020classification into beta (Reviewed-on: #280) 2023-03-03 09:29:10 +01:00
Claudio Atzori 8ec0d62d91 pre-group the records in each table before joining the contents from BETA and PROD together 2023-03-02 14:49:19 +01:00
Miriam Baglioni 0fff98a14c [ECclassification] removed print 2023-03-02 11:46:57 +01:00
Miriam Baglioni b0c2f7e526 [ECclassification] removed not needed resources 2023-03-02 11:44:48 +01:00
Miriam Baglioni d4fc62c2f6 merging with branch beta 2023-03-02 11:14:54 +01:00
Miriam Baglioni de8ad1caef [ECclassification] new implementation for the H2020 classification 2023-03-02 11:14:03 +01:00
Claudio Atzori db9dad4aa7 [actionmanager] increased spark.sql.shuffle.partitions for publication, dataset, relation records 2023-03-02 09:11:37 +01:00
Miriam Baglioni c1f9848953 [ECclassification] added new classes 2023-03-01 15:29:11 +01:00
Claudio Atzori 6f488547a7 ignore non processable records 2023-03-01 14:49:51 +01:00
Claudio Atzori 7d263f265e adjusted logs 2023-03-01 11:58:07 +01:00
Claudio Atzori 16ad42e8f3 code formatting 2023-03-01 10:22:13 +01:00
Claudio Atzori 9c59dac859 followup changes reorganising the mdstore synchronisation mechanism 2023-03-01 10:16:20 +01:00
Miriam Baglioni ad745c0aa3 [CrossrefFunderMapping] fixed issues on funder name 2023-02-28 14:58:27 +01:00
Miriam Baglioni 4f2df876cd [ECclassification] new implementation first try 2023-02-28 14:44:00 +01:00
Claudio Atzori 2f7346e9cf WIP monodirectional citations, Datacite 2023-02-28 13:30:51 +01:00
Claudio Atzori 0559d8b412 WIP monodirectional citations 2023-02-28 10:57:32 +01:00
Sandro La Bruzzo 69fa616490 removed wrong content 2023-02-28 10:27:38 +01:00
Sandro La Bruzzo 832a75d012 added mapping for crossref funder 2023-02-28 10:16:34 +01:00
Sandro La Bruzzo 78e51c182a Added missing parameter to raw all workflow 2023-02-28 10:16:01 +01:00
Claudio Atzori 7aebedb43c code formatting 2023-02-27 11:51:27 +01:00
Miriam Baglioni 80987801d7 [FoS] added check for null on level1 subject 2023-02-27 11:40:22 +01:00
Claudio Atzori 31e97c2a6b [unresolved entities] updated oozie wf node labels 2023-02-27 11:38:29 +01:00
Miriam Baglioni 23112929e9 [FoS] changed the default separator from comma to tab to solve the issue in subject value split 2023-02-27 10:18:39 +01:00
Serafeim Chatzopoulos 0b5bf53b45 Remove unnecessary indexed fields from Solr 2023-02-23 12:42:42 +02:00
Michele Artini fddcf701e9 updated the order of the compatibilities 2023-02-22 12:07:09 +01:00
Claudio Atzori 0c1be41b30 code formatting 2023-02-22 10:15:25 +01:00
Claudio Atzori 99cd7761aa cleanup of non necessary dhp-monitor-update workflow 2023-02-22 10:10:22 +01:00
Claudio Atzori cd3a51a15f Merge branch 'beta' into 8232-mdstore-synch-improve 2023-02-22 09:57:07 +01:00
Claudio Atzori 477a7c416f Merge branch 'beta' into UsageCountOnProjectAndDatasource 2023-02-22 09:55:51 +01:00
Claudio Atzori c20c1c9159 Merge pull request 'Added 4 institutions:' (#261) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: #261) 2023-02-22 09:53:45 +01:00
Miriam Baglioni d617c3e812 [DOIBoost] extended mapping for funder #8407 2023-02-20 14:45:27 +01:00
Miriam Baglioni 016337a0f9 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-02-16 15:54:59 +01:00
Sandro La Bruzzo 118c1fc3b3 Merge remote-tracking branch 'origin/beta' into beta 2023-02-15 10:29:28 +01:00
Sandro La Bruzzo a8ac79fa25 Added citation relation on crossref Mapping 2023-02-15 10:29:13 +01:00
Claudio Atzori 9a03f71db1 code formatting 2023-02-13 16:25:47 +01:00
Michele Artini 554df257ab null values in date range conditions 2023-02-13 16:15:32 +01:00
Miriam Baglioni 5cf902a2b0 [UsageCount] changed query to make the sum be computed via sql instead of grouping 2023-02-10 16:16:37 +01:00
Miriam Baglioni f803530df6 [UsageCount] fixed query 2023-02-10 15:50:56 +01:00
Miriam Baglioni bb5bba51b3 [UsageCount] extended test 2023-02-09 19:08:30 +01:00
Miriam Baglioni 85e53fad00 [UsageCount] addition of usagecount for Projects and datasources. Extension of the action set created for the results with new entities for projects and datasources. Extension of the resource set and modification of the testing class 2023-02-09 18:59:45 +01:00
Sandro La Bruzzo 8920932dd8 Code formatted 2023-02-08 11:34:18 +01:00
Sandro La Bruzzo 0b9819f1ab Code formatted 2023-02-08 10:32:33 +01:00