Commit Graph

399 Commits

Author SHA1 Message Date
Michele Artini 9506d80ddc changed sql to select accepted datasources 2024-03-04 08:25:40 +01:00
Michele Artini c2b6841eb0 opendoar datasource filter 2024-03-01 15:32:56 +01:00
Michele Artini be7f327e88 opendoar datasource filter 2024-03-01 13:38:36 +01:00
Michele Artini 32f4d6f691 reports for types 2024-03-01 11:43:37 +01:00
Michele Artini 71204a8056 some fields in stats 2024-02-29 10:17:31 +01:00
Michele Artini 5ddbef3a5b new stats 2024-02-28 14:34:09 +01:00
Michele Artini 3d14bef381 OpenDoar reports 2024-02-28 10:51:13 +01:00
Michele Artini f8cf7ffbcb stats 2024-02-22 14:01:11 +01:00
Michele Artini d2b7541583 fixed a problem with Dataset model 2024-02-16 11:36:46 +01:00
Michele Artini 8ffdd9747d added id to BaseRecordInfo 2024-02-15 14:27:50 +01:00
Michele Artini da65728afe produce a parquet file 2024-02-15 14:04:17 +01:00
Michele Artini e254720377 fixed path reports 2024-02-15 08:52:28 +01:00
Michele Artini 8d85c1e97e used a StAX parser 2024-02-15 08:21:52 +01:00
Michele Artini b42e2b4d61 fixed log class 2024-02-14 15:52:31 +01:00
Michele Artini 773346f638 increased memory 2024-02-14 14:40:27 +01:00
Michele Artini 2e11197142 removed invalid deletion 2024-02-14 11:59:30 +01:00
Michele Artini ddd6a7ceb3 minor fixes 2024-02-14 11:39:37 +01:00
Michele Artini 963a2500be new reports in hadoop job 2024-02-14 10:37:39 +01:00
Michele Artini 4b1ecad4e2 prepared a job to analyze the BASE records 2024-02-13 13:48:26 +01:00
Michele Artini dd7350ecf2 fixed a problem with xpaths 2024-02-13 08:36:42 +01:00
Michele Artini 265bfd364d refactoring 2024-02-12 15:35:36 +01:00
Michele Artini 16766c514e refactoring 2024-02-12 12:19:57 +01:00
Michele Artini 5add433b74 partial refactoring 2024-02-09 14:33:04 +01:00
Michele Artini c974c75f83 partial refactoring 2024-02-09 12:36:20 +01:00
Michele Artini c6db6335b9 prepare filtering for base import 2024-02-06 15:10:29 +01:00
Michele Artini abcd81bba0 first implementation of the collection plugin for BASE 2024-02-05 15:19:41 +01:00
Miriam Baglioni f612125939 fix issue on FoS integration. Removing the null values from FoS 2024-01-12 10:20:28 +01:00
Sandro La Bruzzo 5e22b67b8a Merge remote-tracking branch 'origin/beta' into orcid_import 2023-11-30 15:27:46 +01:00
Sandro La Bruzzo 6ce36b3e41 Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables 2023-11-14 12:04:29 +01:00
Serafeim Chatzopoulos a82aaf57b2 Renaming input param for crossref input path 2023-10-25 12:05:02 -07:00
Serafeim Chatzopoulos aad5982bf1 Change the description of the workflow 2023-10-20 12:48:21 +03:00
Serafeim Chatzopoulos 6b19dcee80 Add actionset creation for pubmed affiliations 2023-10-19 19:58:25 +03:00
Miriam Baglioni a431b04814 leftover for the properties and removal of bipfinder 2023-10-10 12:53:57 +02:00
Miriam Baglioni 110ce4b40f extend the fos model to include the level4 and the scores for level3 and level4. removed bip indicators from the instance 2023-10-10 09:46:40 +02:00
Claudio Atzori 84a58802ab [OC] using the common pid cleaning function 2023-10-06 14:48:05 +02:00
Claudio Atzori 46034630cf [OC] compress the output actionset 2023-10-06 14:42:02 +02:00
Claudio Atzori ee8a39e7d2 cleanup and refinements 2023-10-04 12:32:05 +02:00
Miriam Baglioni 9898470b0e Addressing comments in #340#issuecomment-10592 2023-10-02 12:54:16 +02:00
Miriam Baglioni e84f5b5e64 extended existing code to accommodate import of POCI from open citation 2023-10-02 09:25:16 +02:00
Claudio Atzori 15666e86a8 added collectedfrom to the affiliation relations imported from Crossref 2023-09-04 15:56:06 +02:00
Serafeim Chatzopoulos 7de0164c26 Fix import of affiliations relations from Crossref 2023-09-04 16:04:41 +03:00
Miriam Baglioni 9c8b41475a Merge pull request '8172_impact_indicators_workflow' (#284) from 8172_impact_indicators_workflow into beta; Reviewed-on: #284 2023-08-14 15:50:48 +02:00
Serafeim Chatzopoulos 97c1ba8918 Merge actionsets of results and projects 2023-08-11 15:56:53 +03:00
Serafeim Chatzopoulos 7cefe2665b Remove unnecessary classes 2023-07-28 19:14:39 +03:00
Serafeim Chatzopoulos 26a92ce762 Merge branch '8876' of https://code-repo.d4science.org/D-Net/dnet-hadoop into 8876 2023-07-28 19:03:57 +03:00
Serafeim Chatzopoulos ebfba38ab6 Add changes from code review 2023-07-28 19:03:47 +03:00
Serafeim Chatzopoulos eb8684a8cf Merge branch 'beta' into 8876 2023-07-28 13:39:33 +02:00
Giambattista Bloisi e64c2854a3 Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface; JsonPath cache contention fixed by using a ConcurrentHashMap; Blacklist filtering performance improvement; Minor performance improvements when evaluating similarity; Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only) 2023-07-24 15:36:24 +02:00
Serafeim Chatzopoulos 2cc5b1a39b Fixes in workflow.xml 2023-07-21 15:26:50 +03:00
Serafeim Chatzopoulos be320ba3c1 Indentation fixes 2023-07-17 16:04:21 +03:00