Michele Artini
9506d80ddc
changed sql to select accepted datasources
2024-03-04 08:25:40 +01:00
Michele Artini
c2b6841eb0
opendoar datasource filter
2024-03-01 15:32:56 +01:00
Michele Artini
be7f327e88
opendoar datasource filter
2024-03-01 13:38:36 +01:00
Michele Artini
32f4d6f691
reports for types
2024-03-01 11:43:37 +01:00
Michele Artini
71204a8056
some fields in stats
2024-02-29 10:17:31 +01:00
Michele Artini
5ddbef3a5b
new stats
2024-02-28 14:34:09 +01:00
Michele Artini
3d14bef381
OpenDoar reports
2024-02-28 10:51:13 +01:00
Michele Artini
f8cf7ffbcb
stats
2024-02-22 14:01:11 +01:00
Michele Artini
d2b7541583
fixed a problem with Dataset model
2024-02-16 11:36:46 +01:00
Michele Artini
8ffdd9747d
added id to BaseRecordInfo
2024-02-15 14:27:50 +01:00
Michele Artini
da65728afe
produce a parquet file
2024-02-15 14:04:17 +01:00
Michele Artini
e254720377
fixed path reports
2024-02-15 08:52:28 +01:00
Michele Artini
8d85c1e97e
used a StAX parser
2024-02-15 08:21:52 +01:00
Michele Artini
b42e2b4d61
fixed log class
2024-02-14 15:52:31 +01:00
Michele Artini
773346f638
increased memory
2024-02-14 14:40:27 +01:00
Michele Artini
2e11197142
removed invalid deletion
2024-02-14 11:59:30 +01:00
Michele Artini
ddd6a7ceb3
minor fixes
2024-02-14 11:39:37 +01:00
Michele Artini
963a2500be
new reports in hadoop job
2024-02-14 10:37:39 +01:00
Michele Artini
4b1ecad4e2
prepared a job to analyze the BASE records
2024-02-13 13:48:26 +01:00
Michele Artini
dd7350ecf2
fixed a problem with xpaths
2024-02-13 08:36:42 +01:00
Michele Artini
265bfd364d
refactoring
2024-02-12 15:35:36 +01:00
Michele Artini
16766c514e
refactoring
2024-02-12 12:19:57 +01:00
Michele Artini
5add433b74
partial refactoring
2024-02-09 14:33:04 +01:00
Michele Artini
c974c75f83
partial refactoring
2024-02-09 12:36:20 +01:00
Michele Artini
c6db6335b9
prepare filtering for base import
2024-02-06 15:10:29 +01:00
Michele Artini
abcd81bba0
first implementation of the collection plugin for BASE
2024-02-05 15:19:41 +01:00
Miriam Baglioni
f612125939
fix issue on FoS integration. Removing the null values from FoS
2024-01-12 10:20:28 +01:00
Sandro La Bruzzo
5e22b67b8a
Merge remote-tracking branch 'origin/beta' into orcid_import
2023-11-30 15:27:46 +01:00
Sandro La Bruzzo
6ce36b3e41
Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables
2023-11-14 12:04:29 +01:00
Serafeim Chatzopoulos
a82aaf57b2
Renaming input param for crossref input path
2023-10-25 12:05:02 -07:00
Serafeim Chatzopoulos
aad5982bf1
Change the description of the workflow
2023-10-20 12:48:21 +03:00
Serafeim Chatzopoulos
6b19dcee80
Add actionset creation for pubmed affiliations
2023-10-19 19:58:25 +03:00
Miriam Baglioni
a431b04814
leftover for the properties and removal of bipfinder
2023-10-10 12:53:57 +02:00
Miriam Baglioni
110ce4b40f
extend the fos model to include the level4 and the scores for level3 and level4. removed bip indicators from the instance
2023-10-10 09:46:40 +02:00
Claudio Atzori
84a58802ab
[OC] using the common pid cleaning function
2023-10-06 14:48:05 +02:00
Claudio Atzori
46034630cf
[OC] compress the output actionset
2023-10-06 14:42:02 +02:00
Claudio Atzori
ee8a39e7d2
cleanup and refinements
2023-10-04 12:32:05 +02:00
Miriam Baglioni
9898470b0e
Addressing comments in #340 #issuecomment-10592
2023-10-02 12:54:16 +02:00
Miriam Baglioni
e84f5b5e64
extended existing code to accommodate import of POCI from open citation
2023-10-02 09:25:16 +02:00
Claudio Atzori
15666e86a8
added collectedfrom to the affiliation relations imported from Crossref
2023-09-04 15:56:06 +02:00
Serafeim Chatzopoulos
7de0164c26
Fix import of affiliations relations from Crossref
2023-09-04 16:04:41 +03:00
Miriam Baglioni
9c8b41475a
Merge pull request '8172_impact_indicators_workflow' (#284) from 8172_impact_indicators_workflow into beta
Reviewed-on: #284
2023-08-14 15:50:48 +02:00
Serafeim Chatzopoulos
97c1ba8918
Merge actionsets of results and projects
2023-08-11 15:56:53 +03:00
Serafeim Chatzopoulos
7cefe2665b
Remove unnecessary classes
2023-07-28 19:14:39 +03:00
Serafeim Chatzopoulos
26a92ce762
Merge branch '8876' of https://code-repo.d4science.org/D-Net/dnet-hadoop into 8876
2023-07-28 19:03:57 +03:00
Serafeim Chatzopoulos
ebfba38ab6
Add changes from code review
2023-07-28 19:03:47 +03:00
Serafeim Chatzopoulos
eb8684a8cf
Merge branch 'beta' into 8876
2023-07-28 13:39:33 +02:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Serafeim Chatzopoulos
2cc5b1a39b
Fixes in workflow.xml
2023-07-21 15:26:50 +03:00
Serafeim Chatzopoulos
be320ba3c1
Indentation fixes
2023-07-17 16:04:21 +03:00