Commit Graph

3918 Commits

Author SHA1 Message Date
Claudio Atzori bc05b6168a [maven-release-plugin] rollback the release of dhp-1.2.4 2022-04-07 11:49:06 +02:00
Claudio Atzori 505420fd61 [maven-release-plugin] prepare for next development iteration 2022-04-07 11:34:06 +02:00
Claudio Atzori 66e718981e [maven-release-plugin] prepare release dhp-1.2.4 2022-04-07 11:34:02 +02:00
Serafeim Chatzopoulos e612489670 Add fileGZip collector plugin and respective test 2022-04-06 19:12:44 +03:00
Claudio Atzori 4190c9f6bc [graph raw] avoid NPEs importing datasource consent fields 2022-04-06 15:34:31 +02:00
Claudio Atzori 05fafa1408 [graph raw] avoid NPEs importing datasource consent fields 2022-04-06 15:23:50 +02:00
Antonis Lempesis c442c91f89 computing stats in each step 2022-04-06 12:40:02 +03:00
Claudio Atzori 8c457f1b2c conflicts resolved, merged from beta 2022-04-06 10:27:52 +02:00
Miriam Baglioni e77d104951 [OC] added / to workflow path 2022-04-05 15:07:11 +02:00
Miriam Baglioni 79336d46c5 [Clean Context] first naive implementation of a functionality to clean not wanted contextes from one result. This implementation simply verifies the main title of the results start with a given string 2022-04-04 15:52:31 +02:00
Claudio Atzori 873369af1c Merge pull request '[stats wf] added apcs in monitor db' (#207) from antonis.lempesis/dnet-hadoop:beta into beta; Reviewed-on: #207 2022-03-29 15:40:20 +02:00
Antonis Lempesis 7112806a73 views cannot be stored as parquet... 2022-03-29 16:37:29 +03:00
Antonis Lempesis fff0b3cc19 added apcs in monitor db 2022-03-29 14:15:31 +03:00
Claudio Atzori de85367695 Merge pull request '[stats wf] fix: views cannot be stored as parquet...' (#206) from antonis.lempesis/dnet-hadoop:beta into beta; Reviewed-on: #206 2022-03-29 12:51:02 +02:00
Antonis Lempesis ee24f3eb2c views cannot be stored as parquet... 2022-03-29 13:47:48 +03:00
Sandro La Bruzzo 1b11010169 minor fix 2022-03-29 10:59:14 +02:00
Claudio Atzori 0a0ae84c22 [graph raw] DOI based instance URLs on https 2022-03-29 10:52:58 +02:00
Claudio Atzori eca82e30c9 updated dhp-schema version 2022-03-29 09:46:49 +02:00
Claudio Atzori 9fa3dd78fe Merge pull request '[stats wf] various fixes, organization ids for inst. dashboard' (#205) from antonis.lempesis/dnet-hadoop:beta into beta; Reviewed-on: #205 2022-03-28 22:03:49 +02:00
Claudio Atzori 5d53ac95aa Merge pull request 'XML serialisation of instances with the same URLs - 2nd round' (#204) from instance_group_by_url into beta; Reviewed-on: #204 2022-03-28 09:24:00 +02:00
Claudio Atzori 96aa2a5d0d Merge branch 'beta' into instance_group_by_url 2022-03-28 09:23:52 +02:00
Claudio Atzori 395ac6ecec merged pom.xml from beta branch 2022-03-28 09:23:42 +02:00
Claudio Atzori fa3cb84f77 Merge pull request 'Datasource consent fields' (#202) from datasource_pdf_consent into beta; Reviewed-on: #202 2022-03-28 09:21:14 +02:00
Claudio Atzori 741bc99c47 Merge branch 'beta' into datasource_pdf_consent 2022-03-28 09:20:48 +02:00
Claudio Atzori 3610f1749a merged pom.xml from beta branch 2022-03-28 09:20:27 +02:00
Claudio Atzori 61319b2e83 updated dhp-schema version; set entity-level dataInfo before & after merging the fields from the group of duplicates 2022-03-25 16:38:33 +01:00
Antonis Lempesis d8503cd191 added moooar organizations 2022-03-24 14:02:36 +02:00
Miriam Baglioni 7b8f85692e [Enrichment country] fixed issues with parameters and workflow args 2022-03-23 17:20:23 +01:00
Claudio Atzori 48d32466e4 instances grouped by URL expose only one refereed 2022-03-23 14:52:03 +01:00
Claudio Atzori f10066547b increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation 2022-03-23 12:22:26 +01:00
Claudio Atzori 43733c1a18 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-03-23 12:14:27 +01:00
Antonis Lempesis 62f91b0869 cleanup 2022-03-22 16:17:49 +02:00
Antonis Lempesis 2e8394ecf8 creating aaall tables as parquet 2022-03-22 16:16:08 +02:00
Antonis Lempesis dcfbeb8142 yet more typos 2022-03-21 12:36:03 +02:00
Miriam Baglioni 89fd275480 [HostedByMap] added left over from PR and fixed issue on workflow 2022-03-21 09:54:45 +01:00
miconis c763aded70 dependency updated to the new pace-core version 2022-03-16 16:41:50 +01:00
miconis c959639bd5 dependency updated to the new pace-core version 2022-03-15 16:33:03 +01:00
Miriam Baglioni 0f7d8ca2e0 [HostedByMap] change on master to align to PR 201 on beta merged as 9f3036c847 2022-03-11 15:16:02 +01:00
Claudio Atzori f430029596 cleanup 2022-03-11 14:28:28 +01:00
Claudio Atzori d48ccfd65e Merge pull request 'enrichment_country' (#203) from enrichment_country into beta; Looks good to me; Reviewed-on: #203 2022-03-11 14:27:01 +01:00
Miriam Baglioni 12de9acb0d [Country Propagation] left out from previous commit 2022-03-11 14:17:02 +01:00
Miriam Baglioni 2fbb35ade5 mergin with branch beta 2022-03-11 13:58:10 +01:00
Miriam Baglioni 4437f9345d [Country Propagation] left out from previous commit 2022-03-11 13:57:47 +01:00
Miriam Baglioni 2b643059fa [Country Propagation] changed the logic to get the collectedfrom at the result level. To fix issue when no instance is created for a result that should have the country associated. Change the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself 2022-03-11 13:56:48 +01:00
Claudio Atzori f25407bbe2 added mapping for datasource consent fields to integrate them in the graph 2022-03-11 09:32:42 +01:00
Claudio Atzori 9f3036c847 Merge pull request 'HostedByMap' (#201) from hostedByMap_update into beta; Reviewed-on: #201 2022-03-04 16:26:27 +01:00
Miriam Baglioni 2c5087d55a [HostedByMap] download of doaj from json, modification of test resources, deletion of class no more needed for the CSV download 2022-03-04 15:18:21 +01:00
Miriam Baglioni 5d608d6291 [HostedByMap] changed the model to include also oaStart date and review process that could be possibly used in the future 2022-03-04 11:06:09 +01:00
Miriam Baglioni b7c2340952 [HostedByMap - DOIBoost] changed to use code moved to common since used also from hostedbymap now 2022-03-04 11:05:23 +01:00
Miriam Baglioni 8a41f63348 [HostedByMap] update to download the json instead of the csv 2022-03-04 10:38:43 +01:00