Commit Graph

5479 Commits

Author SHA1 Message Date
Miriam Baglioni 9898470b0e Addressing comments in #340\#issuecomment-10592 2023-10-02 12:54:16 +02:00
Giambattista Bloisi c412dc162b Fix bug in conversion from dedup json model to Spark Dataset of Rows: list of strings contained the json escaped representation of the value instead of the plain value, this caused instanceTypeMatch failures because of the leading and trailing double quotes 2023-10-02 11:34:51 +02:00
Claudio Atzori 4ac06c9e37 Merge pull request 'Fix bug in conversion from dedup json model to Spark Dataset of Rows (instanceTypeMatch no longer working)' (#339) from fix_dedupfailsonmatchinginstances into master
Reviewed-on: #339
2023-10-02 11:34:20 +02:00
Claudio Atzori fa692b3629 Merge branch 'master' into fix_dedupfailsonmatchinginstances 2023-10-02 11:28:16 +02:00
Claudio Atzori 5d09b7db8b Merge pull request 'SparkPropagateRelation relations do not propagate deletedByInference and invisible' (#333) from consistency_keep_mergerels into beta
Reviewed-on: #333
2023-10-02 11:27:57 +02:00
Claudio Atzori 7b403a920f Merge branch 'beta' into consistency_keep_mergerels 2023-10-02 11:26:00 +02:00
Claudio Atzori dc86018a5f Merge branch 'merge_entities_job' into beta 2023-10-02 11:24:48 +02:00
Giambattista Bloisi 3c47920c78 Use asScala to convert java List to Scala Sequence 2023-10-02 11:04:47 +02:00
Claudio Atzori 7f244d9a7a code formatting 2023-10-02 11:04:36 +02:00
Giambattista Bloisi e239b81740 Fix defect #8997: GenerateEventsJob is generating huge amounts of logs because broker entity similarity calculation consistently failed 2023-10-02 11:04:18 +02:00
Claudio Atzori ef02648399 Merge pull request 'fixed dedup configuration management in the Broker workflow' (#341) from fix_8997 into master
Reviewed-on: #341
2023-10-02 11:03:50 +02:00
Claudio Atzori d13bb534f0 Merge branch 'master' into fix_8997 2023-10-02 11:03:18 +02:00
Miriam Baglioni e84f5b5e64 extended existing codo to accomodate import of POCI from open citation 2023-10-02 09:25:16 +02:00
Serafeim Chatzopoulos ab0d70691c Add step for archiving repoUrls to SWH 2023-09-28 20:56:18 +03:00
Giambattista Bloisi 775c3f704a Fix bug in conversion from dedup json model to Spark Dataset of Rows: list of strings contained the json escaped representation of the value instead of the plain value, this caused instanceTypeMatch failures because of the leading and trailing double quotes 2023-09-27 22:30:47 +02:00
Serafeim Chatzopoulos ed9c81a0b7 Add steps to collect last visit data && archive not found repository URLs 2023-09-27 19:00:54 +03:00
Sandro La Bruzzo 9c3ab11d5b Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2023-09-25 15:29:19 +02:00
Sandro La Bruzzo 423ef30676 minor fix on the aggregation of uniprot and pdb 2023-09-25 15:28:58 +02:00
Giambattista Bloisi 7152d47f84 Use asScala to convert java List to Scala Sequence 2023-09-20 16:14:27 +02:00
Claudio Atzori 4853c19b5e code formatting 2023-09-20 15:53:21 +02:00
Giambattista Bloisi 1f226d1dce Fix defect #8997: GenerateEventsJob is generating huge amounts of logs because broker entity similarity calculation consistently failed 2023-09-20 15:42:00 +02:00
Alessia Bardi 0935d7757c Use v5 of the UNIBI Gold ISSN list in test 2023-09-20 15:41:35 +02:00
Alessia Bardi cc7204a089 tests for d4science catalog 2023-09-20 15:38:32 +02:00
Sandro La Bruzzo 76476cdfb6 Added maven repo for dependencies that are not in maven central 2023-09-20 10:33:14 +02:00
Alessia Bardi 6186cdc2cc Use v5 of the UNIBI Gold ISSN list in test 2023-09-19 14:47:01 +02:00
Alessia Bardi d94b9bebf7 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2023-09-19 13:38:45 +02:00
Alessia Bardi 19abba8fa7 tests for d4science catalog 2023-09-19 13:38:25 +02:00
dimitrispie 9ef971a146 Update step16-createIndicatorsTables.sql
Fix int year for:
indi_org_openess_year
indi_org_fairness_year
indi_org_findable_year
2023-09-19 14:25:42 +03:00
Serafeim Chatzopoulos 9d44418d38 Add collecting software code repository URLs 2023-09-14 18:43:25 +03:00
Serafeim Chatzopoulos 395a4af020 Run CC and RAM sequentieally in dhp-impact-indicators WF 2023-09-13 08:59:40 +02:00
Claudio Atzori c2f179800c Merge pull request 'Run CC and RAM sequentieally in dhp-impact-indicators WF' (#338) from run_cc_and_ram_sequentially into master
Reviewed-on: #338
2023-09-13 08:52:53 +02:00
Serafeim Chatzopoulos 2aed5a74be Run CC and RAM sequentieally in dhp-impact-indicators WF 2023-09-12 22:31:50 +03:00
Claudio Atzori 8a6892cc63 [graph dedup] consistency wf should not remove the relations while dispatching the entities 2023-09-12 21:27:05 +02:00
Claudio Atzori 4dc4862011 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2023-09-12 14:34:34 +02:00
Claudio Atzori dc80ab14d3 [graph dedup] consistency wf should not remove the relations while dispatching the entities 2023-09-12 14:34:28 +02:00
Alessia Bardi 77a2199837 updated test for EOSC comunity 2023-09-08 11:05:49 +02:00
Claudio Atzori 4786aa0e09 added Archive ouverte UNIGE (ETHZ.UNIGENF, opendoar____::1400) to the Datacite hostedBy_map 2023-09-07 11:21:07 +02:00
Claudio Atzori 265180bfd2 added Archive ouverte UNIGE (ETHZ.UNIGENF, opendoar____::1400) to the Datacite hostedBy_map 2023-09-07 11:20:35 +02:00
dimitrispie 5f90cc11e9 Update step16-createIndicatorsTables.sql
Fix indi_pub_bronze_oa
2023-09-06 14:14:38 +03:00
Claudio Atzori da0e9828f7 resolved conflicts for PR#337 2023-09-06 11:28:46 +02:00
Claudio Atzori 9f5d16624c Merge pull request '[graph raw] datainfo.invisible set as true only for entities' (#336) from invisible_relations into beta
Reviewed-on: #336
2023-09-04 16:14:47 +02:00
Claudio Atzori adec6692ca Merge branch 'beta' into invisible_relations 2023-09-04 16:13:06 +02:00
Claudio Atzori 15666e86a8 added collectedfrom to the affiliation relations imported from Crossref 2023-09-04 15:56:06 +02:00
Claudio Atzori 7d6bd4f20b Merge pull request 'Fix import of affiliations relations from Crossref' (#335) from 8876_fix_crossref_affiliation_relations_import into beta
Reviewed-on: #335
2023-09-04 15:19:58 +02:00
Claudio Atzori 5b06c9d06f [graph raw] datainfo.invisible set as true only for entities 2023-09-04 15:15:24 +02:00
Serafeim Chatzopoulos 7de0164c26 Fix import of affiliations relations from Crossref 2023-09-04 16:04:41 +03:00
Giambattista Bloisi 2caaaec42d Include SparkCleanRelation logic in SparkPropagateRelation
SparkPropagateRelation includes merge relations
Revised tests for SparkPropagateRelation
2023-09-04 11:33:20 +02:00
dimitrispie 964c2f553e Changes in indicators step, monitor step
- graduatedoctorates for observatory
- result_apc_affiliations table
- new indicators
	indi_is_funder_plan_s
	indi_funder_fairness
	indi_ris_fairness
	indi_funder_openess
	indi_ris_openess
	indi_funder_findable
	indi_ris_findable
	indi_is_project_result_after
- cast year to int in composite indicators
- new institutions
     -- Universidade Católica Portuguesa
     -- Iscte - Instituto Universitário de Lisboa
     -- Munster Technological University
     -- Cardiff University
     -- Leibniz Institute of Ecological Urban and Regional Development
2023-09-01 10:57:02 +03:00
Giambattista Bloisi 6cc7d8ca7b GroupEntities and DispatchEntites are now merged in GroupEntitiesSparkJob 2023-08-30 10:43:31 +02:00
Claudio Atzori 488d9a1cea Merge pull request 'Add sparkExecutorMemoryOverhead workflow config to set off-heap memory for Spark actions. If not explicitly set it is defaulted to 1Gb' (#331) from consistencywf_memoryoverhead_conf into beta
Reviewed-on: #331
2023-08-29 16:31:36 +02:00