Claudio Atzori
|
c2f179800c
|
Merge pull request 'Run CC and RAM sequentially in dhp-impact-indicators WF' (#338) from run_cc_and_ram_sequentially into master
Reviewed-on: D-Net/dnet-hadoop#338
|
2023-09-13 08:52:53 +02:00 |
Serafeim Chatzopoulos
|
2aed5a74be
|
Run CC and RAM sequentially in dhp-impact-indicators WF
|
2023-09-12 22:31:50 +03:00 |
Claudio Atzori
|
4dc4862011
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2023-09-12 14:34:34 +02:00 |
Claudio Atzori
|
dc80ab14d3
|
[graph dedup] consistency wf should not remove the relations while dispatching the entities
|
2023-09-12 14:34:28 +02:00 |
Alessia Bardi
|
77a2199837
|
updated test for EOSC community
|
2023-09-08 11:05:49 +02:00 |
Claudio Atzori
|
265180bfd2
|
added Archive ouverte UNIGE (ETHZ.UNIGENF, opendoar____::1400) to the Datacite hostedBy_map
|
2023-09-07 11:20:35 +02:00 |
Claudio Atzori
|
da0e9828f7
|
resolved conflicts for PR#337
|
2023-09-06 11:28:46 +02:00 |
Claudio Atzori
|
9f5d16624c
|
Merge pull request '[graph raw] datainfo.invisible set as true only for entities' (#336) from invisible_relations into beta
Reviewed-on: D-Net/dnet-hadoop#336
|
2023-09-04 16:14:47 +02:00 |
Claudio Atzori
|
adec6692ca
|
Merge branch 'beta' into invisible_relations
|
2023-09-04 16:13:06 +02:00 |
Claudio Atzori
|
15666e86a8
|
added collectedfrom to the affiliation relations imported from Crossref
|
2023-09-04 15:56:06 +02:00 |
Claudio Atzori
|
7d6bd4f20b
|
Merge pull request 'Fix import of affiliations relations from Crossref' (#335) from 8876_fix_crossref_affiliation_relations_import into beta
Reviewed-on: D-Net/dnet-hadoop#335
|
2023-09-04 15:19:58 +02:00 |
Claudio Atzori
|
5b06c9d06f
|
[graph raw] datainfo.invisible set as true only for entities
|
2023-09-04 15:15:24 +02:00 |
Serafeim Chatzopoulos
|
7de0164c26
|
Fix import of affiliations relations from Crossref
|
2023-09-04 16:04:41 +03:00 |
Claudio Atzori
|
488d9a1cea
|
Merge pull request 'Add sparkExecutorMemoryOverhead workflow config to set off-heap memory for Spark actions. If not explicitly set it is defaulted to 1Gb' (#331) from consistencywf_memoryoverhead_conf into beta
Reviewed-on: D-Net/dnet-hadoop#331
|
2023-08-29 16:31:36 +02:00 |
Giambattista Bloisi
|
6b1c05d118
|
Add sparkExecutorMemoryOverhead workflow config to set off-heap memory for Spark actions. If not explicitly set it is defaulted to 1Gb
|
2023-08-29 16:04:19 +02:00 |
Claudio Atzori
|
bf35280ea6
|
code formatting
|
2023-08-29 11:11:00 +02:00 |
Claudio Atzori
|
0515d81c7c
|
Merge pull request 'Rewrite SparkPropagateRelation exploiting Dataframe API' (#330) from propagate_relation_rewrite into beta
Reviewed-on: D-Net/dnet-hadoop#330
|
2023-08-29 10:47:14 +02:00 |
Claudio Atzori
|
58665a246c
|
Merge branch 'beta' into propagate_relation_rewrite
|
2023-08-29 10:47:02 +02:00 |
Claudio Atzori
|
f437be80ad
|
[impact indicators] adjusted paths in the bip ranker wf parameters
|
2023-08-29 09:03:03 +02:00 |
Giambattista Bloisi
|
d012aec0b3
|
Revert PropagateRelation's argument name from outputPath to graphOutputPath in consistency workflow (#8964)
|
2023-08-28 22:44:54 +02:00 |
Giambattista Bloisi
|
a860e19423
|
Fix: ensure all relations are written out, not only those managed by dedup
|
2023-08-28 15:36:02 +02:00 |
Giambattista Bloisi
|
0d7b2bf83d
|
Rewrite SparkPropagateRelation exploiting Dataframe API
|
2023-08-28 10:34:54 +02:00 |
Miriam Baglioni
|
9c8b41475a
|
Merge pull request '8172_impact_indicators_workflow' (#284) from 8172_impact_indicators_workflow into beta
Reviewed-on: D-Net/dnet-hadoop#284
|
2023-08-14 15:50:48 +02:00 |
Serafeim Chatzopoulos
|
97c1ba8918
|
Merge actionsets of results and projects
|
2023-08-11 15:56:53 +03:00 |
Miriam Baglioni
|
35b8deb2c6
|
Merge pull request 'DispatchEntitiesSparkJob: manage all entity types together, support filtering by dataInfo.invisible flag' (#329) from dispatch_filter_invisible_entities into beta
Reviewed-on: D-Net/dnet-hadoop#329
|
2023-08-10 12:56:18 +02:00 |
Giambattista Bloisi
|
95cd2b9b1e
|
Make filterInvisible a mandatory parameter of DispatchEntitiesSparkJob
Make filterInvisible a mandatory parameter of both dedup/consistency and graph/group oozie workflows
|
2023-08-10 11:53:48 +02:00 |
Giambattista Bloisi
|
fab9920271
|
DispatchEntitiesSparkJob: manage all entity types together, support filtering by dataInfo.invisible flag
|
2023-08-09 15:41:43 +02:00 |
Miriam Baglioni
|
c25ac21e5e
|
Merge pull request 'graph cleaning, suggestions from ticket 8898' (#325) from cleaning_8898 into beta
Reviewed-on: D-Net/dnet-hadoop#325
|
2023-08-08 11:14:19 +02:00 |
Miriam Baglioni
|
c334fe2438
|
Merge pull request 'Add a "CleanRelation" action after the PropagateRelation to filter out all relations that have been deleted by inference or that are pointing to dangling entities' (#328) from cleanup_relations_after_dedup into beta
Reviewed-on: D-Net/dnet-hadoop#328
|
2023-08-08 09:49:12 +02:00 |
Miriam Baglioni
|
0e2f855807
|
Merge pull request 'Updates Promotion DBs' (#321) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#321
|
2023-08-07 12:09:16 +02:00 |
Miriam Baglioni
|
18fbe52b20
|
Merge pull request 'Import affiliation relations from Crossref' (#320) from 8876 into beta
Reviewed-on: D-Net/dnet-hadoop#320
|
2023-08-07 10:45:30 +02:00 |
Giambattista Bloisi
|
97b6d1dc45
|
Filter ids by dataInfo.deletedbyinference and dataInfo.invisible flags
Filter relations also by dataInfo.invisible flag
|
2023-08-07 10:24:11 +02:00 |
Giambattista Bloisi
|
af49424b59
|
Add a "CleanRelation" action after the PropagateRelation to filter out all relations that have been deleted by inference or that are pointing to dangling entities
|
2023-08-04 14:27:39 +02:00 |
Claudio Atzori
|
0bc74e2000
|
code formatting
|
2023-08-02 11:52:10 +02:00 |
Claudio Atzori
|
7180911ded
|
[graph cleaning] fixed regex behaviour for cleaning ROR and GRID identifiers, added tests
|
2023-08-02 11:44:14 +02:00 |
Claudio Atzori
|
b9dddbfe54
|
rule out records with NULL dataInfo, except for Relations
|
2023-07-31 17:53:54 +02:00 |
Claudio Atzori
|
da1727f93f
|
rule out records with NULL dataInfo, except for Relations
|
2023-07-31 17:52:56 +02:00 |
Claudio Atzori
|
11ffb9bd68
|
rule out records with NULL dataInfo
|
2023-07-31 12:35:33 +02:00 |
Claudio Atzori
|
ccac6a7f75
|
rule out records with NULL dataInfo
|
2023-07-31 12:35:05 +02:00 |
Serafeim Chatzopoulos
|
7cefe2665b
|
Remove unnecessary classes
|
2023-07-28 19:14:39 +03:00 |
Serafeim Chatzopoulos
|
26a92ce762
|
Merge branch '8876' of https://code-repo.d4science.org/D-Net/dnet-hadoop into 8876
|
2023-07-28 19:03:57 +03:00 |
Serafeim Chatzopoulos
|
ebfba38ab6
|
Add changes from code review
|
2023-07-28 19:03:47 +03:00 |
Serafeim Chatzopoulos
|
eb8684a8cf
|
Merge branch 'beta' into 8876
|
2023-07-28 13:39:33 +02:00 |
Claudio Atzori
|
1275a07d45
|
Merge pull request '[graph indexing] expand the instance level fulltext in the XML records' (#326) from instance_fulltext_xml into beta
Reviewed-on: D-Net/dnet-hadoop#326
|
2023-07-27 15:02:07 +02:00 |
Claudio Atzori
|
a72b9e96ac
|
expand the instance level fulltext in the XML records
|
2023-07-27 14:57:38 +02:00 |
Claudio Atzori
|
d512df8612
|
code formatting
|
2023-07-26 09:14:08 +02:00 |
Claudio Atzori
|
d8435a6512
|
inverted condition
|
2023-07-25 17:39:57 +02:00 |
Claudio Atzori
|
59764145bb
|
cherry picked & fixed commit 270df939c4
|
2023-07-25 17:39:00 +02:00 |
Claudio Atzori
|
270df939c4
|
partial implementation of the suggestions from https://support.openaire.eu/issues/8898
|
2023-07-25 17:29:50 +02:00 |
Claudio Atzori
|
8c63e4a864
|
Merge pull request 'Refactor Dedup using Spark Dataframe API, initial support for scala 2.12 and Spark 3.4' (#324) from dedup-with-dataframe-2 into beta
Reviewed-on: D-Net/dnet-hadoop#324
|
2023-07-25 10:17:17 +02:00 |