Commit Graph

  • f759b18bca [SWH] aligned parameter name Claudio Atzori 2023-10-06 13:43:20 +0200
  • 2c235e82ad Fix cleaning of Pmid where parsing of numbers stopped at first not leading 0' character Giambattista Bloisi 2023-10-06 12:35:54 +0200
  • eed9fe0902 code formatting Claudio Atzori 2023-10-06 12:31:17 +0200
  • 7f27111b1f Merge branch 'importpoci' into beta Claudio Atzori 2023-10-06 12:23:28 +0200
  • 73c49b8d26 Merge branch 'beta' into SWH_integration Claudio Atzori 2023-10-06 12:21:51 +0200
  • 42a2dad975 implemented relation to irish funder from a Json list Sandro La Bruzzo 2023-10-06 11:52:33 +0200
  • 13f332ce77 ignored jenv prop Sandro La Bruzzo 2023-10-06 10:40:05 +0200
  • 1bb83b9188 Add prefix in SWH ID Serafeim Chatzopoulos 2023-10-04 20:31:45 +0300
  • ee8a39e7d2 cleanup and refinements Claudio Atzori 2023-10-04 12:32:05 +0200
  • e9f24df21c Move SWH API Key from constants to workflow param Serafeim Chatzopoulos 2023-10-03 20:57:57 +0300
  • cae75fc75d Add SWH in the collectedFrom field Serafeim Chatzopoulos 2023-10-03 16:55:10 +0300
  • b49a3ac9b2 Add actionsetsPath as a global WF param Serafeim Chatzopoulos 2023-10-03 15:43:38 +0300
  • 24c43e0c60 Restructure workflow parameters Serafeim Chatzopoulos 2023-10-03 15:11:58 +0300
  • 9f73d93e62 Add param for limiting repo Urls Serafeim Chatzopoulos 2023-10-03 14:39:08 +0300
  • b446a9ed98 Merge branch 'beta' into peer_reviewed Claudio Atzori 2023-10-03 10:52:23 +0200
  • f344ad76d0 Merge pull request 'extended existing code to import of POCI from open citation' (#340) from importpoci into beta Claudio Atzori 2023-10-03 10:52:11 +0200
  • 5919e488dd Merge branch 'beta' into importpoci Claudio Atzori 2023-10-03 10:43:53 +0200
  • 839a8524e7 Add action for creating actionsets Serafeim Chatzopoulos 2023-10-02 23:50:38 +0300
  • c9a5ad6a02 extending the coverage of the peer non-unknown refereed instances Claudio Atzori 2023-10-02 16:28:42 +0200
  • d7fccdc64b fixed paths in wf to match the req of the pathname Miriam Baglioni 2023-10-02 14:10:57 +0200
  • 9898470b0e Addressing comments in D-Net/dnet-hadoop#340#issuecomment-10592 Miriam Baglioni 2023-10-02 12:54:16 +0200
  • c412dc162b Fix bug in conversion from dedup json model to Spark Dataset of Rows: list of strings contained the json escaped representation of the value instead of the plain value, this caused instanceTypeMatch failures because of the leading and trailing double quotes Giambattista Bloisi 2023-09-27 22:30:47 +0200
  • 4ac06c9e37 Merge pull request 'Fix bug in conversion from dedup json model to Spark Dataset of Rows (instanceTypeMatch no longer working)' (#339) from fix_dedupfailsonmatchinginstances into master Claudio Atzori 2023-10-02 11:34:20 +0200
  • fa692b3629 Merge branch 'master' into fix_dedupfailsonmatchinginstances Claudio Atzori 2023-10-02 11:28:16 +0200
  • 5d09b7db8b Merge pull request 'SparkPropagateRelation relations do not propagate deletedByInference and invisible' (#333) from consistency_keep_mergerels into beta Claudio Atzori 2023-10-02 11:27:57 +0200
  • 7b403a920f Merge branch 'beta' into consistency_keep_mergerels Claudio Atzori 2023-10-02 11:26:00 +0200
  • dc86018a5f Merge branch 'merge_entities_job' into beta Claudio Atzori 2023-10-02 11:24:48 +0200
  • 3c47920c78 Use asScala to convert java List to Scala Sequence Giambattista Bloisi 2023-09-20 16:14:01 +0200
  • 7f244d9a7a code formatting Claudio Atzori 2023-09-20 15:53:21 +0200
  • e239b81740 Fix defect #8997: GenerateEventsJob is generating huge amounts of logs because broker entity similarity calculation consistently failed Giambattista Bloisi 2023-09-20 15:42:00 +0200
  • ef02648399 Merge pull request 'fixed dedup configuration management in the Broker workflow' (#341) from fix_8997 into master Claudio Atzori 2023-10-02 11:03:50 +0200
  • d13bb534f0 Merge branch 'master' into fix_8997 Claudio Atzori 2023-10-02 11:03:18 +0200
  • e84f5b5e64 extended existing code to accommodate import of POCI from open citation Miriam Baglioni 2023-10-02 09:25:16 +0200
  • ab0d70691c Add step for archiving repoUrls to SWH Serafeim Chatzopoulos 2023-09-28 20:56:18 +0300
  • 775c3f704a Fix bug in conversion from dedup json model to Spark Dataset of Rows: list of strings contained the json escaped representation of the value instead of the plain value, this caused instanceTypeMatch failures because of the leading and trailing double quotes Giambattista Bloisi 2023-09-27 22:30:47 +0200
  • ed9c81a0b7 Add steps to collect last visit data && archive not found repository URLs Serafeim Chatzopoulos 2023-09-27 19:00:54 +0300
  • 9c3ab11d5b Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop Sandro La Bruzzo 2023-09-25 15:29:19 +0200
  • 423ef30676 minor fix on the aggregation of uniprot and pdb Sandro La Bruzzo 2023-09-25 15:28:58 +0200
  • 7152d47f84 Use asScala to convert java List to Scala Sequence Giambattista Bloisi 2023-09-20 16:14:01 +0200
  • 4853c19b5e code formatting Claudio Atzori 2023-09-20 15:53:21 +0200
  • 1f226d1dce Fix defect #8997: GenerateEventsJob is generating huge amounts of logs because broker entity similarity calculation consistently failed Giambattista Bloisi 2023-09-20 15:42:00 +0200
  • 0935d7757c Use v5 of the UNIBI Gold ISSN list in test Alessia Bardi 2023-09-19 14:47:01 +0200
  • cc7204a089 tests for d4science catalog Alessia Bardi 2023-09-19 13:38:25 +0200
  • 76476cdfb6 Added maven repo for dependencies that are not in maven central Sandro La Bruzzo 2023-09-20 10:33:14 +0200
  • 6186cdc2cc Use v5 of the UNIBI Gold ISSN list in test Alessia Bardi 2023-09-19 14:47:01 +0200
  • d94b9bebf7 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop Alessia Bardi 2023-09-19 13:38:45 +0200
  • 19abba8fa7 tests for d4science catalog Alessia Bardi 2023-09-19 13:38:25 +0200
  • 9ef971a146 Update step16-createIndicatorsTables.sql dimitrispie 2023-09-19 14:25:42 +0300
  • 9d44418d38 Add collecting software code repository URLs Serafeim Chatzopoulos 2023-09-14 18:43:25 +0300
  • 395a4af020 Run CC and RAM sequentially in dhp-impact-indicators WF Serafeim Chatzopoulos 2023-09-12 22:31:50 +0300
  • c2f179800c Merge pull request 'Run CC and RAM sequentially in dhp-impact-indicators WF' (#338) from run_cc_and_ram_sequentially into master Claudio Atzori 2023-09-13 08:52:53 +0200
  • 2aed5a74be Run CC and RAM sequentially in dhp-impact-indicators WF Serafeim Chatzopoulos 2023-09-12 22:31:50 +0300
  • 8a6892cc63 [graph dedup] consistency wf should not remove the relations while dispatching the entities Claudio Atzori 2023-09-12 14:34:28 +0200
  • 4dc4862011 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop Claudio Atzori 2023-09-12 14:34:34 +0200
  • dc80ab14d3 [graph dedup] consistency wf should not remove the relations while dispatching the entities Claudio Atzori 2023-09-12 14:34:28 +0200
  • 77a2199837 updated test for EOSC community Alessia Bardi 2023-09-08 11:05:49 +0200
  • 4786aa0e09 added Archive ouverte UNIGE (ETHZ.UNIGENF, opendoar____::1400) to the Datacite hostedBy_map Claudio Atzori 2023-09-07 11:20:35 +0200
  • 265180bfd2 added Archive ouverte UNIGE (ETHZ.UNIGENF, opendoar____::1400) to the Datacite hostedBy_map Claudio Atzori 2023-09-07 11:20:35 +0200
  • 5f90cc11e9 Update step16-createIndicatorsTables.sql dimitrispie 2023-09-06 14:14:38 +0300
  • da0e9828f7 resolved conflicts for PR#337 Claudio Atzori 2023-09-06 11:28:46 +0200
  • 9f5d16624c Merge pull request '[graph raw] datainfo.invisible set as true only for entities' (#336) from invisible_relations into beta Claudio Atzori 2023-09-04 16:14:47 +0200
  • adec6692ca Merge branch 'beta' into invisible_relations Claudio Atzori 2023-09-04 16:13:06 +0200
  • 15666e86a8 added collectedfrom to the affiliation relations imported from Crossref Claudio Atzori 2023-09-04 15:56:06 +0200
  • 7d6bd4f20b Merge pull request 'Fix import of affiliations relations from Crossref' (#335) from 8876_fix_crossref_affiliation_relations_import into beta Claudio Atzori 2023-09-04 15:19:58 +0200
  • 5b06c9d06f [graph raw] datainfo.invisible set as true only for entities Claudio Atzori 2023-09-04 15:15:24 +0200
  • 7de0164c26 Fix import of affiliations relations from Crossref Serafeim Chatzopoulos 2023-09-04 16:04:41 +0300
  • 2caaaec42d Include SparkCleanRelation logic in SparkPropagateRelation; SparkPropagateRelation includes merge relations; Revised tests for SparkPropagateRelation Giambattista Bloisi 2023-09-01 09:32:57 +0200
  • 964c2f553e Changes in indicators step, monitor step dimitrispie 2023-09-01 10:57:02 +0300
  • 6cc7d8ca7b GroupEntities and DispatchEntities are now merged in GroupEntitiesSparkJob Giambattista Bloisi 2023-08-24 21:48:07 +0200
  • 488d9a1cea Merge pull request 'Add sparkExecutorMemoryOverhead workflow config to set off-heap memory for Spark actions. If not explicitly set it is defaulted to 1Gb' (#331) from consistencywf_memoryoverhead_conf into beta Claudio Atzori 2023-08-29 16:31:36 +0200
  • 6b1c05d118 Add sparkExecutorMemoryOverhead workflow config to set off-heap memory for Spark actions. If not explicitly set it is defaulted to 1Gb Giambattista Bloisi 2023-08-29 16:04:19 +0200
  • bf35280ea6 code formatting Claudio Atzori 2023-08-29 11:11:00 +0200
  • 0515d81c7c Merge pull request 'Rewrite SparkPropagateRelation exploiting Dataframe API' (#330) from propagate_relation_rewrite into beta Claudio Atzori 2023-08-29 10:47:14 +0200
  • 58665a246c Merge branch 'beta' into propagate_relation_rewrite Claudio Atzori 2023-08-29 10:47:02 +0200
  • f437be80ad [impact indicators] adjusted paths in the bip ranker wf parameters Claudio Atzori 2023-08-29 09:03:03 +0200
  • d012aec0b3 Revert PropagateRelation's argument name from outputPath to graphOutputPath in consistency workflow (#8964) Giambattista Bloisi 2023-08-28 22:44:54 +0200
  • a860e19423 Fix ensure all relations are written out, not only those managed by dedup Giambattista Bloisi 2023-08-28 15:36:02 +0200
  • 0d7b2bf83d Rewrite SparkPropagateRelation exploiting Dataframe API Giambattista Bloisi 2023-08-28 10:34:54 +0200
  • 9c8b41475a Merge pull request '8172_impact_indicators_workflow' (#284) from 8172_impact_indicators_workflow into beta Miriam Baglioni 2023-08-14 15:50:48 +0200
  • 97c1ba8918 Merge actionsets of results and projects Serafeim Chatzopoulos 2023-08-11 15:56:53 +0300
  • 35b8deb2c6 Merge pull request 'DispatchEntitiesSparkJob: manage all entity types together, support filtering by dataInfo.invisible flag' (#329) from dispatch_filter_invisible_entities into beta Miriam Baglioni 2023-08-10 12:56:18 +0200
  • 95cd2b9b1e Make filterInvisible a mandatory parameter of DispatchEntitiesSparkJob; Make filterInvisible a mandatory parameter of both dedup/consistency and graph/group oozie workflows Giambattista Bloisi 2023-08-10 11:53:48 +0200
  • fab9920271 DispatchEntitiesSparkJob: manage all entity types together, support filtering by dataInfo.invisible flag Giambattista Bloisi 2023-08-08 15:52:20 +0200
  • 599828ce35 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop Miriam Baglioni 2023-08-09 13:07:13 +0200
  • c25ac21e5e Merge pull request 'graph cleaning, suggestions from ticket 8898' (#325) from cleaning_8898 into beta Miriam Baglioni 2023-08-08 11:14:19 +0200
  • c334fe2438 Merge pull request 'Add a "CleanRelation" action after the PropagateRelation to filter out all relations that have been deleted by inference or that are pointing to dangling entities' (#328) from cleanup_relations_after_dedup into beta Miriam Baglioni 2023-08-08 09:49:12 +0200
  • 0e2f855807 Merge pull request 'Updates Promotion DBs' (#321) from antonis.lempesis/dnet-hadoop:beta into beta Miriam Baglioni 2023-08-07 12:09:16 +0200
  • 18fbe52b20 Merge pull request 'Import affiliation relations from Crossref' (#320) from 8876 into beta Miriam Baglioni 2023-08-07 10:45:30 +0200
  • 97b6d1dc45 Filter ids by dataInfo.deletedbyinference and DataInfo.invisible flags; Filter relations also by dataInfo.invisible flag Giambattista Bloisi 2023-08-07 10:24:11 +0200
  • af49424b59 Add a "CleanRelation" action after the PropagateRelation to filter out all relations that have been deleted by inference or that are pointing to dangling entities Giambattista Bloisi 2023-08-04 14:27:39 +0200
  • 0bc74e2000 code formatting Claudio Atzori 2023-08-02 11:52:10 +0200
  • 7180911ded [graph cleaning] fixed regex behaviour for cleaning ROR and GRID identifiers, added tests Claudio Atzori 2023-06-23 16:10:49 +0200
  • b9dddbfe54 rule out records with NULL dataInfo, except for Relations Claudio Atzori 2023-07-31 17:52:56 +0200
  • da1727f93f rule out records with NULL dataInfo, except for Relations Claudio Atzori 2023-07-31 17:52:56 +0200
  • 11ffb9bd68 rule out records with NULL dataInfo Claudio Atzori 2023-07-31 12:35:05 +0200
  • ccac6a7f75 rule out records with NULL dataInfo Claudio Atzori 2023-07-31 12:35:05 +0200
  • 7cefe2665b Remove unnecessary classes Serafeim Chatzopoulos 2023-07-28 19:14:39 +0300
  • 26a92ce762 Merge branch '8876' of https://code-repo.d4science.org/D-Net/dnet-hadoop into 8876 Serafeim Chatzopoulos 2023-07-28 19:03:57 +0300
  • ebfba38ab6 Add changes from code review Serafeim Chatzopoulos 2023-07-28 19:03:47 +0300
  • eb8684a8cf Merge branch 'beta' into 8876 Serafeim Chatzopoulos 2023-07-28 13:39:33 +0200