Commit Graph

  • bcc0a13981 [enrichment single step] adding <end> element in wf definition Miriam Baglioni 2024-01-18 17:39:14 +0100
  • 6af536541d [enrichment single step] moving parameter file in correct location Miriam Baglioni 2024-01-18 15:35:40 +0100
  • a12a3eb143 - Miriam Baglioni 2024-01-18 15:18:10 +0100
  • 628fdfb5eb Merge pull request '[enrichment single step]' (#378) from enrichmentSingleStepFixed into beta Claudio Atzori 2024-01-18 09:41:09 +0100
  • 82e9e262ee [enrichment single step] remove parameter from execution Miriam Baglioni 2024-01-17 17:38:03 +0100
  • 22eaf211e8 Last commit usage-stats-export-wf-v2 dimitrispie 2024-01-17 18:02:33 +0200
  • 67ce2d54be [enrichment single step] refactoring to fix issues in disappeared result type Miriam Baglioni 2024-01-17 16:50:00 +0100
  • 59eaccbd87 [enrichment single step] refactoring to fix issue in disappeared result type Miriam Baglioni 2024-01-15 17:49:54 +0100
  • 21a14fcd80 Reusable RunSQLSparkJob for executing SQL in Spark through Oozie Spark Actions Implements pivots table update oozie workflow Giambattista Bloisi 2024-01-15 00:08:07 +0100
  • e0753f19da Fixed error of connection timeout Sandro La Bruzzo 2024-01-13 09:27:08 +0100
  • e328bc0ade fixed missing parameter on download update sandro.labruzzo 2024-01-12 16:18:20 +0100
  • 2d302e6827 Merge pull request '[FoS integration]fix issue on FoS integration. Removing the null values from FoS' (#375) from fosPreparationBeta into beta Claudio Atzori 2024-01-12 10:27:28 +0100
  • f612125939 fix issue on FoS integration. Removing the null values from FoS Miriam Baglioni 2024-01-12 10:20:28 +0100
  • c67467723b Merge pull request 'refined mapping for the extraction of the original resource type' (#374) from resource_types into beta Claudio Atzori 2024-01-11 16:29:47 +0100
  • cb9e739484 Merge branch 'beta' into resource_types Claudio Atzori 2024-01-11 16:29:41 +0100
  • 2753044d13 refined mapping for the extraction of the original resource type Claudio Atzori 2024-01-11 16:28:26 +0100
  • a88dce5bf3 Merge pull request 'Improvements and refactoring in Dedup' (#367) from dedup_increasenumofblocks into beta Giambattista Bloisi 2024-01-11 11:24:06 +0100
  • 3c66e3bd7b Create dedup record for "merged" pivots. Do not create dedup records for groups that have more than 20 different acceptance dates Giambattista Bloisi 2023-12-22 09:57:30 +0100
  • 10e135db1e Use dedup_wf_002 in place of dedup_wf_001 to make explicit a different algorithm has been used to generate those kind of ids Giambattista Bloisi 2023-12-22 09:55:10 +0100
  • 831cc1fdde Generate "merged" dedup id relations also for records that are filtered out by the cut parameters Giambattista Bloisi 2023-12-14 11:51:02 +0100
  • 1287315ffb No longer use dedupId information from pivotHistory Database Giambattista Bloisi 2023-12-11 21:26:05 +0100
  • 02636e802c SparkCreateSimRels: - Create dedup blocks from the complete queue of records matching cluster key instead of truncating the results - Clean titles once before clustering and similarity comparisons - Added support for filtered fields in model - Added support for sorting List fields in model - Added new JSONListClustering and numAuthorsTitleSuffixPrefixChain clustering functions - Added new maxLengthMatch comparator function - Use reduced complexity Levenshtein with threshold in levensteinTitle - Use reduced complexity AuthorsMatch with threshold early-quit - Use incremental Connected Component to decrease comparisons in similarity match in BlockProcessor - Use new clusterings configuration in Dedup tests Giambattista Bloisi 2023-10-02 09:25:12 +0200
  • e024718f73 creating result_instances even when no pids exist for the instance Antonis Lempesis 2024-01-10 22:25:50 +0100
  • 859babf722 added some useful comment Sandro La Bruzzo 2024-01-10 19:51:13 +0100
  • 39ebb60b38 Merge remote-tracking branch 'origin/beta' into orcid_update Sandro La Bruzzo 2024-01-10 19:50:00 +0100
  • 9d5a7c3b22 code refactor Sandro La Bruzzo 2024-01-10 19:42:34 +0100
  • 8f61063201 Added workflow Sandro La Bruzzo 2024-01-10 19:42:22 +0100
  • 1a42a5c10d Implemented Download update of ORCID Sandro La Bruzzo 2024-01-10 18:03:20 +0100
  • 16d858fbf0 Merge pull request 'enrichmentSingleStep' (#373) from enrichmentSingleStep into beta Claudio Atzori 2024-01-10 16:58:49 +0100
  • e711a05229 fixed conflicts Miriam Baglioni 2024-01-10 11:03:42 +0100
  • 71d6f30711 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta Miriam Baglioni 2024-01-10 10:59:58 +0100
  • b920307bdd Changes to indicators dimitrispie 2024-01-09 00:47:09 +0200
  • 8b2cbb611e Changes to beta db names dimitrispie 2024-01-09 00:40:56 +0200
  • 2e4cab026c fixed the result_country definition Antonis Lempesis 2024-01-08 16:01:26 +0200
  • 6b823100ae Update buildIrishMonitorDB.sql dimitrispie 2024-01-07 22:54:39 +0200
  • 75bfde043c Historical Snapshots Workflow dimitrispie 2024-01-04 15:11:04 +0200
  • cb14470ba6 added properties file in the folder for the workflow of result to organization from inst repo propagation. Changes the path in the classes implementing the propagation Miriam Baglioni 2023-12-22 14:50:05 +0100
  • 9f966b59d4 added properties file in the folder for the workflow of result to community from semrel propagation. Changes the path in the classes implementing the propagation Miriam Baglioni 2023-12-22 14:11:47 +0100
  • 2f3b5a133d added properties file in the folder for the workflow of result to community from organization propagation. Changes the path in the classes implementing the propagation Miriam Baglioni 2023-12-22 13:56:40 +0100
  • 2f7b9ad815 added properties file in the folder for the workflow of project to result propagation. Changes the path in the classes implementing the propagation Miriam Baglioni 2023-12-22 11:46:15 +0100
  • f2352e8a78 changed in the classes the path for the property files for the propagation of community from project Miriam Baglioni 2023-12-22 11:43:34 +0100
  • 009730b3d1 added properties file in the folder for the workflow of orcid propagation. Changes the path in the classes implementing the propagation; changed the path to the parameter file in the class for entitytoorganization propagation Miriam Baglioni 2023-12-22 11:42:09 +0100
  • 89f269c7f4 changed the path to the parameter file in the class for entitytoorganization propagation Miriam Baglioni 2023-12-22 11:37:50 +0100
  • b06aea0adf adding the bulkTag parameter file in the folder for the oozie workflow for bulkTagging. Changes the path in the class Miriam Baglioni 2023-12-22 11:35:37 +0100
  • 3afd4aa57b adjustments for country propagation Miriam Baglioni 2023-12-22 11:27:30 +0100
  • ffdd03d2f4 Monitor Irish Stats WF dimitrispie 2023-12-22 11:05:24 +0200
  • 40b98d8182 Changes to indicators and funders definition dimitrispie 2023-12-22 10:29:20 +0200
  • 62104790ae added metaresourcetype to the result hive DB view Claudio Atzori 2023-12-21 12:26:19 +0100
  • 106968adaa Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop Claudio Atzori 2023-12-21 12:26:29 +0100
  • a8a4db96f0 added metaresourcetype to the result hive DB view Claudio Atzori 2023-12-21 12:26:19 +0100
  • 5011c4d11a refactoring after compilation Miriam Baglioni 2023-12-20 15:57:26 +0100
  • 4740c808f7 - Miriam Baglioni 2023-12-20 14:26:54 +0100
  • d410ea8a41 added needed parameter Miriam Baglioni 2023-12-19 12:15:01 +0100
  • 37e36baf76 updated workflow for generation of Scholix Datasource's to use mdstore transactions Sandro La Bruzzo 2023-12-18 16:05:35 +0100
  • 624f5f3f21 [Transformative Agreement] added check to verify the APC were paid by the IReL funder Miriam Baglioni 2023-12-18 15:28:19 +0100
  • 354e02e6a9 [Transformative Agreement] removed not needed class. Read directly the json and no need to pass from the csv Miriam Baglioni 2023-12-18 15:20:27 +0100
  • b00771c7cc [Transformative Agreement] added code to extract relations from the transformative agreement file for the IE products got from OpenAPC Miriam Baglioni 2023-12-18 15:12:44 +0100
  • 9d39845d1f uploaded input parameters on CreateBaseline WF Sandro La Bruzzo 2023-12-18 12:23:12 +0100
  • 15fd93a2b6 uploaded input parameters on CreateBaseline WF Sandro La Bruzzo 2023-12-18 12:21:55 +0100
  • 9d342a47da updated the transformation Baseline workflow to include mdstore rollback/commit action Sandro La Bruzzo 2023-12-18 11:48:57 +0100
  • 1fbd4325f5 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop Sandro La Bruzzo 2023-12-18 11:47:17 +0100
  • 1f1a6a5f5f updated the transformation Baseline workflow to include mdstore rollback/commit action Sandro La Bruzzo 2023-12-18 11:47:00 +0100
  • 3eca5d2e1c - Miriam Baglioni 2023-12-18 09:55:27 +0100
  • 01ce0b9c76 [doiboost - preprocess] remove transition to orcid preparation from sequence of steps at the beginning of the workflow Miriam Baglioni 2023-12-15 12:24:55 +0100
  • 0d8e496a63 - Miriam Baglioni 2023-12-15 12:16:43 +0100
  • c4ec35b6cd Merge pull request 'Master branch updates from beta December 2023' (#369) from beta_to_master_dicember2023 into master Claudio Atzori 2023-12-15 11:18:30 +0100
  • 1726f49790 code formatting Claudio Atzori 2023-12-15 10:37:02 +0100
  • a59be5779e Merge pull request '9078_xml_records_irish_tender' (#368) from 9078_xml_records_irish_tender into beta Claudio Atzori 2023-12-12 12:34:43 +0100
  • ff924215b8 [graph provision] added tests for new peerreviewed field Claudio Atzori 2023-12-12 11:21:30 +0100
  • a6d635e695 Merge branch 'beta' into 9078_xml_records_irish_tender Claudio Atzori 2023-12-12 11:06:42 +0100
  • 98cce5bfb2 code formatting Claudio Atzori 2023-12-12 09:59:05 +0100
  • 84d54643cf [cleaning] allow enriched orcids to pass the cleaning, rule out non-orcid author pids Claudio Atzori 2023-12-12 09:57:00 +0100
  • 7e8eff40c1 [graph provision] added tests for the new model fields Claudio Atzori 2023-12-12 08:54:15 +0100
  • 8752d275fa removed not needed parameter Miriam Baglioni 2023-12-09 15:24:45 +0100
  • d4eedada71 adjusting workflow definition Miriam Baglioni 2023-12-09 15:20:11 +0100
  • aba95ed1d1 code formatting Claudio Atzori 2023-12-08 17:06:19 +0100
  • 2877839df0 Merge pull request '[graph cleaning] added cleaning for result.publisher and result.instance.license' (#366) from clean_license_publisher into beta Claudio Atzori 2023-12-08 16:58:37 +0100
  • 34abd0fc43 Merge branch 'beta' into clean_license_publisher Claudio Atzori 2023-12-08 16:58:27 +0100
  • cb71a7936b [graph cleaning] avoid stack overflow error when navigating Oaf objects declaring an Enum Claudio Atzori 2023-12-07 23:09:54 +0100
  • 70eb1796b2 logging typo Claudio Atzori 2023-12-07 14:08:04 +0100
  • c381bacee0 [enrichment] passing the community API base URL Claudio Atzori 2023-12-07 14:07:11 +0100
  • 336fb31d87 [community_result_propagation] adjusting starting point of workflow Miriam Baglioni 2023-12-07 10:27:25 +0100
  • c0cde53bf6 [bulktagging] setting first step of bulktagging as the copy of the entities and relations not involved in the tagging Miriam Baglioni 2023-12-07 10:08:35 +0100
  • 616622d2bb first version of the workflow single step Miriam Baglioni 2023-12-07 09:59:52 +0100
  • 259c69e446 [orcid enrichment] fixed workflow definition Claudio Atzori 2023-12-06 19:41:53 +0100
  • 431c6bb08a [dedup] added isLookupUrl to the graph consistency workflow definition, required now by the entity grouping phase Claudio Atzori 2023-12-06 11:06:46 +0100
  • 613ec5ffce Add profiles for different spark versions: spark-24, spark-34, spark-35 Giambattista Bloisi 2023-09-21 14:23:37 +0200
  • 52495f2cd2 used javax.xml.stream.XMLEventReader instead of deprecated scala.xml.pull.XMLEventReader Sandro La Bruzzo 2023-09-18 13:58:22 +0200
  • 8c3e9a09d3 added repository openaire-third-parties Sandro La Bruzzo 2023-09-18 12:51:18 +0200
  • 2fa78f6071 Changes required to build and run tests with Java 17 Giambattista Bloisi 2023-09-07 11:58:59 +0200
  • 326c9dc08c Changes in maven poms to build and test the project using Spark 3.4.x and scala 2.12 Giambattista Bloisi 2023-08-02 18:05:53 +0200
  • 982c0c110b Merge pull request '[graph provision] added serialization for the new fields imported from the stats DB' (#365) from 9078_xml_records_irish_tender into beta Claudio Atzori 2023-12-05 16:39:44 +0100
  • 321922772b added serialization for the new fields imported for the Irish tender Claudio Atzori 2023-12-05 16:37:04 +0100
  • c5b7253130 [community_organization propagation] fixed workflow parameters Claudio Atzori 2023-12-05 09:13:33 +0100
  • 3c3bdb8318 [bulktagging] fixed workflow parameters Claudio Atzori 2023-12-05 09:08:48 +0100
  • 7c3041b276 avoid NPEs Claudio Atzori 2023-12-03 16:49:49 +0100
  • 74b185d07b avoid NPEs Claudio Atzori 2023-12-03 16:18:20 +0100
  • e6086efc53 avoid NPEs in Vocabulary.getTermBySynonym Claudio Atzori 2023-12-03 13:33:20 +0100
  • 2a233a89aa [graph grouping] added isLookupUrl to the workflow definition, passed to the grouping spark action Claudio Atzori 2023-12-03 13:32:52 +0100
  • 178a14c491 code formatting Claudio Atzori 2023-12-03 13:31:58 +0100