Commit Graph

431 Commits

Author SHA1 Message Date
Sandro La Bruzzo 2b48a2c32c Merge branch 'doiboost' of code-repo.d4science.org:D-Net/dnet-hadoop into doiboost 2020-05-11 09:38:36 +02:00
Sandro La Bruzzo 4cebca09d2 start implementing MAG mapping 2020-05-11 09:38:27 +02:00
Enrico Ottonello b9d126dd1f formatting modified after commit 2020-05-08 14:54:37 +02:00
Enrico Ottonello 7e1c987370 Merge branch 'doiboost' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doiboost 2020-05-08 14:49:50 +02:00
Enrico Ottonello 9d812788e4 added job to download from ORCID the records modified after a fixed date; the info is taken from last_modified.csv on HDFS 2020-05-08 14:49:39 +02:00
Sandro La Bruzzo 1e06bbaee8 fixed test 2020-04-30 11:38:58 +02:00
Sandro La Bruzzo b8e95295e2 merged from master 2020-04-30 11:27:59 +02:00
Michele Artini eb9bd42970 fixed a problem with journals 2020-04-30 11:06:05 +02:00
Michele Artini a0a6109bbc fixed a problem with journals 2020-04-30 11:03:46 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Claudio Atzori 8fd81e863d added default value for the external_stats_db_name 2020-04-29 15:36:24 +02:00
Claudio Atzori c6f3ff4462 stats workflow content relocated into common package; added <global> property definitions in stats workflow.xml 2020-04-29 14:29:27 +02:00
Sandro La Bruzzo 4a89465740 reformatted code 2020-04-29 13:24:29 +02:00
Sandro La Bruzzo a6b1a59d0a merged with master 2020-04-29 13:20:57 +02:00
Sandro La Bruzzo 920c0f19c3 Merge branch 'doiboost' of code-repo.d4science.org:D-Net/dnet-hadoop into doiboost 2020-04-29 13:13:16 +02:00
Sandro La Bruzzo 09f161f1f4 implemented unit test 2020-04-29 13:13:02 +02:00
miconis e0d14fe4f8 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-04-29 13:02:53 +02:00
miconis 0352d3b0ba entity dumps in dedup compressed 2020-04-29 13:02:34 +02:00
Michele Artini c43b4c8962 formatting 2020-04-29 12:56:58 +02:00
Michele Artini a5d7007005 Fix relations in migration; Fix pom.xml in dhp-stats-update 2020-04-29 12:05:41 +02:00
Claudio Atzori 3616d0f88d Merge pull request 'Adding the stats workflow to the dnet-hadoop hierarchy' (#6) from spyros/dnet-hadoop:master into master; Integrating stats update workflow. 2020-04-29 10:35:02 +02:00
Claudio Atzori 964972d29a added data provision workflow definition WIP 2020-04-29 09:25:50 +02:00
Enrico Ottonello 1edcd53581 added shell actions to download all 11 activities files from ORCID 2020-04-28 20:25:09 +02:00
miconis 62e467eb0c assertion numbers updated to fit the new implementation of the pace-core 2020-04-28 11:46:23 +02:00
Claudio Atzori 6f5b899038 reformatted code according to the updated style descriptor 2020-04-28 11:23:29 +02:00
Claudio Atzori ac25f2d8d1 integrated changes from master 2020-04-28 08:55:28 +02:00
Claudio Atzori a0bdbacdae switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:52:31 +02:00
Claudio Atzori 7a3f8085f7 switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:45:40 +02:00
Michele Artini 1260d03eba skip empty projects 2020-04-27 13:51:13 +02:00
Enrico Ottonello a1861b9eaa workflow works in parallel on 2 activity files 2020-04-24 18:33:37 +02:00
Enrico Ottonello 941e94af06 added workflow for generating authors with dois data sequence file 2020-04-24 15:50:40 +02:00
Claudio Atzori 268462623a refined definition of equals and hash methods for Oaf model classes, now based on entity identifier, while relations consider sourceid, targetid and relationship semantic; Factored out function to group Oaf objects in grouping operations; Raw graph creation procedure merges entities and relationships providing the same identity 2020-04-24 14:42:01 +02:00
Claudio Atzori a3e480d1c9 implemented DispatchEntitiesApplication using spark2 datasets 2020-04-24 14:36:53 +02:00
Claudio Atzori 48157e0fc4 GraphHiveImporterJob moved to a dedicated package 2020-04-24 14:32:28 +02:00
Claudio Atzori 278fc9d276 code formatting 2020-04-23 18:51:38 +02:00
miconis 5414236644 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-04-23 18:17:23 +02:00
miconis 8d258c85ff spark dedup test fixed, sample for dataset and orp added, test implemented 2020-04-23 18:16:20 +02:00
Michele Artini 072eae3803 fixed a problem with missing contexts 2020-04-23 16:42:49 +02:00
Michele Artini b164d96874 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-04-23 16:19:16 +02:00
Michele Artini d920ce501e fixed a problem with missing instances 2020-04-23 16:18:40 +02:00
Sandro La Bruzzo fdc0523e4c Merge remote-tracking branch 'origin/master' into doiboost 2020-04-23 09:34:13 +02:00
Sandro La Bruzzo 4ba386d996 improved crossref mapping 2020-04-23 09:33:48 +02:00
Claudio Atzori 8851050814 replaced hive_db_name with hiveDbName 2020-04-23 08:36:40 +02:00
Claudio Atzori 91f81107b1 applying code formatting 2020-04-23 07:52:32 +02:00
Claudio Atzori 1e7583c5a6 filtered invisible records in data provision workflow 2020-04-23 07:51:34 +02:00
Claudio Atzori 9ddafd46ca fixed dedup record id prefix, set the correct dataInfo in the DedupRecordFactory 2020-04-23 07:50:18 +02:00
Claudio Atzori ade4cb97af fixed parameters passed to the postprocessing action in the workflow mapping the graph as hive DB 2020-04-22 18:24:06 +02:00
Sandro La Bruzzo bb6c9785b4 Merge remote-tracking branch 'origin/master' into doiboost 2020-04-22 15:00:57 +02:00
Sandro La Bruzzo 157915988c improved crossref mapping 2020-04-22 15:00:44 +02:00