Commit Graph

140 Commits

Author SHA1 Message Date
Miriam Baglioni 8aa3b4d7c0 adding to propagation constants the ones needed for propagation of project to result and addition of new accumulator Set in typed row to collect values of a type 2020-02-19 14:55:54 +01:00
Miriam Baglioni 7167673a58 implementation and configuration for propagation of project to result through semantic relation: P -> R1 and R1 -> supplemented by -> R2 => P -> R2 2020-02-19 14:54:18 +01:00
Miriam Baglioni b81e6af429 added config for new propagation 2020-02-18 17:30:44 +01:00
Miriam Baglioni b736a9581c changed relclass and reltype in relation specification for country propagation and implementation of propagation of result affiliation through institutional repositories 2020-02-18 17:27:28 +01:00
Miriam Baglioni ed262293a6 aligned to new snapshot version 1.1.6 2020-02-18 17:25:32 +01:00
Miriam Baglioni 2688a89c21 changed relclass and reltype in relation specification 2020-02-18 17:24:40 +01:00
Miriam Baglioni c0022fec9f moved on upper package to serve other propagations 2020-02-18 17:24:11 +01:00
Miriam Baglioni e0a777028a fix problem in parameters 2020-02-18 17:23:34 +01:00
Miriam Baglioni 5868ff8a86 synch fork with master 2020-02-17 18:22:27 +01:00
Miriam Baglioni 18e4092d5c change name of properties dir 2020-02-17 18:07:06 +01:00
Miriam Baglioni bd0e504b42 changes to the wf configuration 2020-02-17 18:04:15 +01:00
Miriam Baglioni 3a9d723655 adding default parameters in code 2020-02-17 16:30:52 +01:00
Claudio Atzori 6a288625e5 fixed workflow outgoing node 2020-02-17 15:04:33 +01:00
Miriam Baglioni a5517eee35 adding the mkdirs for creation of propagation folder under provision on tmp 2020-02-17 14:20:42 +01:00
Miriam Baglioni 9abde5cfac removed outputPath from job parameters 2020-02-17 14:19:53 +01:00
Claudio Atzori 1b18fd4d54 sync with master branch 2020-02-17 13:49:46 +01:00
Sandro La Bruzzo 4f04759738 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-02-17 12:31:58 +01:00
Sandro La Bruzzo 76ee85141a added oozie job for DNET migration and implemented Spark job for extracting entities 2020-02-17 12:31:44 +01:00
Miriam Baglioni be2421d5d8 removed wrongly pushed file 2020-02-17 12:07:26 +01:00
Claudio Atzori c460e2d281 Update 'dhp-workflows/docs/oozie-installer.markdown' 2020-02-17 11:54:48 +01:00
Miriam Baglioni c7bc73aedf country propagation for results collected from institutional repositories 2020-02-17 11:44:48 +01:00
Michele Artini 176c5606bd aligned with origin/master, aligned model and mapping 2020-02-17 10:40:53 +01:00
Claudio Atzori 56d1810a66 working procedure for records indexing using Spark, via lib com.lucidworks.spark:spark-solr 2020-02-14 12:28:52 +01:00
Claudio Atzori 1ee1baa8c0 Merge branch 'master' into provision_indexing 2020-02-13 18:17:07 +01:00
Claudio Atzori a3d0b57b25 [maven-release-plugin] prepare for next development iteration 2020-02-13 18:11:33 +01:00
Claudio Atzori 6ed9a15bc8 [maven-release-plugin] prepare release dhp-1.1.5 2020-02-13 18:11:31 +01:00
Claudio Atzori 49e648f7c3 bumped version 2020-02-13 18:09:31 +01:00
Claudio Atzori f9fae97e09 test json files aligned with the latest model changes 2020-02-13 18:05:59 +01:00
Claudio Atzori 1fee6e2b7e implemented XML records construction and serialization, indexing WIP 2020-02-13 16:53:27 +01:00
Michele Artini 80cb52593f bug fixing 2020-02-13 15:34:13 +01:00
Michele Artini cdea0dae75 bug fixing 2020-02-12 16:34:00 +01:00
Michele Artini 69336195d3 simplifications 2020-02-12 11:12:38 +01:00
Michele Artini 06c2fd6df9 bug fixing 2020-02-11 15:29:50 +01:00
Michele Artini 5fc09b179c bug fixing 2020-02-11 12:48:03 +01:00
Michele Artini 95740767e0 Ready for tests 2020-02-10 16:04:06 +01:00
Michele Artini 181e8498d4 ... 2020-02-07 16:02:49 +01:00
Michele Artini bb1533a07e partial commit 2020-02-05 15:35:40 +01:00
Michele Artini fbb0fc140b partial implementation of migration 2020-02-04 15:25:47 +01:00
Claudio Atzori 7ba0f44d05 WIP 2020-01-30 18:21:07 +01:00
Claudio Atzori 49ef2f4eb1 removed input parameter specification, SparkXmlRecordBuilderJob doesn't need hive 2020-01-30 18:20:26 +01:00
Claudio Atzori b5e1e2e5b2 reintegrated changes from fcbc4ccd70 2020-01-30 18:11:04 +01:00
Claudio Atzori 7bacd6812e Merge branch 'provision_indexing' of https://code-repo.d4science.org/D-Net/dnet-hadoop into HEAD
 Conflicts:
	dhp-workflows/dhp-graph-provision/src/main/java/eu/dnetlib/dhp/graph/GraphJoiner.java
	dhp-workflows/dhp-graph-provision/src/main/java/eu/dnetlib/dhp/graph/MappingUtils.java
	dhp-workflows/dhp-graph-provision/src/main/java/eu/dnetlib/dhp/graph/RelatedEntity.java
	dhp-workflows/dhp-graph-provision/src/main/java/eu/dnetlib/dhp/graph/SparkXmlRecordBuilderJob.java
2020-01-30 17:59:46 +01:00
Claudio Atzori b2691a3b0a save adjacency list as JoinedEntity 2020-01-30 17:46:29 +01:00
Claudio Atzori 8c2aff99b0 joining entities using T x R x S, WIP: last representation based on LinkedEntity type 2020-01-29 15:40:33 +01:00
Claudio Atzori fcbc4ccd70 a bit of docs doesn't hurt 2020-01-24 08:43:23 +01:00
Claudio Atzori a55f5fecc6 joining entities using T x R x S method with groupByKey, WIP: making target objects (T) have lower memory footprint 2020-01-24 08:17:53 +01:00
Michele Artini 6bfe2dc96e partial implementation 2020-01-22 16:00:23 +01:00
Claudio Atzori 799929c1e3 joining entities using T x R x S method with groupByKey 2020-01-21 16:35:44 +01:00
Michele Artini f6eccdde33 partial implementation 2020-01-21 14:17:05 +01:00
Michele Artini cd114f1c3b partial update 2020-01-21 12:32:10 +01:00