Commit Graph

220 Commits

Author SHA1 Message Date
Michele Artini e7167b996a logs and closeable 2020-03-04 10:46:36 +01:00
Claudio Atzori 9cf5ce2e66 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-03-02 17:03:10 +01:00
Claudio Atzori bc7cfd5975 indexing workflow WIP: fixed projects fundingtree xml conversion, prioritized links between results and projects when limiting them to 100 in the join procedure 2020-03-02 17:03:07 +01:00
Michele Artini 4b29a121b0 migration using spark in step2 2020-03-02 16:12:14 +01:00
Michele Artini 5445a57102 migration using spark in step2 2020-03-02 16:11:59 +01:00
Claudio Atzori 60bc2b1a20 drop the hive DB before populating it from scratch 2020-02-27 10:10:55 +01:00
Michele Artini 689908b2e9 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-02-25 16:00:51 +01:00
Michele Artini 93665773ea Fixed a problem with JavaRDD Union 2020-02-25 15:59:21 +01:00
Claudio Atzori 6a73fd5da5 in order to reuse the same XmlRecordFactory across different tasks, the state of contexts must be one per record built 2020-02-21 09:17:19 +01:00
Michele Artini 4c94e74a84 Added a missing dependency 2020-02-20 11:43:32 +01:00
Michele Artini d49cd2fdc6 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-02-20 11:21:54 +01:00
Claudio Atzori d42dde52ba implemented method to merge relations 2020-02-19 17:29:05 +01:00
Claudio Atzori 5e5e32cb48 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-02-19 16:56:52 +01:00
Claudio Atzori 33185fd0b7 ISLookupClientFactory moved in dhp-common 2020-02-19 16:56:38 +01:00
Michele Artini 5d3739b5cf migration of claims 2020-02-19 15:11:17 +01:00
Michele Artini 173f1df1e5 saved a query for openaire production database 2020-02-19 10:15:08 +01:00
Sandro La Bruzzo 9a2d74ac82 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-02-19 10:13:45 +01:00
Sandro La Bruzzo e5d7cdf422 fixed sql query 2020-02-19 10:13:36 +01:00
Claudio Atzori ed76521d9b removed stale test resources, will be re-added later on 2020-02-18 11:51:08 +01:00
Claudio Atzori 0f364605ff removed stale tests, need to reimplement them anyway 2020-02-18 11:48:19 +01:00
Claudio Atzori 6a288625e5 fixed workflow outgoing node 2020-02-17 15:04:33 +01:00
Claudio Atzori 1b18fd4d54 sync with master branch 2020-02-17 13:49:46 +01:00
Claudio Atzori 5bae30f399 adding readme for dhp-schema 2020-02-17 13:38:33 +01:00
Sandro La Bruzzo 4f04759738 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-02-17 12:31:58 +01:00
Sandro La Bruzzo 76ee85141a added oozie job for DNET migration and implemented Spark job for extracting entities 2020-02-17 12:31:44 +01:00
Claudio Atzori c460e2d281 Update 'dhp-workflows/docs/oozie-installer.markdown' 2020-02-17 11:54:48 +01:00
Sandro La Bruzzo fe93c709f1 Merge branch 'master' of michele.artini/dnet-hadoop into master 2020-02-17 10:43:08 +01:00
Michele Artini 176c5606bd aligned with origin/master, aligned model and mapping 2020-02-17 10:40:53 +01:00
Claudio Atzori 56d1810a66 working procedure for records indexing using Spark, via lib com.lucidworks.spark:spark-solr 2020-02-14 12:28:52 +01:00
Claudio Atzori 1ee1baa8c0 Merge branch 'master' into provision_indexing 2020-02-13 18:17:07 +01:00
Claudio Atzori a3d0b57b25 [maven-release-plugin] prepare for next development iteration 2020-02-13 18:11:33 +01:00
Claudio Atzori 6ed9a15bc8 [maven-release-plugin] prepare release dhp-1.1.5 2020-02-13 18:11:31 +01:00
Claudio Atzori 49e648f7c3 bumped version 2020-02-13 18:09:31 +01:00
Claudio Atzori f9fae97e09 test json files aligned with the latest model changes 2020-02-13 18:05:59 +01:00
Claudio Atzori 11cfd6bd9a integrated changes from master branch 2020-02-13 17:27:07 +01:00
Claudio Atzori bbf1b611b9 refereed, processingchargeamount and processingchargecurrency moved inside the Instance element. Introduced specific type to model Result's countries 2020-02-13 17:21:11 +01:00
Claudio Atzori 1fee6e2b7e implemented XML records construction and serialization, indexing WIP 2020-02-13 16:53:27 +01:00
Claudio Atzori 956da2f923 added Saxon-HE extension functions and Transformer factory class 2020-02-13 16:49:45 +01:00
Michele Artini 80cb52593f bug fixing 2020-02-13 15:34:13 +01:00
Michele Artini cdea0dae75 bug fixing 2020-02-12 16:34:00 +01:00
Michele Artini 69336195d3 simplifications 2020-02-12 11:12:38 +01:00
Michele Artini 06c2fd6df9 bug fixing 2020-02-11 15:29:50 +01:00
Michele Artini 5fc09b179c bug fixing 2020-02-11 12:48:03 +01:00
Michele Artini 95740767e0 Ready for tests 2020-02-10 16:04:06 +01:00
Sandro La Bruzzo 7f11d06a1f upgraded version of dnet-pace-core in pom.xml 2020-02-10 12:58:59 +01:00
Michele Artini 181e8498d4 ... 2020-02-07 16:02:49 +01:00
Michele Artini bb1533a07e partial commit 2020-02-05 15:35:40 +01:00
Michele Artini fbb0fc140b partial implementation of migration 2020-02-04 15:25:47 +01:00
Claudio Atzori d3b96f102b builder pattern screws up the Parquet schema inference method, avoid using it in the bean definitions 2020-02-04 14:10:58 +01:00
Claudio Atzori ed290ca8d7 builder pattern 2020-02-03 10:35:51 +01:00