Commit Graph

455 Commits

Author SHA1 Message Date
Miriam Baglioni bfe8f5335c GetCSV refactoring - copied model classes in test path 2021-08-12 17:58:14 +02:00
Miriam Baglioni 6e84b3951f GetCSV refactoring - moving classes to dhp-common that have dependency with GetCSV class (that was located in graph-mapper) 2021-08-12 17:57:41 +02:00
Miriam Baglioni 9650eea497 reverting 2021-08-11 17:45:48 +02:00
Miriam Baglioni cc3d72df0e removing not needed dependency 2021-08-11 17:42:01 +02:00
Miriam Baglioni f9b6b45d85 reverting 2021-08-11 17:04:48 +02:00
Miriam Baglioni 8da3a25cf6 merging with branch beta 2021-08-11 15:55:34 +02:00
Claudio Atzori 2ee21da43b suggestions from SonarLint 2021-08-11 12:13:22 +02:00
Miriam Baglioni 6bd1eca7e0 merge branch with beta 2021-08-05 15:23:32 +02:00
Miriam Baglioni ee13da9258 merge branch with master 2021-08-05 11:34:20 +02:00
Miriam Baglioni 1d6ac3715b merge branch with beta 2021-07-30 11:58:29 +02:00
Claudio Atzori a9961a1835 [cleaning] title cleaning based on the me.xuender:unidecode library 2021-07-28 16:36:33 +02:00
Claudio Atzori 6dddad86ee [cleaning] title cleaning based on the me.xuender:unidecode library 2021-07-28 16:21:29 +02:00
Miriam Baglioni 74f801b689 merging with branch beta 2021-07-27 13:18:31 +02:00
Miriam Baglioni 35e395eae8 merge with master 2021-07-27 12:34:59 +02:00
Miriam Baglioni eb07f7f40f Hosted By Map 2021-07-27 12:27:26 +02:00
Claudio Atzori bc835d2024 [cleaning] fixed filtering function for missing titles 2021-07-23 11:56:13 +02:00
Claudio Atzori ffdb2a3ea3 [cleaning] fixed filtering function for missing titles 2021-07-23 11:55:55 +02:00
Sandro La Bruzzo 62ae36a3d2 fixed NPE 2021-07-22 15:41:38 +02:00
Miriam Baglioni 63553a76b3 added code to download gold issn list from unibi 2021-07-22 12:01:48 +02:00
Sandro La Bruzzo d94565862a fixed NPE 2021-07-21 21:23:11 +02:00
Sandro La Bruzzo 31d2d6d41e Scholexplorer: introduction of dedup openaire 2021-07-21 18:09:32 +02:00
Miriam Baglioni d418c309f5 removed the part after part-x- in the file name generated by spark. It was too long and created problems while creating the tar entries 2021-07-13 17:11:49 +02:00
Sandro La Bruzzo ad50415167 Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer 2021-06-24 17:20:50 +02:00
Claudio Atzori 67afd06cd1 [cleaning] cleaning instance.pid and instance.alternateidentifier using the same procedure used to clean result.pid 2021-06-24 12:10:17 +02:00
Sandro La Bruzzo cc0f2b11fb Implemented mapping from pubmed baseline to OAF 2021-06-16 14:56:24 +02:00
Claudio Atzori 2039bb9f5f orcid / orcid_pending cleaning backported from master branch 2021-06-14 09:40:50 +02:00
Claudio Atzori a900bfb874 delegating the date parsing to https://github.com/sisyphsu/dateparser 2021-06-11 16:53:01 +02:00
Claudio Atzori eb6acfbabc [cleaning] removing non parsable relation.validationDate(s) 2021-05-28 10:50:44 +02:00
Claudio Atzori 9d725efdc1 reverted implementation of the mdstore client 2021-05-20 18:26:09 +02:00
Claudio Atzori 23b8883ab1 applied intellij code cleanup 2021-05-14 10:58:12 +02:00
Claudio Atzori d4c3476152 mapping datasource.journal only when an issn is available, null otherwise 2021-05-11 11:08:54 +02:00
Claudio Atzori d1cbee8413 imported methods from CleaningFunctions, defined in GraphCleaningFunctions 2021-05-10 16:43:39 +02:00
Claudio Atzori 3797543600 MDStoreManager model classes moved in dhp-schemas 2021-05-10 14:32:05 +02:00
Claudio Atzori b1785ba77c alternative way to set timeouts for the ISLookup client 2021-05-05 11:23:46 +02:00
Claudio Atzori 923d19ea8e mdstore read lock/unlock when bulk copying records from mongodb to hdfs 2021-05-04 18:06:21 +02:00
Claudio Atzori 91e7220f20 cleaned up workflow for actionset migration, adjusted dnet|cnr* dependency versions 2021-04-29 10:09:52 +02:00
Claudio Atzori 5afa7d3e0c core utilities in dhp-common moved in external module dhp-schemas 2021-04-27 15:44:01 +02:00
Claudio Atzori f783e60ff7 cleanup 2021-04-27 14:04:50 +02:00
Claudio Atzori 27ab8a704d adjusted poms to align with the external dhp-schema module 2021-04-27 10:12:27 +02:00
Claudio Atzori c2bb03c8b5 depending on external dhp-schemas module 2021-04-23 17:57:35 +02:00
Claudio Atzori 8704d32780 code formatting 2021-04-15 16:52:58 +02:00
Claudio Atzori ba4b4c74d8 do not make the identifier prefix depend on the Handle 2021-04-15 16:48:26 +02:00
Claudio Atzori 710cd1e8f2 Merge pull request 'add xslt, personname cleaner' (#104) from andreas.czerniak/BrStableId_dnet-hadoop:stable_ids into stable_ids (Reviewed-on: #104; LGTM) 2021-04-13 14:43:05 +02:00
Claudio Atzori d1ca025b0b [cleaning] removing authors without fullname or providing 'deactivated' keyword. Removing test test titles 2021-04-13 14:32:41 +02:00
Andreas Czerniak d7614c1f85 introduce new const 2021-04-13 07:04:27 +02:00
Claudio Atzori 902d05f548 [cleaning] avoiding NPEs handling null author PIDs 2021-04-12 17:31:40 +02:00
Claudio Atzori 72ce741ea6 WIP: using common definitions from ModelConstants 2021-03-31 17:07:13 +02:00
Claudio Atzori 27681b876c code formatting 2021-03-29 17:47:11 +02:00
miconis 2709d08fc2 Merge branch 'stable_ids' into openorgswf 2021-03-29 16:39:07 +02:00
Claudio Atzori 3becaa5539 [Cleaning] drop alternate identifiers with empty values 2021-03-29 16:01:35 +02:00
Claudio Atzori 48f2b6127e [Cleaning] drop alternate identifiers with empty values 2021-03-29 14:23:18 +02:00
miconis 2355cc4e9b minor changes and bug fix 2021-03-29 10:07:12 +02:00
Claudio Atzori b5b7dc2104 [Cleaning] drop alternate identifiers with empty values 2021-03-26 12:30:00 +01:00
Claudio Atzori 827e7e37db [Cleaning] drop instance.alternateIdentifier elements when they are available among instance.pid 2021-03-25 11:07:59 +01:00
Claudio Atzori 431cbe9955 handle missing instance.pid during bulk cleaning 2021-03-23 09:28:58 +01:00
Sandro La Bruzzo c73072079d fix conflicts 2021-03-22 16:36:31 +01:00
Claudio Atzori 3256b9c836 code formatting 2021-03-19 09:36:12 +01:00
Claudio Atzori 75144dacb3 Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids 2021-03-19 09:07:40 +01:00
Claudio Atzori 9588bfba81 [cleaning] entries available as PIDs must not appear as alternateIdentifier 2021-03-19 09:07:30 +01:00
Sandro La Bruzzo 25d5663d97 added filter 2021-03-18 10:24:42 +01:00
Sandro La Bruzzo 5f98ea74a9 Added fix for pid generation in stableIds 2021-03-17 15:53:24 +01:00
Claudio Atzori 734232d3b9 identifier factory doesn't depend on pre-existing entity.id 2021-03-17 15:14:53 +01:00
Claudio Atzori a3dac32f16 pidFilter a bit more permissive 2021-03-17 15:06:05 +01:00
Claudio Atzori 8257f9a2bc result.pid: adjusted the mapping applied to the contents from the aggregator 2021-03-17 12:45:38 +01:00
Claudio Atzori 3b2da86f0a added precondition on IdentifierFactory to check the presence of entity.id 2021-03-16 17:05:38 +01:00
Claudio Atzori 640b885706 added instance.alternativeIdentifiers to the graph model, adjusted the mapping applied to the contents from the aggregator 2021-03-16 14:19:32 +01:00
Claudio Atzori f74e464942 create bestaccessright as Qualifier 2021-03-10 15:40:05 +01:00
Claudio Atzori c801ab6c1d minor 2021-03-09 17:22:31 +01:00
Claudio Atzori 9917d7e01c PID authorities include ArXiv 2021-03-09 17:12:52 +01:00
Claudio Atzori 01630f638d IdentifierFactory implementation based on the list of datasources authoritative for a given pid type 2021-03-09 17:11:50 +01:00
Claudio Atzori b3f3b895e5 [#6282 open access status in the Graph] OAStatus renamed as openAccessRoute 2021-03-09 11:41:11 +01:00
Claudio Atzori 765f9bdee7 merged from dhp_oaf_model 2021-03-09 11:37:41 +01:00
Claudio Atzori d525785497 [#6282 open access status in the Graph] Result.Instance.accessRight defined with dedicated data type that includes the open access color. 2021-03-09 11:12:55 +01:00
Claudio Atzori 8d2bb24512 merged from master 2021-03-08 15:44:34 +01:00
Claudio Atzori fa7930d2e2 merging contributions from PR#97 2021-03-05 15:45:28 +01:00
Claudio Atzori ec80b7ade3 code formatting 2021-03-03 10:22:53 +01:00
Claudio Atzori b73dce3e3a more logging on the MDStore mongodb client. Forcing UTF_8 encoding on the content 2021-03-03 10:17:16 +01:00
Claudio Atzori e76c4f62c1 MetadataRecord moved in dhp-schemas 2021-02-26 10:58:48 +01:00
Claudio Atzori b830e33392 mdstore collector plugin 2021-02-25 12:30:30 +01:00
Claudio Atzori dc98c39500 more logging 2021-02-25 12:29:18 +01:00
Claudio Atzori fc3fa5e343 implemented mdstore collector plugin 2021-02-24 15:07:24 +01:00
Claudio Atzori cf27905a71 WIP: collectorWorker error reporting, added report messages 2021-02-16 16:53:14 +01:00
Claudio Atzori 58288a95b8 WIP: collectorWorker error reporting, added report messages 2021-02-15 15:28:53 +01:00
Claudio Atzori 1abe6d1ad7 WIP: collectorWorker error reporting, added report messages 2021-02-15 15:08:59 +01:00
Claudio Atzori 29c6f7e255 classes related to the collection workflow moved into common package; implemented MongoDB collection plugins 2021-02-12 12:31:02 +01:00
Claudio Atzori 50add4c61b added requestDelay to HttpConnector2 configuration; Aggregation workflow constants moved in dhp-common 2021-02-08 12:19:38 +01:00
Claudio Atzori 40df0f987d better logging, WIP: collectorWorker error reporting; common functions moved in DHPUtils 2021-02-06 20:12:00 +01:00
Claudio Atzori a8a758925e better logging, WIP: collectorWorker error reporting 2021-02-05 19:18:05 +01:00
Michele Artini 2ee0c3e47e http entity as json string 2021-02-05 09:45:39 +01:00
Claudio Atzori 730973679a Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator 2021-02-04 17:25:00 +01:00
Claudio Atzori deb85706db imported HttpConnector from https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-modular-collector-service/trunk/src/main/java/eu/dnetlib/data/collector/plugins/HttpConnector.java as HttpConnector2 2021-02-04 17:24:52 +01:00
Sandro La Bruzzo 4dae5e605d implemented messaging between collection worker and Dnet 2021-02-04 15:51:15 +01:00
Claudio Atzori 72c57b28fa switched project version to 1.2.4-branch_hadoop_aggregator-SNAPSHOT 2021-02-04 14:08:18 +01:00
Claudio Atzori 40764cf626 better logging, WIP: collectorWorker error reporting 2021-02-04 14:06:02 +01:00
Michele Artini 26d2eb946f messages sender 2021-02-04 09:45:46 +01:00
Michele Artini 1b9731632b Message Sender 2021-02-03 16:42:36 +01:00
Michele Artini 820d729e99 recover of Message and MessageType class 2021-02-03 16:20:34 +01:00
Claudio Atzori 0e8a4f9f1a better logging, WIP: collectorWorker error reporting 2021-02-03 12:33:41 +01:00
Claudio Atzori d62ea1490d cleaned up RabbitMQ stuff 2021-02-02 10:53:19 +01:00
Claudio Atzori 73d772a4b4 added method to list the known vocabulary names 2021-02-02 10:39:47 +01:00
Claudio Atzori 8eaa1fd4b4 WIP: metadata collection in INCREMENTAL mode and relative test 2021-02-01 19:29:10 +01:00
Sandro La Bruzzo 6ff234d81b Implemented a first prototype of incremental harvesting and transformation using readlock 2021-02-01 13:56:05 +01:00
Sandro La Bruzzo 0276180039 WIP mdstore transaction implemented on hadoop side 2021-01-29 16:42:41 +01:00
Michele Artini d942d0c77d methods toString(), hashCode() and equals() 2021-01-29 13:16:48 +01:00
Michele Artini 38f2508c87 new fields in mdstore beans 2021-01-28 08:24:45 +01:00
Sandro La Bruzzo a54848a59c Moved Vocabulary stuff to common module 2021-01-25 15:43:04 +01:00
Claudio Atzori 28460c2cd1 using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper 2020-12-23 16:59:52 +01:00
Claudio Atzori 6848d0c3d7 trivial: avoid duplicated code 2020-12-23 12:21:58 +01:00
Claudio Atzori d8b5f43a7e code formatting 2020-12-22 14:59:03 +01:00
miconis 794e22b09c bug fix in the authormerge: now authors with higher size have priority, normalization of author name fixed 2020-12-21 17:51:42 +01:00
Claudio Atzori 12e2f930c8 resolved conflicts 2020-12-10 10:57:39 +01:00
Alessia Bardi 112da6d76a in theory, just auto-formatting after mvn compile 2020-12-09 20:00:27 +01:00
Miriam Baglioni 6fbc67a959 using ModelConstant.ORCID and removing not used constants 2020-12-09 17:10:20 +01:00
Claudio Atzori 3c5ce1dada code formatting 2020-12-09 17:07:20 +01:00
Miriam Baglioni 212b52614f added graph mapper versus community result without context and project in common to be used for the doiboost mapping 2020-12-09 16:59:02 +01:00
Claudio Atzori 491ad24750 introduced filtering for DOIs in graph cleaning workflow 2020-12-09 09:10:33 +01:00
Claudio Atzori 943b961cf6 introduced PidBlacklist 2020-12-02 09:30:34 +01:00
Claudio Atzori 893ac4a77b GenerateEntitiesApplication can be configured to hash the id value or not 2020-12-02 09:30:06 +01:00
Claudio Atzori 349e7246aa do not consider NCID, GBIF as PIDs candidate for the ID creation 2020-11-30 16:52:40 +01:00
Claudio Atzori 2c407e775e GenerateEntitiesApplication can be configured to hash the id value or not 2020-11-30 12:00:38 +01:00
Claudio Atzori 758d27745d cleaning tab characters from text fields 2020-11-27 16:07:24 +01:00
Claudio Atzori 596a2a459d added testing class for OafMapperUtils 2020-11-27 12:01:11 +01:00
Claudio Atzori fa66e5b6b8 ResultTypeComparator gives priority to Records collectedfrom Crossref 2020-11-26 13:09:19 +01:00
Claudio Atzori d0d5525d40 minor changes 2020-11-26 11:04:17 +01:00
Miriam Baglioni 66c0e3e574 changed because of #61 (comment) 2020-11-25 17:52:17 +01:00
Claudio Atzori 1372a4d1bf fixed merging method 2020-11-25 16:05:51 +01:00
Claudio Atzori dfd6205b95 Consistency graph workflow merges all the entities by ID 2020-11-25 14:55:32 +01:00
Claudio Atzori e1a1bb3ee4 moved class CleaningFunctions in the correct package. Remove newlines from titles, descriptions, subjects 2020-11-24 18:34:03 +01:00
Claudio Atzori e43ab07af6 code formatting 2020-11-24 14:41:39 +01:00
Miriam Baglioni 73dbb79602 removed the check for the community name in the common version on MakeTar 2020-11-24 14:36:15 +01:00
Claudio Atzori c016cc050a IdentifierFactory: in case a record provides more than one pid of the same type, the lexicographically lower value is chosen as best pick 2020-11-23 19:16:40 +01:00
Claudio Atzori 3f34757c63 merged from master 2020-11-19 14:34:54 +01:00
Claudio Atzori 2bed29eb09 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:12 +01:00
Claudio Atzori 13e36a4da0 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:02 +01:00
Claudio Atzori 9b0fb9e958 merged from master 2020-11-12 09:27:12 +01:00
Miriam Baglioni f8e9bda24c merge branch with master 2020-11-05 16:31:18 +01:00
Miriam Baglioni 7ebdfacee9 removed commented code and added documentation to new method 2020-11-05 16:30:36 +01:00
Claudio Atzori 4625b7486e code formatting 2020-11-04 18:12:43 +01:00
Claudio Atzori e5da4ee9b1 dedup workflow using the common PidComparator 2020-11-04 15:02:02 +01:00
Claudio Atzori ea2a0ea949 IdentifierFactory considers only DOIs matching a given regex 2020-11-03 18:43:37 +01:00
Miriam Baglioni d4382b54df moved the tar archive with max size on common module 2020-11-03 16:54:50 +01:00
Claudio Atzori 86d6fbe95b refactoring: CleaningFunctions and OafMapperUtils moved in dhp-common 2020-11-03 12:19:46 +01:00
Claudio Atzori 3fcd669e99 result merge operation leverage on custom ResultTypeComparator in the aggregator graph construction 2020-11-03 10:53:23 +01:00
Claudio Atzori 78c3c1b62b exclude pid values set to 'none' 2020-11-02 14:25:26 +01:00
Claudio Atzori 09e44dabff Merge branch 'master' into stable_ids 2020-11-02 12:16:01 +01:00
Miriam Baglioni 10d8bbada8 changed deprecated method with non deprecated version 2020-10-30 14:10:10 +01:00
Claudio Atzori 58f28296ea ProvisionConstants moved as ModelHardLimits in dhp-common and applied to truncate long abstracts (len > 150000). Further filtering for empty PID values 2020-10-30 10:56:42 +01:00
Miriam Baglioni 4cf4454341 changed from deprecated method to new one 2020-10-27 17:46:19 +01:00
Miriam Baglioni c8f32dd109 - 2020-10-27 17:45:58 +01:00
Miriam Baglioni 3582eba565 - 2020-10-27 17:31:33 +01:00
Miriam Baglioni 3241ec1777 added connection timeout and socket timeout 600 sec 2020-10-27 16:12:11 +01:00
Miriam Baglioni cc68855a1e merge upstream 2020-10-27 15:54:16 +01:00
Miriam Baglioni 1cb60aede4 added connection timeout and socket timeout 600 sec 2020-10-27 15:53:02 +01:00
sandro 3a81a940b7 solved bug on merge publication 2020-10-21 22:41:55 +02:00
Claudio Atzori c188868450 Merge branch 'master' into stable_ids 2020-10-16 12:06:23 +02:00
Miriam Baglioni 959f30811e added connection timeout and socket timeout 600 sec 2020-10-16 10:52:30 +02:00
Sandro La Bruzzo 734934e2eb fixed error on empty intersection with publication and relation on export to OAF 2020-10-08 17:29:29 +02:00
Sandro La Bruzzo eec418cd26 moved AuthoreMerger into dhp-common 2020-10-08 10:33:55 +02:00
Claudio Atzori 8958f20813 code formatting 2020-10-07 13:14:31 +02:00
Claudio Atzori 1abcabb6e6 WIP stable ids: IdentifierFactory & unit test 2020-10-06 18:55:23 +02:00
Claudio Atzori 6ce340bd3d WIP stable ids: IdentifierFactory 2020-10-06 15:44:53 +02:00
Claudio Atzori 49ae3450a9 code formatting 2020-10-02 09:43:24 +02:00
Claudio Atzori 1c44182dea minor changes 2020-10-02 09:41:34 +02:00
Miriam Baglioni ccd48dd78a added new test for new method 2020-09-25 16:33:43 +02:00
Miriam Baglioni 3e5497b336 added new method to handle an open deposition to which upload data 2020-09-25 16:33:15 +02:00
Claudio Atzori 8a523474b7 code formatting 2020-09-07 11:40:16 +02:00
Miriam Baglioni c7f944a533 refactoring due to compilation 2020-08-19 10:01:26 +02:00
Miriam Baglioni 02a4986e7b Applying changes from code reviews #40 (comment) and #40 (comment) and #40 (comment) 2020-08-13 11:53:01 +02:00
Miriam Baglioni 33a6a51333 Disabled Test (impossible to publish without accessToken). And applying changes from code review #40 (comment) 2020-08-13 11:48:32 +02:00
Miriam Baglioni 306603272e removed accession token 2020-08-12 09:39:58 +02:00
Miriam Baglioni 30a2b19b65 changed metadata for deposition of covid-19 dump in Zenodo 2020-08-11 17:36:56 +02:00
Miriam Baglioni 10e2af3d3b - 2020-08-11 16:08:06 +02:00
Miriam Baglioni 0603ec4757 changed test to upload the dump for covid-19 community 2020-08-11 15:43:25 +02:00
Miriam Baglioni ecd2081f84 refactoring 2020-08-11 14:17:31 +02:00
Miriam Baglioni a6df01d329 - 2020-08-10 11:59:10 +02:00
Miriam Baglioni bcc70dce5e added dependency to POM 2020-08-07 16:46:18 +02:00
Miriam Baglioni 7e54b2189c resources for the test of the automatic deposition in Zenodo 2020-08-07 16:45:59 +02:00
Miriam Baglioni d161fa2aeb test for the automatic deposition in Zenodo 2020-08-07 16:45:28 +02:00
Miriam Baglioni 545ea9f77e moved in common. Zenodo response model and APIClient to deposit in Zenodo 2020-08-07 16:44:51 +02:00
Claudio Atzori 93052ae384 WIP: set the connect & request timeout for BindingProvider service implementation 2020-06-25 16:16:02 +02:00
Claudio Atzori 9cd27183b6 [maven-release-plugin] prepare for next development iteration 2020-06-22 11:27:44 +02:00
Claudio Atzori 1e3dab0631 [maven-release-plugin] prepare release dhp-1.2.3 2020-06-22 11:27:39 +02:00
Claudio Atzori 0d52816244 WIP: graph cleaner implementation 2020-06-13 13:06:04 +02:00
Claudio Atzori c4d9f1837f [maven-release-plugin] prepare for next development iteration 2020-06-12 12:21:08 +02:00
Claudio Atzori f0746a7605 [maven-release-plugin] prepare release dhp-1.2.2 2020-06-12 12:21:03 +02:00
Claudio Atzori 463489f59f code formatting 2020-06-12 12:03:25 +02:00
miconis fa8c5bcd39 javadoc for the PacePerson class and implementation of a unit test 2020-06-11 12:19:32 +02:00
Miriam Baglioni 54d869e618 merge upstream 2020-05-26 09:22:04 +02:00
Claudio Atzori 7582532e73 [maven-release-plugin] prepare for next development iteration 2020-05-25 19:48:18 +02:00
Claudio Atzori 01c2e93395 [maven-release-plugin] prepare release dhp-1.2.1 2020-05-25 19:48:14 +02:00
Miriam Baglioni 8f6ce970f9 moved PacePerson to dhp-common to avoid conflict in dependency with graph-mapper 2020-05-25 10:25:55 +02:00
Miriam Baglioni 2abb84877d Merge branch 'master' into blacklist 2020-05-11 10:37:49 +02:00
Miriam Baglioni 871e079b45 merged with master 2020-05-11 10:20:00 +02:00
Claudio Atzori 60c40618d3 [maven-release-plugin] prepare for next development iteration 2020-05-11 10:17:14 +02:00
Claudio Atzori c267d958d5 [maven-release-plugin] prepare release dhp-1.2.0 2020-05-11 10:17:10 +02:00
Claudio Atzori 42f1a2bf94 bumped project version to 1.2.0-SNAPSHOT 2020-05-11 10:05:57 +02:00
Claudio Atzori 0ccc864ad9 [maven-release-plugin] prepare for next development iteration 2020-05-08 17:01:31 +02:00
Claudio Atzori 6e47c724c6 [maven-release-plugin] prepare release dhp-1.1.7 2020-05-08 17:01:27 +02:00
Miriam Baglioni 4c94231cad merge with master fork 2020-05-08 12:25:57 +02:00
Miriam Baglioni 31ea05297d moved the DbClient to common and added needed dependency to pom 2020-05-04 12:22:28 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Miriam Baglioni f7695e833c resolved conflicts 2020-04-29 11:41:31 +02:00
Claudio Atzori 6f5b899038 reformatted code according to the updated style descriptor 2020-04-28 11:23:29 +02:00
Claudio Atzori a0bdbacdae switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:52:31 +02:00
Claudio Atzori 7a3f8085f7 switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:45:40 +02:00
Claudio Atzori ad7a131b18 introduced common project code formatting plugin, works on the commit hook, based on https://github.com/Cosium/git-code-format-maven-plugin, applied to each java class in the project 2020-04-18 12:42:58 +02:00
Claudio Atzori 038ac7afd7 relation consistency workflow separated from dedup scan and creation of CCs 2020-04-17 13:12:44 +02:00
Claudio Atzori 47f3d9b757 unit test for GraphHiveImporterJob 2020-04-08 13:24:43 +02:00
Claudio Atzori d74e128aa6 Utility classes moved in dhp-common and dhp-schemas 2020-04-07 11:56:22 +02:00
Claudio Atzori 3d1b637cab dataset based provision WIP 2020-04-04 14:03:43 +02:00
Claudio Atzori 377e1ba840 [maven-release-plugin] prepare for next development iteration 2020-03-30 20:06:00 +02:00
Claudio Atzori 76d9315129 [maven-release-plugin] prepare release dhp-1.1.6 2020-03-30 20:05:56 +02:00
Sandro La Bruzzo 0cd022ad6a merge with master 2020-03-26 14:08:29 +01:00
Claudio Atzori 19b2048109 code formatting 2020-03-25 17:40:38 +01:00
Michele Artini ebe45003d9 fixed some junit packages 2020-03-25 16:45:03 +01:00
Sandro La Bruzzo addaaa091f migrate relation from RDD to Dataset 2020-03-13 09:13:20 +01:00
Sandro La Bruzzo b021b8a2e1 Added index wf 2020-02-24 10:15:55 +01:00
Claudio Atzori 33185fd0b7 ISLookupClientFactory moved in dhp-common 2020-02-19 16:56:38 +01:00
Sandro La Bruzzo 2b8675462f refactoring code 2020-02-19 10:07:08 +01:00
Claudio Atzori 56d1810a66 working procedure for records indexing using Spark, via lib com.lucidworks.spark:spark-solr 2020-02-14 12:28:52 +01:00
Claudio Atzori 1ee1baa8c0 Merge branch 'master' into provision_indexing 2020-02-13 18:17:07 +01:00
Claudio Atzori a3d0b57b25 [maven-release-plugin] prepare for next development iteration 2020-02-13 18:11:33 +01:00
Claudio Atzori 6ed9a15bc8 [maven-release-plugin] prepare release dhp-1.1.5 2020-02-13 18:11:31 +01:00
Claudio Atzori 49e648f7c3 bumped version 2020-02-13 18:09:31 +01:00
Claudio Atzori 956da2f923 added Saxon-HE extension functions and Transformer factory class 2020-02-13 16:49:45 +01:00
Sandro La Bruzzo 19a80e4638 implemented workflow for aggregation and generation of infospace graph 2020-01-24 09:58:55 +01:00
Sandro La Bruzzo abd9034da0 implemented DedupRecord factory with the merge of publications 2019-12-11 15:43:24 +01:00
Claudio Atzori 7fe6835b47 [maven-release-plugin] prepare for next development iteration 2019-11-07 17:39:30 +01:00
Claudio Atzori 58918967d9 [maven-release-plugin] prepare release dhp-1.0.4 2019-11-07 17:39:27 +01:00
Claudio Atzori f39148dab8 [maven-release-plugin] prepare for next development iteration 2019-11-04 12:34:48 +01:00
Claudio Atzori 34b0e7b40a [maven-release-plugin] prepare release dhp-1.0.3 2019-11-04 12:34:46 +01:00
Sandro La Bruzzo fd0ad82111 [maven-release-plugin] prepare for next development iteration 2019-10-31 12:08:51 +01:00
Sandro La Bruzzo f224613b40 [maven-release-plugin] prepare release dhp-1.0.2 2019-10-31 12:08:49 +01:00
Sandro La Bruzzo e13c30cc96 [maven-release-plugin] rollback the release of dhp-1.0.2 2019-10-31 12:07:04 +01:00
Sandro La Bruzzo 4da5239203 [maven-release-plugin] prepare release dhp-1.0.2 2019-10-31 12:06:14 +01:00
Sandro La Bruzzo db8b346edd [maven-release-plugin] rollback the release of 1.0.1 2019-10-31 11:49:05 +01:00
Sandro La Bruzzo fc80052173 [maven-release-plugin] prepare for next development iteration 2019-10-31 11:47:42 +01:00
Sandro La Bruzzo 3150c7ce6d [maven-release-plugin] prepare release 1.0.1 2019-10-31 11:47:40 +01:00
Claudio Atzori c8bb81cd9a align dependencies with IIS cluster 2019-10-29 18:10:20 +01:00
Sandro La Bruzzo 5a8a323f2a dhp-collection-worker integrated in dhp-workflows 2019-10-24 11:36:59 +02:00
Sandro La Bruzzo c8e3e4d7c3 Refactoring dependencies versions 2019-10-24 10:20:31 +02:00
Sandro La Bruzzo bbb87d0e3d implemented saxonHE on transformation spark job 2019-10-10 11:33:51 +02:00
Sandro La Bruzzo 4b8c7c279d Added documentation on a class, and reused ArgumetApplicationParser on dhp-aggregation 2019-10-07 17:02:53 +02:00
Sandro La Bruzzo a423a6ebfd Created a generic Argument parser to be used in all modules 2019-10-03 12:22:44 +02:00
Sandro La Bruzzo 53ec9bccca changed the implementation of RabbitMQ Communication 2019-04-16 12:28:01 +02:00
Sandro La Bruzzo 403c13eebf Implemented message manager, Fixed bug on collection worker, implemented Collection and Transform spark job 2019-04-11 15:39:29 +02:00
Sandro La Bruzzo 9294851a6c implemented communication layer using RabbitMQ between oozie node and Dnet 2019-04-05 12:19:25 +02:00
Sandro La Bruzzo 3f4ba71bbd resolved conflicts 2019-04-03 16:12:57 +02:00
Sandro La Bruzzo ded6aef5e1 moved collector worker 2019-04-03 16:05:16 +02:00