4543 Commits

Author SHA1 Message Date
Giambattista Bloisi 5e15f20e6e Fix entityMerger that was excluding the authors of the first entity in the list to merge 2023-07-21 00:46:54 +02:00
Giambattista Bloisi 0210a14e43 Ignore timestamp differences in PromoteActionPayloadForGraphTableJobTest 2023-07-20 23:45:57 +02:00
Giambattista Bloisi dba34505de Fix SparkStatsTest bug where parquet tables were incorrectly read as text files leading to unpredictable count() values 2023-07-19 14:24:52 +02:00
Giambattista Bloisi e47ed1fdb2 Use DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES in json mapper to avoid that tests fail if they encounter unmapped properties 2023-07-19 14:21:40 +02:00
Giambattista Bloisi 38dfebfbe6 Disable MdStoreClientTest test as it requires a local mongodb running and it does not perform any assertions 2023-07-19 14:18:56 +02:00
Giambattista Bloisi ef493681d9 Merge pull request 'Import dnet-pace-core module in this project and use it after renaming to dhp-pace-core' (#319) from beta_with_pace_core into beta
Reviewed-on: D-Net/dnet-hadoop#319
2023-07-11 14:03:15 +02:00
Giambattista Bloisi 801da2fd4a New sources formatted by maven plugin 2023-07-06 10:28:53 +02:00
Giambattista Bloisi bd3fcf869a rename dnet-pace-core into dhp-pace-core module and use it as dependency in other modules 2023-07-06 10:02:23 +02:00
Giambattista Bloisi 3b35db5fbd Import dnet-pace-core module from dnet-dedup repository 2023-07-05 22:23:06 +02:00
Miriam Baglioni 7738372125 [UsageCount] fixed typo in attribute name for datasource table 2023-06-30 18:56:41 +02:00
Sandro La Bruzzo 9963fd6d29 updated log to add subentity 2023-06-28 13:36:05 +02:00
Sandro La Bruzzo ed7e2ab6d1 reverted mistake on commit workflow.xml 2023-06-28 11:40:19 +02:00
Sandro La Bruzzo 9910ce06ae added to CreateSimRel the feature to write time log 2023-06-28 11:38:16 +02:00
Miriam Baglioni 2717edafb7 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-06-28 11:25:14 +02:00
Miriam Baglioni 2f04c9d149 [BulkTagging] fixing left over for test 2023-06-28 11:24:42 +02:00
Sandro La Bruzzo bd17c3edc8 added to CreateSimRel the feature to write time log 2023-06-28 11:20:58 +02:00
Sandro La Bruzzo b195da3a83 Added utility to write time logs during the deduplication phase 2023-06-28 11:20:09 +02:00
Michele Artini 88a1cbc37d fixed a datasource id 2023-06-22 07:56:33 +02:00
Claudio Atzori b0ebf56367 Merge pull request 'Update step15_5.sql' (#314) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#314
2023-06-21 10:33:22 +02:00
dimitrispie 2b6370eaee Update step15_5.sql
Bug fix
2023-06-21 11:31:10 +03:00
Claudio Atzori 35e42a86ed Merge pull request 'Update step15_5.sql' (#313) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#313
2023-06-21 10:26:16 +02:00
dimitrispie 74cb060bfe Update step15_5.sql
Add "if not exists" clause
2023-06-21 11:24:06 +03:00
Claudio Atzori 85e016df17 Merge pull request 'Update step16-createIndicatorsTables.sql' (#312) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#312
2023-06-21 09:52:33 +02:00
dimitrispie a475cfcb7b Update step16-createIndicatorsTables.sql
Rename a field in indi_pub_interdisciplinarity
2023-06-21 10:42:02 +03:00
Claudio Atzori 979cf9cd87 Merge pull request 'Update step15.sql' (#311) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#311
2023-06-21 09:20:01 +02:00
dimitrispie 4648cd88d4 Update step15.sql
Cast score to double
2023-06-21 10:02:19 +03:00
dimitrispie 94d2573c77 Update step15.sql
Bug Fix
2023-06-21 09:22:39 +03:00
Claudio Atzori 0561362de2 Merge pull request 'Update step20-createMonitorDB_institutions.sql' (#309) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#309
2023-06-20 15:07:09 +02:00
Claudio Atzori 50d7dc0078 [graph enrichment] fixed projectOrganizationPath not being passed to the apply_resulttoorganization_propagation node 2023-06-19 15:42:44 +02:00
Claudio Atzori fbd9bf704e indent 2023-06-19 15:41:22 +02:00
Claudio Atzori 6210f6ee48 Merge pull request 'Precompile blacklists patterns before evaluating clustering criteria' (#1) from optimized-clustering into master
Reviewed-on: D-Net/dnet-dedup#1
2023-06-19 12:43:49 +02:00
dimitrispie be2caedb04 Update step20-createMonitorDB_institutions.sql
Add openorgs____::1624ff7c01bb641b91f4518539a0c28a Vrije Universiteit Amsterdam
2023-06-19 12:12:17 +03:00
dimitrispie 36e0a8fec4 Changes to Promotion Stats WF
1. Add new cluster host at impala-shell commands
2. Add a step for splitting monitor dbs
3. Update workflow.xml to include the new splitting monitor dbs step
2023-06-19 09:44:34 +03:00
Giambattista Bloisi b0ade43608 Precompile blacklists patterns before evaluating clustering criteria
Enable Junit 5 tests in maven builds
Make path comparisons platform-independent
Read String resource files assuming they are encoded in UTF-8
Fix a few test conditions
2023-06-16 09:41:11 +02:00
dimitrispie 4c770a5e29 Update finalizeImpalaCluster.sh
Drop views in shadow dbs before dropping the db
2023-06-15 13:25:37 +03:00
dimitrispie e06d962a6a Update step15.sql 2023-06-15 12:20:35 +03:00
dimitrispie afcad08396 Update step20-createMonitorDB_institutions.sql
Added openorgs____::c0b262bd6eab819e4c994914f9c010e2   -- National Institute of Geophysics and Volcanology
2023-06-15 10:28:49 +03:00
Claudio Atzori b9748763e2 Merge pull request '[stats wf] Bug fixes' (#308) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#308
2023-06-14 21:57:03 +02:00
dimitrispie 42b8ce2ba4 Update copyDataToImpalaCluster.sh 2023-06-14 19:23:42 +03:00
dimitrispie 2032b0df40 Bug fixes
1. Remove tables/views from old databases in the new cluster, before dropping the dbs
2. Fix id in result_accessroute, indi_impact_measures, indi_pub_bronze_oa
2023-06-14 19:09:09 +03:00
Claudio Atzori b76a47b103 [aggregator graph] added column alias when mapping organization PIDs from the OpenOrgs database 2023-06-13 11:38:10 +02:00
Claudio Atzori 744a61a030 depending on dhp-schema:3.17.1 2023-06-12 13:49:44 +02:00
Claudio Atzori 2e4616a251 Merge pull request '[graph cleaning] pid cleaning' (#307) from pid_cleaning into beta
Reviewed-on: D-Net/dnet-hadoop#307
2023-06-12 13:32:29 +02:00
Claudio Atzori d6a8b24711 Merge branch 'beta' into pid_cleaning 2023-06-12 13:32:22 +02:00
Claudio Atzori fdbfb25614 Merge pull request 'update sql query to return distinct pids [beta]' (#306) from distinct_pids_from_openorgs_beta into beta
Reviewed-on: D-Net/dnet-hadoop#306
2023-06-12 09:59:00 +02:00
Claudio Atzori ad04f14b81 Merge branch 'beta' into distinct_pids_from_openorgs_beta 2023-06-12 09:58:21 +02:00
Claudio Atzori a98e6591e2 Merge pull request 'propagation of projects through parent-child relations' (#299) from propagationProjectThroughParentChils into beta
Reviewed-on: D-Net/dnet-hadoop#299
2023-06-12 09:57:20 +02:00
Claudio Atzori 55f002f1e9 Merge branch 'beta' into propagationProjectThroughParentChils 2023-06-12 09:56:53 +02:00
Claudio Atzori daa21ddbb5 Merge pull request '[aggregator graph] validation for URLs from oaf:fulltext' (#298) from fulltext_url_validation into beta
Reviewed-on: D-Net/dnet-hadoop#298
2023-06-12 09:55:35 +02:00
Claudio Atzori 4b00a76271 Merge branch 'beta' into fulltext_url_validation 2023-06-12 09:55:25 +02:00