Giambattista Bloisi
861c368e65
Code for testing other grouping strategies
2023-07-10 15:52:35 +02:00
Giambattista Bloisi
745e70e0d7
When generating similarities put as 'from' component the one with smaller lexicographic id
2023-07-10 15:45:49 +02:00
Giambattista Bloisi
dcc08cc512
Use UDAF and Aggregation class for testing
2023-07-07 12:35:30 +02:00
Giambattista Bloisi
df19548c56
small changes
2023-07-04 18:36:58 +02:00
Sandro La Bruzzo
890b49fb5d
optimized some dedup functions
2023-06-29 14:08:58 +02:00
Giambattista Bloisi
3129c1c48b
Allow processing of immutable sorted blocks in dedup
2023-06-28 14:01:04 +02:00
Giambattista Bloisi
cb7ad9889c
Fix maven dependencies warning while building
2023-06-28 14:01:04 +02:00
Claudio Atzori
75ff902f9d
WIP: various refactors
2023-06-28 14:00:54 +02:00
Claudio Atzori
326367eccc
WIP: various refactors
2023-06-28 14:00:22 +02:00
Claudio Atzori
521dd7f167
WIP: various refactors
2023-06-28 14:00:18 +02:00
Claudio Atzori
649679de8d
WIP: various refactors
2023-06-28 13:59:11 +02:00
Sandro La Bruzzo
4c2dfcbdf7
Added first implementation using UDF function
2023-06-28 13:58:01 +02:00
Sandro La Bruzzo
9963fd6d29
updated log to add subentity
2023-06-28 13:36:05 +02:00
Sandro La Bruzzo
ed7e2ab6d1
reverted mistake on commit workflow.xml
2023-06-28 11:40:19 +02:00
Sandro La Bruzzo
9910ce06ae
added to CreateSimRel the feature to write time log
2023-06-28 11:38:16 +02:00
Miriam Baglioni
2717edafb7
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2023-06-28 11:25:14 +02:00
Miriam Baglioni
2f04c9d149
[BulkTagging] fixing left over for test
2023-06-28 11:24:42 +02:00
Sandro La Bruzzo
bd17c3edc8
added to CreateSimRel the feature to write time log
2023-06-28 11:20:58 +02:00
Sandro La Bruzzo
b195da3a83
Added utility to write time logs during the deduplication phase
2023-06-28 11:20:09 +02:00
Michele Artini
88a1cbc37d
fixed a datasource id
2023-06-22 07:56:33 +02:00
Claudio Atzori
b0ebf56367
Merge pull request 'Update step15_5.sql' ( #314 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #314
2023-06-21 10:33:22 +02:00
dimitrispie
2b6370eaee
Update step15_5.sql
Bug fix
2023-06-21 11:31:10 +03:00
Claudio Atzori
35e42a86ed
Merge pull request 'Update step15_5.sql' ( #313 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #313
2023-06-21 10:26:16 +02:00
dimitrispie
74cb060bfe
Update step15_5.sql
Add "if not exists" clause
2023-06-21 11:24:06 +03:00
Claudio Atzori
85e016df17
Merge pull request 'Update step16-createIndicatorsTables.sql' ( #312 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #312
2023-06-21 09:52:33 +02:00
dimitrispie
a475cfcb7b
Update step16-createIndicatorsTables.sql
Rename a field in indi_pub_interdisciplinarity
2023-06-21 10:42:02 +03:00
Claudio Atzori
979cf9cd87
Merge pull request 'Update step15.sql' ( #311 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #311
2023-06-21 09:20:01 +02:00
dimitrispie
4648cd88d4
Update step15.sql
Cast score to double
2023-06-21 10:02:19 +03:00
dimitrispie
94d2573c77
Update step15.sql
Bug Fix
2023-06-21 09:22:39 +03:00
Claudio Atzori
0561362de2
Merge pull request 'Update step20-createMonitorDB_institutions.sql' ( #309 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #309
2023-06-20 15:07:09 +02:00
Claudio Atzori
50d7dc0078
[graph enrichment] fixed projectOrganizationPath not being passed to the apply_resulttoorganization_propagation node
2023-06-19 15:42:44 +02:00
Claudio Atzori
fbd9bf704e
indent
2023-06-19 15:41:22 +02:00
dimitrispie
be2caedb04
Update step20-createMonitorDB_institutions.sql
Add openorgs____::1624ff7c01bb641b91f4518539a0c28a Vrije Universiteit Amsterdam
2023-06-19 12:12:17 +03:00
dimitrispie
36e0a8fec4
Changes to Promotion Stats WF
1. Add new cluster host at impala-shell commands
2. Add a step for splitting monitor dbs
3. Update workflow.xml to include the new splitting monitor dbs step
2023-06-19 09:44:34 +03:00
dimitrispie
4c770a5e29
Update finalizeImpalaCluster.sh
Drop views in shadow dbs before dropping the db
2023-06-15 13:25:37 +03:00
dimitrispie
e06d962a6a
Update step15.sql
2023-06-15 12:20:35 +03:00
dimitrispie
afcad08396
Update step20-createMonitorDB_institutions.sql
Added openorgs____::c0b262bd6eab819e4c994914f9c010e2 -- National Institute of Geophysics and Volcanology
2023-06-15 10:28:49 +03:00
Claudio Atzori
b9748763e2
Merge pull request '[stats wf] Bug fixes' ( #308 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #308
2023-06-14 21:57:03 +02:00
dimitrispie
42b8ce2ba4
Update copyDataToImpalaCluster.sh
2023-06-14 19:23:42 +03:00
dimitrispie
2032b0df40
Bug fixes
1. Remove tables/views from old databases in the new cluster, before dropping the dbs
2. Fix id in result_accessroute, indi_impact_measures, indi_pub_bronze_oa
2023-06-14 19:09:09 +03:00
Claudio Atzori
b76a47b103
[aggregator graph] added column alias when mapping organization PIDs from the OpenOrgs database
2023-06-13 11:38:10 +02:00
Claudio Atzori
744a61a030
depending on dhp-schema:3.17.1
2023-06-12 13:49:44 +02:00
Claudio Atzori
2e4616a251
Merge pull request '[graph cleaning] pid cleaning' ( #307 ) from pid_cleaning into beta
Reviewed-on: #307
2023-06-12 13:32:29 +02:00
Claudio Atzori
d6a8b24711
Merge branch 'beta' into pid_cleaning
2023-06-12 13:32:22 +02:00
Claudio Atzori
fdbfb25614
Merge pull request 'update sql query to return distinct pids [beta]' ( #306 ) from distinct_pids_from_openorgs_beta into beta
Reviewed-on: #306
2023-06-12 09:59:00 +02:00
Claudio Atzori
ad04f14b81
Merge branch 'beta' into distinct_pids_from_openorgs_beta
2023-06-12 09:58:21 +02:00
Claudio Atzori
a98e6591e2
Merge pull request 'propagation of projects through parent-child relations' ( #299 ) from propagationProjectThroughParentChils into beta
Reviewed-on: #299
2023-06-12 09:57:20 +02:00
Claudio Atzori
55f002f1e9
Merge branch 'beta' into propagationProjectThroughParentChils
2023-06-12 09:56:53 +02:00
Claudio Atzori
daa21ddbb5
Merge pull request '[aggregator graph] validation for URLs from oaf:fulltext' ( #298 ) from fulltext_url_validation into beta
Reviewed-on: #298
2023-06-12 09:55:35 +02:00
Claudio Atzori
4b00a76271
Merge branch 'beta' into fulltext_url_validation
2023-06-12 09:55:25 +02:00