Lampros Smyrnaios
66cd28f70a
- Fix not using the "export HADOOP_USER_NAME" statement in "createPDFsAggregated.sh", which caused permission-issues when creating tables with Impala.
- Remove unused "--user" parameter in "impala-shell" calls.
- Code polishing.
2024-06-20 14:33:46 +03:00
Lampros Smyrnaios
c6b1ab2a18
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2024-06-20 14:33:05 +03:00
Miriam Baglioni
d35edac212
[IrishFunderList] make changes according to 9635 comments 20, 21, 22 and 23
2024-06-20 12:28:28 +02:00
Miriam Baglioni
6421f8fece
Merge remote-tracking branch 'origin/beta' into beta
2024-06-19 11:12:15 +02:00
Miriam Baglioni
ac270f795b
[IrishFunderList] make changes according to 9635 comments 14, 15 and 16
2024-06-19 11:11:52 +02:00
Lampros Smyrnaios
236aed8954
Merge remote-tracking branch 'origin/beta' into beta
2024-06-18 17:12:35 +03:00
Claudio Atzori
dd541f8cf5
Merge pull request 'Miscellaneous updates to the copying operation to Impala Cluster.' ( #447 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#447
2024-06-18 15:52:30 +02:00
Lampros Smyrnaios
ff335578ea
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2024-06-18 14:52:31 +03:00
Lampros Smyrnaios
285416c74e
Merge branch 'beta' into beta
2024-06-18 13:50:38 +02:00
Lampros Smyrnaios
3095047e5e
Miscellaneous updates to the copying operation to Impala Cluster:
- Fix not breaking out of the VIEWS-infinite-loop when the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR" is set to "false".
- Exit the script when no HDFS-active-node was found, independently of the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR".
- Fix view_name-recognition in a log-message, by using the more advanced "Perl-Compatible Regular Expressions" in "grep".
- Add error-handling for "compute stats" errors.
2024-06-18 14:40:41 +03:00
Antonis Lempesis
0456f1b788
Merge remote-tracking branch 'origin/beta' into beta
2024-06-14 15:11:30 +03:00
Antonis Lempesis
38636942c7
filtering out deletedbyinference and invisible results from accessroute
2024-06-14 15:11:19 +03:00
Lampros Smyrnaios
d942a1101b
Miscellaneous updates to the copying operation to Impala Cluster:
- Show some counts and the elapsed time for various sub-tasks.
- Code polishing.
2024-06-14 12:14:38 +03:00
Giambattista Bloisi
9bf2bda1c6
Fix: next returned a null value at end of stream
2024-06-12 13:28:51 +02:00
Giambattista Bloisi
d90cb099b8
Fix for paginationStart parameter management
2024-06-11 20:23:44 +02:00
Claudio Atzori
11fe3a4fe0
[graph resolution] use sparkExecutorMemory to define also the memoryOverhead
2024-06-11 14:21:17 +02:00
Claudio Atzori
a8d68c9d29
avoid NPEs
2024-06-11 14:19:24 +02:00
Miriam Baglioni
8fe934810f
Merge remote-tracking branch 'origin/beta' into beta
2024-06-11 10:28:51 +02:00
Miriam Baglioni
9da006e98c
[SDGFoSActionSet] remove datainfo for the result. It is not needed (qualifier.classid = UPDATE), since subjects do not go at the level of the instance
2024-06-11 10:28:32 +02:00
Giambattista Bloisi
85c1eae7e0
Fixes for pagination strategy looping at end of download
2024-06-10 19:03:58 +02:00
Claudio Atzori
b0eba210c0
[actionset promotion] use sparkExecutorMemory to define also the memoryOverhead
2024-06-10 16:15:24 +02:00
Claudio Atzori
3776327a8c
hostedby patching to work with the updated Crossref contents, resolved conflict
2024-06-10 15:24:12 +02:00
Claudio Atzori
0139f23d66
Merge pull request 'organization type from OpenOrgs' ( #445 ) from import_openorg_type into beta
Reviewed-on: D-Net/dnet-hadoop#445
2024-06-07 12:17:31 +02:00
Michele Artini
c726572418
changed some parameters in OSF test
2024-06-07 12:03:26 +02:00
Claudio Atzori
ec79405cc9
[graph raw] set organization type from openorgs
2024-06-07 11:30:31 +02:00
Miriam Baglioni
1477406ecc
[bulkTag] fixed issue that made project disappear in graph_10_enriched
2024-06-06 10:45:41 +02:00
Claudio Atzori
92c3abd5a4
[graph cleaning] use sparkExecutorMemory to define also the memoryOverhead
2024-06-06 10:44:33 +02:00
Claudio Atzori
ce2364743a
applying changes from PR#442: Fix for missing collectedfrom after dedup
2024-06-06 10:43:43 +02:00
Claudio Atzori
f70dc76b61
minor
2024-06-06 10:43:10 +02:00
Claudio Atzori
73bd1938a5
[graph2hive] use sparkExecutorMemory to define also the memoryOverhead
2024-06-05 12:17:35 +02:00
Claudio Atzori
da5c1e73a4
Merge pull request 'Irish oaipmh exporter' ( #443 ) from irish-oaipmh-exporter into beta
Reviewed-on: D-Net/dnet-hadoop#443
2024-06-05 10:55:09 +02:00
Claudio Atzori
81090ad593
[IE OAIPMH] added oozie workflow, minor changes, code formatting
2024-06-05 10:03:33 +02:00
Claudio Atzori
a02f3f0d2b
code formatting
2024-05-30 10:21:18 +02:00
Alessia Bardi
eadfd8d71d
Merge pull request 'Updated XMLIterator for splitting on different nodes' ( #436 ) from dblp_collection_plugin into beta
Reviewed-on: D-Net/dnet-hadoop#436
2024-05-29 16:05:06 +02:00
Alessia Bardi
05ee783c07
Merge branch 'beta' into dblp_collection_plugin
2024-05-29 16:04:39 +02:00
Alessia Bardi
fe9fb59c90
Merge pull request 'Rest collector plugin on hadoop supports a new param to pass request headers' ( #441 ) from rest-collector-request-header-map into beta
Reviewed-on: D-Net/dnet-hadoop#441
2024-05-29 15:54:39 +02:00
Claudio Atzori
c272c4ad68
code formatting
2024-05-29 15:50:07 +02:00
Alessia Bardi
c5f4da16a4
Merge branch 'beta' into rest-collector-request-header-map
2024-05-29 15:46:23 +02:00
Alessia
1b165a14a0
Rest collector plugin on hadoop supports a new param to pass request headers
2024-05-29 15:41:36 +02:00
Michele Artini
e996787be2
OSF test
2024-05-29 15:05:17 +02:00
Claudio Atzori
62716141c5
Merge pull request 'Miscellaneous updates to the copying operation to Impala Cluster' ( #440 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#440
2024-05-29 14:34:51 +02:00
Miriam Baglioni
5d85b70e1f
[NOAMI] removed Ireland funder id 501100011103. ticket 9635
2024-05-29 11:55:00 +02:00
Lampros Smyrnaios
e3f28338c1
Miscellaneous updates to the copying operation to Impala Cluster:
- Assign the WRITE and EXECUTE permissions to the DBs' HDFS-directories, in order to be able to create tables on top of them, in the Impala Cluster.
- Make sure the "copydb" function returns early, when it encounters a fatal error, while respecting the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR" config.
2024-05-28 17:51:45 +03:00
Giambattista Bloisi
73316d8c83
Add jaxb and jaxws dependencies when compiling with spark-34 profile as they are required to run with jdk > 8
2024-05-28 14:14:51 +02:00
Miriam Baglioni
75d5ddb999
Update to include a blackList that filters out the results we know are wrongly associated to IE - update workflow definition - the blacklist parameter
2024-05-27 12:01:28 +02:00
Miriam Baglioni
87c9c61b41
Update to include a blackList that filters out the results we know are wrongly associated to IE - refactoring
2024-05-27 12:01:16 +02:00
Miriam Baglioni
b55fed09f8
Update to include a blackList that filters out the results we know are wrongly associated to IE
2024-05-27 12:01:01 +02:00
Claudio Atzori
107d958b89
[org dedup] avoid NPEs in SparkPrepareNewOrgs
2024-05-27 11:59:54 +02:00
Claudio Atzori
3a7a6ecc32
[org dedup] avoid NPEs in SparkPrepareOrgRels
2024-05-27 11:59:45 +02:00
Claudio Atzori
1af4224d3d
[org dedup] avoid NPEs in SparkPrepareOrgRels
2024-05-27 11:59:33 +02:00