Miriam Baglioni
9cbe966b4a
[AffiliationIngestion]refactoring
2024-06-29 18:35:49 +02:00
Miriam Baglioni
236b64d830
[AffiliationIngestion]Extended the ingestion of affiliations from OpenAIRE to also include links derived from Web Crawl. Extended the test. Added to Constants the id and name of the webcrawl datasource, to be used here and also in the ingestion of links from web crawl
2024-06-29 18:29:20 +02:00
Claudio Atzori
14539f9c8b
[graph provision] publicFormat workflow parameter defined as optional
2024-06-28 14:55:18 +02:00
Claudio Atzori
1bc8c5d173
[graph provision] fixed serialization of the instancetypes
2024-06-28 14:54:28 +02:00
Claudio Atzori
1ccf01cdb8
Using the updated Solr JSON payload model classes
2024-06-28 12:38:07 +02:00
Claudio Atzori
b79cb155ba
Merge pull request 'Fix permissions-issue in Stats-workflow, step22a-createPDFsAggregated.' ( #450 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #450
2024-06-26 10:11:34 +02:00
Claudio Atzori
33a02c5b9e
Merge pull request 'Change the selection criteria for the pivot record of a group so that the best pid type becomes the first criterion. This will have the effect of converging to records having a DOI pid' ( #446 ) from pivotselectionbypid into beta
Reviewed-on: #446
2024-06-26 10:10:13 +02:00
Claudio Atzori
1182bca9eb
Merge pull request 'Add support to create/update solr collection aliases' ( #449 ) from 9872-create-solr-collection-aliases into beta
Reviewed-on: #449
2024-06-26 10:09:51 +02:00
Claudio Atzori
1c30eacac2
updated index feeding procedure to exploit the collection aliases
2024-06-25 15:27:38 +02:00
Claudio Atzori
6055212f77
merged from the json_payload branch
2024-06-25 12:39:02 +02:00
Claudio Atzori
0031cf849e
Merge branch 'beta' into 9872-create-solr-collection-aliases
2024-06-25 09:58:01 +02:00
Serafeim Chatzopoulos
9f6e16a03c
Add support to create/update solr collection aliases
2024-06-20 16:03:15 +03:00
Lampros Smyrnaios
66cd28f70a
- Fix not using the "export HADOOP_USER_NAME" statement in "createPDFsAggregated.sh", which caused permission-issues when creating tables with Impala.
- Remove unused "--user" parameter in "impala-shell" calls.
- Code polishing.
2024-06-20 14:33:46 +03:00
Lampros Smyrnaios
c6b1ab2a18
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2024-06-20 14:33:05 +03:00
Miriam Baglioni
d35edac212
[IrishFunderList]changes made according to 9635 comments 20, 21, 22 and 23
2024-06-20 12:28:28 +02:00
Miriam Baglioni
6421f8fece
Merge remote-tracking branch 'origin/beta' into beta
2024-06-19 11:12:15 +02:00
Miriam Baglioni
ac270f795b
[IrishFunderList]changes made according to 9635 comments 14, 15 and 16
2024-06-19 11:11:52 +02:00
Lampros Smyrnaios
236aed8954
Merge remote-tracking branch 'origin/beta' into beta
2024-06-18 17:12:35 +03:00
Claudio Atzori
dd541f8cf5
Merge pull request 'Miscellaneous updates to the copying operation to Impala Cluster.' ( #447 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #447
2024-06-18 15:52:30 +02:00
Lampros Smyrnaios
ff335578ea
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2024-06-18 14:52:31 +03:00
Lampros Smyrnaios
285416c74e
Merge branch 'beta' into beta
2024-06-18 13:50:38 +02:00
Lampros Smyrnaios
3095047e5e
Miscellaneous updates to the copying operation to Impala Cluster:
- Fix not breaking out of the VIEWS-infinite-loop when the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR" is set to "false".
- Exit the script when no HDFS-active-node was found, independently of the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR".
- Fix view_name-recognition in a log-message, by using the more advanced "Perl-Compatible Regular Expressions" in "grep".
- Add error-handling for "compute stats" errors.
2024-06-18 14:40:41 +03:00
Antonis Lempesis
0456f1b788
Merge remote-tracking branch 'origin/beta' into beta
2024-06-14 15:11:30 +03:00
Antonis Lempesis
38636942c7
filtering out deletedbyinference and invisible results from accessroute
2024-06-14 15:11:19 +03:00
Lampros Smyrnaios
d942a1101b
Miscellaneous updates to the copying operation to Impala Cluster:
- Show some counts and the elapsed time for various sub-tasks.
- Code polishing.
2024-06-14 12:14:38 +03:00
Giambattista Bloisi
9bf2bda1c6
Fix: next returned a null value at end of stream
2024-06-12 13:28:51 +02:00
Giambattista Bloisi
d90cb099b8
Fix for paginationStart parameter management
2024-06-11 20:23:44 +02:00
Giambattista Bloisi
4f2a61e10f
Change the selection criteria for the pivot record of a group so that the best pid type becomes the first criterion. This will have the effect of slowly converging to records having a DOI pid
2024-06-11 15:33:56 +02:00
Claudio Atzori
11fe3a4fe0
[graph resolution] use sparkExecutorMemory to define also the memoryOverhead
2024-06-11 14:21:17 +02:00
Claudio Atzori
a8d68c9d29
avoid NPEs
2024-06-11 14:19:24 +02:00
Miriam Baglioni
8fe934810f
Merge remote-tracking branch 'origin/beta' into beta
2024-06-11 10:28:51 +02:00
Miriam Baglioni
9da006e98c
[SDGFoSActionSet]remove datainfo for the result. It is not needed (qualifier.classid = UPDATE) since subjects do not go at the level of the instance
2024-06-11 10:28:32 +02:00
Giambattista Bloisi
85c1eae7e0
Fixes for pagination strategy looping at end of download
2024-06-10 19:03:58 +02:00
Claudio Atzori
b0eba210c0
[actionset promotion] use sparkExecutorMemory to define also the memoryOverhead
2024-06-10 16:15:24 +02:00
Claudio Atzori
3776327a8c
hostedby patching to work with the updated Crossref contents, resolved conflict
2024-06-10 15:24:12 +02:00
Claudio Atzori
0139f23d66
Merge pull request 'organization type from OpenOrgs' ( #445 ) from import_openorg_type into beta
Reviewed-on: #445
2024-06-07 12:17:31 +02:00
Michele Artini
c726572418
changed some parameters in OSF test
2024-06-07 12:03:26 +02:00
Claudio Atzori
ec79405cc9
[graph raw] set organization type from openorgs
2024-06-07 11:30:31 +02:00
Miriam Baglioni
1477406ecc
[bulkTag] fixed issue that made project disappear in graph_10_enriched
2024-06-06 10:45:41 +02:00
Claudio Atzori
92c3abd5a4
[graph cleaning] use sparkExecutorMemory to define also the memoryOverhead
2024-06-06 10:44:33 +02:00
Claudio Atzori
ce2364743a
applying changes from PR#442: Fix for missing collectedfrom after dedup
2024-06-06 10:43:43 +02:00
Claudio Atzori
f70dc76b61
minor
2024-06-06 10:43:10 +02:00
Claudio Atzori
73bd1938a5
[graph2hive] use sparkExecutorMemory to define also the memoryOverhead
2024-06-05 12:17:35 +02:00
Claudio Atzori
da5c1e73a4
Merge pull request 'Irish oaipmh exporter' ( #443 ) from irish-oaipmh-exporter into beta
Reviewed-on: #443
2024-06-05 10:55:09 +02:00
Claudio Atzori
81090ad593
[IE OAIPMH] added oozie workflow, minor changes, code formatting
2024-06-05 10:03:33 +02:00
Claudio Atzori
a02f3f0d2b
code formatting
2024-05-30 10:21:18 +02:00
Alessia Bardi
eadfd8d71d
Merge pull request 'Updated XMLIterator for splitting on different nodes' ( #436 ) from dblp_collection_plugin into beta
Reviewed-on: #436
2024-05-29 16:05:06 +02:00
Alessia Bardi
05ee783c07
Merge branch 'beta' into dblp_collection_plugin
2024-05-29 16:04:39 +02:00
Alessia Bardi
fe9fb59c90
Merge pull request 'Rest collector plugin on hadoop supports a new param to pass request headers' ( #441 ) from rest-collector-request-header-map into beta
Reviewed-on: #441
2024-05-29 15:54:39 +02:00
Claudio Atzori
c272c4ad68
code formatting
2024-05-29 15:50:07 +02:00