Claudio Atzori
97c9706469
minors
2024-08-02 15:47:56 +02:00
Claudio Atzori
07e7b9315c
code formatting
2024-08-02 14:42:24 +02:00
Alessia
39810c6e7e
Rest collector plugin on hadoop supports a new param to pass request headers
2024-08-02 14:41:43 +02:00
Claudio Atzori
e0f58afd30
[graph provision] include only FoS L1..L2 in the record serialization
2024-08-02 10:58:57 +02:00
Claudio Atzori
60cf7d86a1
[graph provision] include only FoS L1..L2 in the record serialization
2024-08-02 10:58:47 +02:00
Miriam Baglioni
8f11dfe554
[UnpayWall]added othe : in the identifier construction
2024-07-16 18:18:38 +02:00
Claudio Atzori
d20a5e020a
[graph provision] log the Solr admin application operations for alias deletion and creation
2024-07-15 16:31:04 +02:00
Claudio Atzori
3d1d8e6036
renamed workflow to better reflect its purpose
2024-07-15 15:24:18 +02:00
Claudio Atzori
0b1c58358b
Merge pull request '[broker] fixing the mapping of ORCID for the identification of the enrichments' ( #458 ) from broker_orcid into main
...
Reviewed-on: #458
2024-07-15 11:34:01 +02:00
Claudio Atzori
b70a440aca
renamed class, updated criteria to consider the ORCIDs used in the matchers
2024-07-12 17:09:01 +02:00
Michele Artini
36c3df1652
tests
2024-07-12 15:29:45 +02:00
Claudio Atzori
2f13683285
[broker] fine tuned the workflow memory settings
2024-07-12 10:27:24 +02:00
Claudio Atzori
5ab409dcab
[metadata collection] added -Dcom.sun.security.enableAIAcaIssuers=true as a default for metadata collection
2024-07-12 10:26:32 +02:00
Claudio Atzori
b756cfeb85
Merge pull request 'set JAVA_HOME and JAVA_OPTS in metadata collection' ( #457 ) from metadata_collection_java_upgrade into main
...
Reviewed-on: #457
2024-07-11 15:32:11 +02:00
Claudio Atzori
51d6a541bd
[metadata collection] added the possibility to specify the JAVA_HOME and the JAVA_OPTS parameters
2024-07-11 15:24:29 +02:00
Claudio Atzori
07ce92cef2
[OAI-PMH] fixed node name
2024-07-11 11:00:23 +02:00
Miriam Baglioni
f043b7b096
[Irish Tender]changed the irish.json file according to comments #26, #29, and #34 for 9635
2024-07-04 12:22:56 +02:00
Claudio Atzori
153b56eeff
make entity level pids unique by pidType:pidValue
2024-07-04 09:41:39 +02:00
Claudio Atzori
ed97ba4565
Merge pull request '[prod] Openaire Affiliation Inference' ( #453 ) from affRoFromRawStringmain into main
...
Reviewed-on: #453
2024-07-03 12:32:26 +02:00
Claudio Atzori
7b398a6d0b
updated import of organization types from OpenOrgs
2024-07-03 11:11:35 +02:00
Claudio Atzori
13f6506ce5
Change the selection criteria for the pivot record of a group so that the best pid type becomes the first criterion. As a result, groups will slowly converge to pivot records that have a DOI
2024-07-03 10:44:01 +02:00
Claudio Atzori
3d9ddaa23a
importing organization types from OpenOrgs
2024-07-03 10:15:37 +02:00
Claudio Atzori
c06dfdfd86
ignore dates containing 'null's
2024-07-02 15:43:11 +02:00
Claudio Atzori
b822b34abe
code formatting
2024-07-01 09:22:35 +02:00
Michele De Bonis
ea1841fbd2
implementation of countryMatch and addition of workflow parameters
2024-07-01 09:14:32 +02:00
Miriam Baglioni
4dbce39237
[AffiliationInference]Extended the affiliation ingestion from OpenAIRE to include also the links derived from web crawl. Changed the provenance from BIP! to OpenAIRE
2024-06-29 18:51:06 +02:00
Miriam Baglioni
3ee8a7d18a
[WebCrawl]moved to Constants web crawl name and id
2024-06-29 18:47:23 +02:00
Claudio Atzori
ee7deb3f60
[graph provision] publicFormat workflow parameter defined as optional
2024-06-28 14:52:43 +02:00
Claudio Atzori
157cc8be87
[graph provision] fixed serialization of the instancetypes
2024-06-28 14:21:12 +02:00
Claudio Atzori
023099a921
imported from beta
2024-06-26 11:40:16 +02:00
Claudio Atzori
786c217085
Using the updated Solr JSON payload model classes
2024-06-26 11:11:33 +02:00
Lampros Smyrnaios
c858c02111
- Fix not using the "export HADOOP_USER_NAME" statement in "createPDFsAggregated.sh", which caused permission-issues when creating tables with Impala.
...
- Remove unused "--user" parameter in "impala-shell" calls.
- Code polishing.
2024-06-26 10:11:21 +02:00
Claudio Atzori
8220e27110
Merge pull request 'Align Solr JSON records to the explore portal requirements' ( #448 ) from json_payload into beta_to_master_may2024
...
Reviewed-on: #448
2024-06-25 09:57:40 +02:00
Claudio Atzori
bc993d49c1
Update pom.xml
...
depend on released schema version
2024-06-25 09:57:06 +02:00
Claudio Atzori
1dc7458de2
added JSON payload to the SolrInputDocument, updated unit tests
2024-06-24 14:48:09 +02:00
Claudio Atzori
a7a54aab47
WIP: align Solr JSON records to the explore portal requirements
2024-06-20 15:48:45 +02:00
Miriam Baglioni
eaa00a4199
[IrishFunderList]made changes according to 9635 comments 20, 21, 22 and 23
2024-06-20 12:32:57 +02:00
Claudio Atzori
fb731b6d46
WIP: align Solr JSON records to the explore portal requirements
2024-06-19 15:38:43 +02:00
Miriam Baglioni
b6da35e736
[IrishFunderList]made changes according to 9635 comments 14, 15 and 16
2024-06-19 11:06:58 +02:00
Lampros Smyrnaios
3c9b8de892
Miscellaneous updates to the copying operation to Impala Cluster:
...
- Fix not breaking out of the VIEWS-infinite-loop when the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR" is set to "false".
- Exit the script when no HDFS-active-node was found, independently of the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR".
- Fix view_name-recognition in a log-message, by using the more advanced "Perl-Compatible Regular Expressions" in "grep".
- Add error-handling for "compute stats" errors.
2024-06-18 15:59:34 +02:00
Antonis Lempesis
c67ef157d3
filtering out deletedbyinference and invisible results from accessroute
2024-06-18 15:59:00 +02:00
Lampros Smyrnaios
c23f3031ed
Miscellaneous updates to the copying operation to Impala Cluster:
...
- Show some counts and the elapsed time for various sub-tasks.
- Code polishing.
2024-06-18 15:58:46 +02:00
Claudio Atzori
8ec151aa3d
[graph indexing] comment out setting the JSON payload from the SolrInputDocuments
2024-06-18 15:53:24 +02:00
Claudio Atzori
2636936162
[IE OAI-PMH] fixed oozie wf definition
2024-06-14 11:47:37 +02:00
Miriam Baglioni
ef437a8cdf
[Provision]temporarily removed JSON payload from indexed records (Shadow cannot support it)
2024-06-13 16:48:03 +02:00
Miriam Baglioni
86088ef26e
Merge remote-tracking branch 'origin/beta_to_master_may2024' into beta_to_master_may2024
2024-06-11 17:04:07 +02:00
Miriam Baglioni
143c525343
[WebCrawl]remove relations for pids that are not DOIs
2024-06-11 17:03:59 +02:00
Claudio Atzori
c371513d43
[graph resolution] use sparkExecutorMemory to define also the memoryOverhead
2024-06-11 14:21:01 +02:00
Claudio Atzori
71927ca818
avoid NPEs
2024-06-11 12:40:50 +02:00
Giambattista Bloisi
46018dc804
Fix UnsupportedOperationException while merging the contexts of two Results, caused by the modification of an immutable collection
2024-06-11 10:39:48 +02:00