1
0
Fork 0
Commit Graph

5488 Commits

Author SHA1 Message Date
sab ef6c90cc64 implemented methods to extract fulltext link from an API call 2024-09-11 14:57:38 +02:00
sab df82f8beb9 code adapted as per Michele's recommendations 2024-09-04 15:29:13 +02:00
sab 53787dbf67 code refactored 2024-08-01 09:52:19 +02:00
sab bbb79273a3 conversion to Dublin Core has been implemented 2024-08-01 01:23:04 +02:00
sab 7f39375ba8 data fetcher has been implemented 2024-07-31 18:05:11 +02:00
Claudio Atzori d20a5e020a [graph provision] log the Solr admin application operations for alias deletion and creation 2024-07-15 16:31:04 +02:00
Claudio Atzori 3d1d8e6036 renamed workflow to better reflect its purpose 2024-07-15 15:24:18 +02:00
Claudio Atzori 0b1c58358b Merge pull request '[broker] fixing the mapping of ORCID for the identification of the enrichments' (#458) from broker_orcid into main
Reviewed-on: D-Net/dnet-hadoop#458
2024-07-15 11:34:01 +02:00
Claudio Atzori b70a440aca renamed class, updated criteria to consider the ORCIDs used in the matchers 2024-07-12 17:09:01 +02:00
Michele Artini 36c3df1652 tests 2024-07-12 15:29:45 +02:00
Claudio Atzori 2f13683285 [broker] fine tuned the workflow memory settings 2024-07-12 10:27:24 +02:00
Claudio Atzori 5ab409dcab [metadata collection] added -Dcom.sun.security.enableAIAcaIssuers=true as a default for metadata collection 2024-07-12 10:26:32 +02:00
Claudio Atzori b756cfeb85 Merge pull request 'set JAVA_HOME and JAVA_OPTS in metadata collection' (#457) from metadata_collection_java_upgrade into main
Reviewed-on: D-Net/dnet-hadoop#457
2024-07-11 15:32:11 +02:00
Claudio Atzori 51d6a541bd [metadata collection] added the possibility to specify the JAVA_HOME and the JAVA_OPTS parameters 2024-07-11 15:24:29 +02:00
Claudio Atzori 07ce92cef2 [OAI-PMH] fixed node name 2024-07-11 11:00:23 +02:00
Miriam Baglioni f043b7b096 [Irish Tender]changed the irish.json file according to comments #26, #29, and #34 for 9635 2024-07-04 12:22:56 +02:00
Claudio Atzori 153b56eeff make entity level pids unique by pidType:pidValue 2024-07-04 09:41:39 +02:00
Claudio Atzori ed97ba4565 Merge pull request '[prod] Openaire Affiliation Inference' (#453) from affRoFromRawStringmain into main
Reviewed-on: D-Net/dnet-hadoop#453
2024-07-03 12:32:26 +02:00
Claudio Atzori 7b398a6d0b updated import of organization types from OpenOrgs 2024-07-03 11:11:35 +02:00
Claudio Atzori 13f6506ce5 Change the selection criteria for the pivot record of a group so that by best pid type becomes the first criteria. This will have the effect to slowly converge to records having DOI 2024-07-03 10:44:01 +02:00
Claudio Atzori 3d9ddaa23a importing organization types from OpenOrgs 2024-07-03 10:15:37 +02:00
Claudio Atzori c06dfdfd86 ignore dates containing 'null's 2024-07-02 15:43:11 +02:00
Claudio Atzori b822b34abe code formatting 2024-07-01 09:22:35 +02:00
Michele De Bonis ea1841fbd2 implementation of countryMatch and addition of workflow parameters 2024-07-01 09:14:32 +02:00
Miriam Baglioni 4dbce39237 [AffiliationInference]Extended the affiliation ingestion from OpenAIRE to include also the links derived from web crawl. Changed the provenance from BIP! to OpenAIRE 2024-06-29 18:51:06 +02:00
Miriam Baglioni 3ee8a7d18a [WebCrawl]moved to Constants web crawl name and id 2024-06-29 18:47:23 +02:00
Claudio Atzori ee7deb3f60 [graph provision] publicFormat worfklow parameter defined as optional 2024-06-28 14:52:43 +02:00
Claudio Atzori 157cc8be87 [graph provision] fixed serialization of the instancetypes 2024-06-28 14:21:12 +02:00
Claudio Atzori 023099a921 imported from beta 2024-06-26 11:40:16 +02:00
Claudio Atzori 786c217085 Using the updated Solr JSON payload model classes 2024-06-26 11:11:33 +02:00
Lampros Smyrnaios c858c02111 - Fix not using the "export HADOOP_USER_NAME" statement in "createPDFsAggregated.sh", which caused permission-issues when creating tables with Impala.
- Remove unused "--user" parameter in "impala-shell" calls.
- Code polishing.
2024-06-26 10:11:21 +02:00
Claudio Atzori 8220e27110 Merge pull request 'Align Solr JSON records to the explore portal requirements' (#448) from json_payload into beta_to_master_may2024
Reviewed-on: D-Net/dnet-hadoop#448
2024-06-25 09:57:40 +02:00
Claudio Atzori bc993d49c1 Update pom.xml
depend on released schema version
2024-06-25 09:57:06 +02:00
Claudio Atzori 1dc7458de2 added JSON payload to the SolrInputDocument, updated unit tests 2024-06-24 14:48:09 +02:00
Claudio Atzori a7a54aab47 WIP: align Solr JSON records to the explore portal requirements 2024-06-20 15:48:45 +02:00
Miriam Baglioni eaa00a4199 [IrishFunderList]make changed according to 9635 comment 20, 21, 22 and 23 2024-06-20 12:32:57 +02:00
Claudio Atzori fb731b6d46 WIP: align Solr JSON records to the explore portal requirements 2024-06-19 15:38:43 +02:00
Miriam Baglioni b6da35e736 [IrishFunderList]make changed according to 9635 comment 14, 15 and 16 2024-06-19 11:06:58 +02:00
Lampros Smyrnaios 3c9b8de892 Miscellaneous updates to the copying operation to Impala Cluster:
- Fix not breaking out of the VIEWS-infinite-loop when the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR" is set to "false".
- Exit the script when no HDFS-active-node was found, independently of the "SHOULD_EXIT_WHOLE_SCRIPT_UPON_ERROR".
- Fix view_name-recognition in a log-message, by using the more advanced "Perl-Compatible Regular Expressions" in "grep".
- Add error-handling for "compute stats" errors.
2024-06-18 15:59:34 +02:00
Antonis Lempesis c67ef157d3 filtering out deletedbyinference and invinsible results from accessroute 2024-06-18 15:59:00 +02:00
Lampros Smyrnaios c23f3031ed Miscellaneous updates to the copying operation to Impala Cluster:
- Show some counts and the elapsed time for various sub-tasks.
- Code polishing.
2024-06-18 15:58:46 +02:00
Claudio Atzori 8ec151aa3d [graph indexing] comment out setting the JSON payload from the SolrInputDocuments 2024-06-18 15:53:24 +02:00
Claudio Atzori 2636936162 [IE OAI-PMH] fixed oozie wf definition 2024-06-14 11:47:37 +02:00
Miriam Baglioni ef437a8cdf [Provision]temporarily removed Json paylod from indexed records (Shadow cannot support it) 2024-06-13 16:48:03 +02:00
Miriam Baglioni 86088ef26e Merge remote-tracking branch 'origin/beta_to_master_may2024' into beta_to_master_may2024 2024-06-11 17:04:07 +02:00
Miriam Baglioni 143c525343 [WebCrawl]remove relations for pid not doi 2024-06-11 17:03:59 +02:00
Claudio Atzori c371513d43 [graph resolution] use sparkExecutorMemory to define also the memoryOverhead 2024-06-11 14:21:01 +02:00
Claudio Atzori 71927ca818 avoid NPEs 2024-06-11 12:40:50 +02:00
Giambattista Bloisi 46018dc804 Fix OperationUnsupportedException while merging two Result's contexts due to modification of an immutable collection 2024-06-11 10:39:48 +02:00
Miriam Baglioni 3efd5b1308 [SDGActionSet]remove datainfo for the result. It is not needed (qualifier.classid = UPDATE) useless since subject do not go at the level of the instance 2024-06-11 10:35:57 +02:00