Commit Graph

5482 Commits

Author SHA1 Message Date
Claudio Atzori 01958a3e07 [graph provision] added filter to exclude records marked with datainfo.deletedbyinference = true 2024-07-24 10:00:10 +02:00
Claudio Atzori ceb210993c Merge pull request 'SDG no DOI' (#464) from sdgnodoi into beta
Reviewed-on: #464
2024-07-24 09:59:13 +02:00
Miriam Baglioni 6f1801d7d1 [webcrawl]- 2024-07-23 17:34:48 +02:00
Miriam Baglioni 19806c2ae3 [SDG]fixed switch of methods 2024-07-23 17:12:55 +02:00
Antonis Lempesis d0590e0e49 added latest institutions 2024-07-23 15:17:15 +03:00
Antonis Lempesis 7d2c0a3723 added new institutions 2024-07-23 15:10:17 +03:00
Miriam Baglioni 62649dc5c4 merging with branch beta 2024-07-23 12:50:12 +02:00
Miriam Baglioni 9573bf576d [SDG]added code to ingest also the SDG without DOI 2024-07-23 12:47:57 +02:00
Michele Artini d27e9ea50f added ODF invisible stores in raw_all workflow 2024-07-23 09:56:27 +02:00
Michele De Bonis 4f4c73d65b minor change: addition of missing parameter in sql query 2024-07-22 15:19:02 +02:00
Miriam Baglioni 79985ad197 [Crossref]added mapping for DFG versus the unidentified project [https://support.openaire.eu/issues/9926?next_issue_id=9924&prev_issue_id=9927#note-4] 2024-07-17 18:30:24 +02:00
Claudio Atzori c25b048e12 Merge pull request 'PersonEntity' (#459) from person into beta
Reviewed-on: #459
2024-07-17 12:02:24 +02:00
Claudio Atzori 06e3985b77 merged from beta 2024-07-17 12:01:40 +02:00
Claudio Atzori 83327239de fixed pom definitions, bumped dependency version for the dhp-schema module, removed unnecessary dependencies 2024-07-17 11:58:48 +02:00
Claudio Atzori db9c54c944 Revert "removed legacy actionmanager dependencies"
This reverts commit bb12d0b4df.
2024-07-17 11:27:43 +02:00
Claudio Atzori e39e8bbd47 Merge pull request '[WebCrawlAffiliation]remove from the creation of the action set the relations for pmc and pmid. Only doi are allowed' (#462) from affiliationFromWebCrawlOnlyDOI into beta
Reviewed-on: #462
2024-07-17 11:12:32 +02:00
Claudio Atzori e94ae771ff Merge pull request '[BulkTag]added tagging for the organization relevant for the community.' (#461) from tagOrganization into beta
Reviewed-on: #461
2024-07-17 11:11:52 +02:00
Claudio Atzori 6c98d69215 reverted changed contents under dhp-pace-core 2024-07-17 11:09:37 +02:00
Claudio Atzori 78b5e4bb6f reverted changed contents under dhp-graph-provision 2024-07-17 10:48:20 +02:00
Claudio Atzori 40c5d87645 Merge pull request '[graph provision] entity level contexts' (#460) from entity_contexts into beta
Reviewed-on: #460
2024-07-17 10:43:21 +02:00
Claudio Atzori a65241fcaf Merge pull request 'implementation of the new collector plugin: research_fi' (#456) from research_fi_collector_plugin into beta
Reviewed-on: #456
2024-07-17 10:25:38 +02:00
Claudio Atzori 6665976604 Merge pull request 'Optimizations for the Openorgs Dedup: normalization and inference of strings and implementation of new general-purpose comparators' (#455) from openorgs_optimization into beta
Reviewed-on: #455
2024-07-17 10:25:20 +02:00
Claudio Atzori c99f92efaa Merge pull request '[beta] OpenAIRE Affiliation Inference' (#452) from affRoFromRawString into beta
Reviewed-on: #452
2024-07-17 10:24:39 +02:00
Claudio Atzori f17e1243ba reverted changed contents under dhp-graph-provision 2024-07-17 10:23:50 +02:00
Claudio Atzori 6a19337dab Merge pull request 'removed legacy actionmanager dependencies' (#454) from cleanup_actionmanager_deps into beta
Reviewed-on: #454
2024-07-17 10:20:44 +02:00
Miriam Baglioni d96215cb9b [UnpayWall]added other : in the identifier construction 2024-07-16 18:17:32 +02:00
Miriam Baglioni 9246bdec1c [WebCrawlAffiliation]remove from the creation of the action set the relations for pmc and pmid. Only doi are allowed 2024-07-16 14:07:37 +02:00
Miriam Baglioni 9d27910144 [BulkTag]added tagging for the organization relevant for the community. Added test. Changed the tagging variables. 2024-07-16 13:48:48 +02:00
Claudio Atzori beb93cdfe9 [graph provision] expand the context info for each entity type 2024-07-16 11:43:48 +02:00
Claudio Atzori 5aa7847ea6 consider the transformative agreement text when merging results 2024-07-16 10:38:50 +02:00
Claudio Atzori 38f8ed27fd [graph provision] log the Solr admin application operations for alias deletion and creation 2024-07-15 16:30:43 +02:00
Claudio Atzori 1fb44198fb renamed workflow to better reflect its purpose 2024-07-15 15:24:38 +02:00
Claudio Atzori 6f6e85ddf4 code formatting 2024-07-15 09:32:04 +02:00
Claudio Atzori 7fa3d51200 renamed class, updated criteria to consider the ORCIDs used in the matchers 2024-07-15 09:18:58 +02:00
Michele Artini f99fb21040 tests 2024-07-15 09:18:46 +02:00
Claudio Atzori e17edb2581 [broker] fine tuned the workflow memory settings 2024-07-12 10:27:50 +02:00
Claudio Atzori 61d1fa9b9f [metadata collection] added -Dcom.sun.security.enableAIAcaIssuers=true as a default for metadata collection 2024-07-12 10:26:45 +02:00
Claudio Atzori f9ed2ae33c [metadata collection] added the possibility to specify the JAVA_HOME and the JAVA_OPTS parameters 2024-07-11 15:32:36 +02:00
Michele Artini bbe52584f7 log message 2024-07-11 15:14:34 +02:00
Michele Artini 5cdba9172b implementation of the new collector plugin: research_fi 2024-07-10 14:53:13 +02:00
Michele De Bonis 2a36ccb997 optimization of normalization stage in openorgs workflow, implementation of new comparators replacing older versions, openorgs configuration update, addition of inference flag in model definition, new test classes 2024-07-09 16:58:10 +02:00
Miriam Baglioni c465835061 [Person]new implementation for the extraction of the coAuthorship relations 2024-07-09 12:29:55 +02:00
Miriam Baglioni 814e650e12 [Irish Tender]changed the irish.json file according to comments #26, #29, and #34 for 9635 2024-07-04 12:24:28 +02:00
Miriam Baglioni ddd20e7f8e [Person]first implementation of the action set to include Person entity in the graph starting from the orcid data 2024-07-04 12:08:46 +02:00
Claudio Atzori 1180d78b71 make entity level pids unique by pidType:pidValue 2024-07-04 09:41:12 +02:00
Lampros Smyrnaios e9686365a2 Improve performance of creating the "result_fos" table, by using a temp-table to cache data, which is requested multiple times. 2024-07-03 20:24:36 +03:00
Lampros Smyrnaios ce0aee21cc Improve performance of transferring the stats-DBs to another cluster and querying the DBs' tables, by ordering Spark to create up to 100 files per table, instead of thousands. 2024-07-03 20:15:33 +03:00
Lampros Smyrnaios 7b7dd32ad5 - Fix placement of some "set mapred.job.queue.name=analytics" statements and remove their unused "/*EOS*/" indicator.
- Add stacktrace-info to failed actions.
2024-07-03 19:53:24 +03:00
Lampros Smyrnaios 7ce051d766 - Update the remaining hive-actions to spark-actions.
- Update the version of shell-actions.
- Fix missing "/*EOS*/" indicators.
2024-07-03 19:49:19 +03:00
Lampros Smyrnaios aa4d7d5e20 Prioritize the rest of the stats-queries over other tasks on the cluster, by putting them in the "analytics" queue. 2024-07-03 19:14:25 +03:00