Claudio Atzori
abd7ca0c18
Merge branch 'beta' into bulkTagRefactor
2023-05-02 10:50:01 +02:00
Claudio Atzori
de11edca98
Merge branch 'beta' into organizationToRepresentative
2023-05-02 09:59:41 +02:00
Miriam Baglioni
efc4f6a658
[bulkTag] refactor to enrich each result single step
2023-04-18 17:39:31 +02:00
Miriam Baglioni
697a134504
-
2023-04-18 10:21:12 +02:00
Miriam Baglioni
6cc95c96a2
-
2023-04-18 09:53:11 +02:00
Miriam Baglioni
932d07d2dd
[bulkTag] added filtering for datasources in eosctag
2023-04-06 15:08:27 +02:00
Miriam Baglioni
c6a7602b3e
refactoring after compilation
2023-04-06 14:45:01 +02:00
Miriam Baglioni
831055a1fc
change of the property for test purposes, addition of two new verbs, and fix of issue for advanced constraints
2023-04-06 14:41:32 +02:00
Miriam Baglioni
287753417d
better implementation for the fix
2023-04-06 12:22:38 +02:00
Miriam Baglioni
cf3d0f4f83
fixed issue on bulktagging for the advanced constraints
2023-04-06 12:17:35 +02:00
Miriam Baglioni
b42abc9904
fixed issue on bulktagging for the advanced constraints
2023-04-06 12:15:00 +02:00
Miriam Baglioni
ecc05fe0f3
Added the code for the advancedConstraint implementation during the bulkTagging
2023-04-05 16:40:29 +02:00
Miriam Baglioni
b25b401065
added test to verify the advconstraints to dth community. inserted some additional logs.
2023-04-05 12:18:39 +02:00
Claudio Atzori
d05ca53a14
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2023-01-31 14:39:53 +01:00
Miriam Baglioni
e82e009b46
added missing close tag for XML produced by the xquery to get information for the community from the IS
2023-01-31 10:19:34 +01:00
Miriam Baglioni
b254a0375f
[Affiliation from institutionalrepo] changed the field to check to verify the datasource type. Now it is in the field jurisdiction
2023-01-26 16:51:20 +01:00
Claudio Atzori
505867bce9
[bulk tagging] better node naming
2023-01-20 16:13:16 +01:00
Claudio Atzori
1b37516578
[bulk tagging] better node naming
2023-01-20 16:11:26 +01:00
Miriam Baglioni
ecd398fe51
refactoring
2023-01-20 14:23:45 +01:00
Claudio Atzori
3800361033
[country propagation] fixes error 'cannot resolve countrySet given input columns: []' when there is no prepared information driving the propagation process for a given result type
2023-01-19 15:57:43 +01:00
Miriam Baglioni
8893389895
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2022-12-21 12:42:27 +01:00
Claudio Atzori
6aa91204a5
[orcid propagation] skip empty directories
2022-12-20 14:15:46 +01:00
Claudio Atzori
5816ded93f
code formatting
2022-12-20 10:41:40 +01:00
Claudio Atzori
46972f8393
[orcid propagation] skip empty directory
2022-12-20 10:28:22 +01:00
Miriam Baglioni
6674cccb94
[BulkTag] description of parameters more comprehensive for those who do not implement it
2022-12-16 15:33:20 +01:00
Miriam Baglioni
f37113a941
[BulkTag] moving xquery to get community configuration in dedicated file
2022-12-16 15:32:26 +01:00
Miriam Baglioni
3d99b78d94
[Cleaning] fixed error in parameter (workingPath to workingDir)
2022-12-08 10:25:02 +01:00
Claudio Atzori
1b8488976b
code formatting
2022-12-07 10:45:38 +01:00
Claudio Atzori
cd1b58483e
[bulk tag] fixed Community configuration parsing to avoid NPE
2022-12-07 10:39:00 +01:00
Miriam Baglioni
bb0ddc1c44
[BulkTag] adding verb starts_with
2022-11-30 09:56:24 +01:00
Miriam Baglioni
9c70c5dbd6
[Bulk Tag horizontal] added new path in definition of constraint (to recognize fos subjects) - changed test and resource class to test this new aspect
2022-11-28 14:51:20 +01:00
Miriam Baglioni
0628df7a3a
resolving conflicts
2022-11-28 10:44:56 +01:00
Miriam Baglioni
33a2b1b5dc
[Bulk Tag] fixed typo in test configuration
2022-11-23 11:31:17 +01:00
Miriam Baglioni
c6df8327b3
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2022-11-23 11:26:57 +01:00
Miriam Baglioni
0e3edc5018
[Bulk Tag] fixed issue in verb name
2022-11-23 11:26:36 +01:00
Miriam Baglioni
935aa367d8
[BulkTag] removed commented code
2022-11-23 11:16:39 +01:00
Miriam Baglioni
43aedbdfe5
[BulkTag] changed verb name in configuration
2022-11-23 11:14:23 +01:00
Miriam Baglioni
b6da9b67ff
[BulkTag] fixed typo in annotation for verb name
2022-11-23 11:13:58 +01:00
Claudio Atzori
a34c8b6f81
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2022-11-22 10:22:31 +01:00
Miriam Baglioni
122e75aa17
fixed conflicts
2022-11-21 18:13:12 +01:00
Miriam Baglioni
cee7a45b1d
[Bulk Tag Datasource] fixed issue with verb name and add new test for neanias selection for orcid
2022-11-21 18:10:20 +01:00
Claudio Atzori
ed64618235
increased spark.sql.shuffle.partitions in the last join phase of the result (publication) to community through semantic relation propagation
2022-11-18 16:06:51 +01:00
Claudio Atzori
8742934843
added spark.sql.shuffle.partitions in the last join phase of the result to community through semantic relation propagation
2022-11-18 11:32:22 +01:00
Claudio Atzori
13cc592f39
code formatting
2022-11-15 09:37:57 +01:00
Claudio Atzori
af15b1e48d
[eosc tag] extending criteria for Jupyter Notebook (adding to ORP the same constraint)
2022-11-14 18:30:43 +01:00
Miriam Baglioni
5f9383b2d9
[EOSC TAG] remove redundant check for jupyter notebook
2022-11-11 14:06:19 +01:00
Miriam Baglioni
b18bbca8af
[EOSC TAG] adding search in orp for jupyter notebook criteria
2022-11-11 12:42:58 +01:00
Claudio Atzori
bca4a61710
suppressing hyper verbose spark logs during unit test execution
2022-10-19 15:20:58 +02:00
Miriam Baglioni
a653e1b3ea
[Enrichment - result to community through organization] reimplementation of the data preparation step using spark
2022-10-04 15:01:28 +02:00
Miriam Baglioni
f1d7d45cf7
[BulkTag] fixed issue
2022-09-28 12:01:43 +02:00
Miriam Baglioni
3ec044600d
[BulkTag] fixed conflicts
2022-09-28 11:58:28 +02:00
Miriam Baglioni
1cb79719a7
[BulkTag] fixed issues
2022-09-28 11:44:55 +02:00
Claudio Atzori
57dbeb08d2
code formatting
2022-09-27 14:55:10 +02:00
Miriam Baglioni
ca216a92ad
[BulkTagging] changed the query to the IS to insert values for FOS and SDG as subject in the configuration used for the tagging
2022-09-23 17:06:07 +02:00
Miriam Baglioni
3e6b0f58bb
[BulkTagging] changed the query to the IS to get also the information for the advancedConstraint from the profile
2022-09-23 16:47:19 +02:00
Miriam Baglioni
55da4d8715
[BulkTagging] modifying code to represent constraints horizontally on all the results. Added subject to the set of fields used to express the constraint. Modified resources to test the new approach. Modified test class
2022-09-23 16:02:19 +02:00
Miriam Baglioni
960cb861a0
refactoring
2022-09-23 11:14:04 +02:00
Miriam Baglioni
869e129288
[EOSC BulkTag] refactoring
2022-09-20 16:13:18 +02:00
Miriam Baglioni
840465958b
[EOSC BulkTag] filtering out the datasources registered in the eosc with compatibility different from 3.0, 4.0 for literature, data and CRIS to add the context eosc to the results
2022-09-20 10:30:41 +02:00
Miriam Baglioni
1329aa8479
[EOSC BulkTag] modified test to remove association of result to eosc when eoscifguidelines are set
2022-09-19 11:59:48 +02:00
Miriam Baglioni
a0ee1a8640
[EOSC BulkTag] remove addition of eosc context for result with eosc if guidelines set
2022-09-19 11:44:10 +02:00
Miriam Baglioni
5240ac3d7b
[EOSC Tag] remove addition of eosc context for result with eosc if guidelines set
2022-09-19 11:02:18 +02:00
Claudio Atzori
6c0fd9284b
merge from beta
2022-08-05 10:42:53 +02:00
Miriam Baglioni
a7a18d7630
[Graph Dump] removed code for the dump from the project. Fixed issues in tests when possible
2022-08-04 17:40:40 +02:00
Claudio Atzori
27a91841e7
WIP: cleaning of subjects
2022-08-04 11:39:39 +02:00
Claudio Atzori
0727f0ef48
[EOSC tag] avoid NPEs
2022-07-29 11:55:34 +02:00
Miriam Baglioni
3329b6ce6b
[EOSC TAG] added fix for NPE on subjects
2022-07-29 10:54:20 +02:00
Miriam Baglioni
35bcd9422d
[EOSC Context Tagging] removed not needed specification in path
2022-07-25 15:45:22 +02:00
Miriam Baglioni
1c82acb168
[EOSC Context Tagging] refactoring: moved EOSC IF tagging in package eosc under bulkTag
2022-07-25 14:26:39 +02:00
Miriam Baglioni
68cb637832
merge with branch beta
2022-07-25 14:24:25 +02:00
Miriam Baglioni
0172bab251
[EOSC Context Tagging] refactoring
2022-07-25 14:16:45 +02:00
Miriam Baglioni
144c103b67
[EOSC Context Tagging] add check to avoid the insertion of the context if already present
2022-07-25 13:52:45 +02:00
Miriam Baglioni
d091866e48
[EOSC Context Tagging] refactoring
2022-07-25 11:12:22 +02:00
Miriam Baglioni
06a95daf60
[EOSC context TAG] refactoring after compilation
2022-07-22 14:57:06 +02:00
Miriam Baglioni
627332526b
[EOSC context TAG] workflow start from reset_outputpath action
2022-07-22 14:55:11 +02:00
Miriam Baglioni
7a1c1b6f53
[EOSC context TAG] Add test class and resources
2022-07-22 14:36:02 +02:00
Miriam Baglioni
317a4a56ef
[EOSC context TAG] first implementation of the logic to tag results imported from datasources registered in the EOSC
2022-07-21 17:37:48 +02:00
Miriam Baglioni
3be036f290
[EOSC TAG] refactoring after compilation
2022-07-21 14:45:43 +02:00
Miriam Baglioni
56d09e6348
[EOSC TAG] before adding the tag added a step to verify the same tag is not already present
2022-07-21 14:36:48 +02:00
Miriam Baglioni
5143a80232
[EOSC TAG] modification of test class to align with new element
2022-07-21 11:56:51 +02:00
Miriam Baglioni
438abdf96f
[EOSC TAG] adding eosc interoperability guidelines in the specific element in the result. Removed from subjects. Removed also the deletion of EOSC Jupyter Notebook from subject since now the criteria are searched for in a different place
2022-07-20 18:07:54 +02:00
Claudio Atzori
1138b2ac8e
code formatting
2022-07-19 14:15:49 +02:00
Miriam Baglioni
fae681fea1
[Country Propagation] add check to avoid NPE on datasource.getDatasourceType().getClassis()
2022-07-03 17:39:58 +02:00
Claudio Atzori
0cb1c70788
code formatting
2022-07-01 10:44:08 +02:00
Miriam Baglioni
5e0b8f9b5f
[CountryPropagation] refactoring
2022-05-20 09:15:53 +02:00
Miriam Baglioni
c298c148cb
[CountryPropagation] fix NPE issue
2022-05-20 09:11:46 +02:00
Miriam Baglioni
f5207885e3
[EOSCTag] changed code to remove EOSC Jupyter Notebook and modified test to exclude galaxy + software from the tagging for Galaxy
2022-05-17 15:09:22 +02:00
Miriam Baglioni
e4eac1d20b
[EOSC TAG] added code to remove EOSC Jupyter Notebook from subjects and put EOSC as classid in the qualifier
2022-05-13 11:01:33 +02:00
Miriam Baglioni
8a72de4011
[EOSCTag] modified workflow to execute all the steps and not only the last one
2022-05-04 10:10:56 +02:00
Miriam Baglioni
3aeedd931a
[EOSCTag] fixed issue in case description is null. Modified test resources and classes
2022-05-04 10:06:38 +02:00
Miriam Baglioni
a21fe310e5
[EOSCTag] last test and change in the implementation to search in title and description
2022-05-02 17:43:20 +02:00
Miriam Baglioni
e342ec93f0
[EOSCTag] prepared resources for test
2022-04-22 18:35:37 +02:00
Miriam Baglioni
88562c0930
[EOSC TAG] added test for galaxy for title and description criteria
2022-04-22 18:35:03 +02:00
Miriam Baglioni
dfbd2bcbea
[EOSC TAG] added logic in case subject is null
2022-04-22 18:34:03 +02:00
Miriam Baglioni
27c85e901a
[EOSCTag] added resources and finalized test for Jupyter Notebook tagging
2022-04-22 17:38:10 +02:00
Miriam Baglioni
bbb77052d3
[EOSCTag] first test
2022-04-22 11:32:57 +02:00
Miriam Baglioni
7cb7066472
[EoscTag] first "rough" implementation
2022-04-22 10:44:17 +02:00
Miriam Baglioni
6dc68c48e0
[EOSCTag] -
2022-04-21 16:19:04 +02:00
Miriam Baglioni
d012d125d7
[EOSCTag] -
2022-04-21 12:02:09 +02:00
Miriam Baglioni
c5a863132c
[BulkTagging] revert it
2022-04-14 14:14:13 +02:00
Miriam Baglioni
8e8933d41a
[BulkTagging] added fix if result.dataInfo is null
2022-04-14 09:04:24 +02:00
Claudio Atzori
48b580b45c
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
2022-04-11 08:52:36 +02:00
Claudio Atzori
21f32b83c6
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
2022-04-11 08:52:12 +02:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Miriam Baglioni
7b8f85692e
[Enrichment country] fixed issues with parameters and workflow args
2022-03-23 17:20:23 +01:00
Claudio Atzori
f10066547b
increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation
2022-03-23 12:22:26 +01:00
Claudio Atzori
f430029596
cleanup
2022-03-11 14:28:28 +01:00
Miriam Baglioni
12de9acb0d
[Country Propagation] left out from previous commit
2022-03-11 14:17:02 +01:00
Miriam Baglioni
4437f9345d
[Country Propagation] left out from previous commit
2022-03-11 13:57:47 +01:00
Miriam Baglioni
2b643059fa
[Country Propagation] changed the logic to get the collectedfrom at the result level, to fix the issue where no instance is created for a result that should have the country associated. Changed the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself
2022-03-11 13:56:48 +01:00
Miriam Baglioni
f5b0a6f89c
[master to beta] fixed issues in test files
2022-02-25 10:21:57 +01:00
Miriam Baglioni
37784209c9
[dhp-schemas-] updated the version of dhp-schema to 2.10.27 for APC name and id modification
2022-02-02 12:46:31 +01:00
Miriam Baglioni
dce7f5fea8
[BULK TAGGING] changed to fix issue that should have been fixed already
2022-01-31 08:20:28 +01:00
Miriam Baglioni
064f9bbd87
[AFFPropSR] added new parameter for the number of iterations and new code for just one iteration
2022-01-07 18:58:51 +01:00
Sandro La Bruzzo
3920d68992
Fixed workflow generation of delta in datacite
2021-12-21 11:41:49 +01:00
Claudio Atzori
1790fa2d44
Merge branch 'beta' into affiliationPropagation
2021-12-14 15:26:56 +01:00
Miriam Baglioni
2bbece2ca5
merging with branch beta
2021-11-16 16:35:40 +01:00
Sandro La Bruzzo
2d67020c59
added dhp-enrichment maven site template
2021-11-16 16:01:08 +01:00
Miriam Baglioni
28ea532ece
[Affiliation Propagation] moved the selection of graph relation as a preparation step
2021-11-16 15:24:19 +01:00
Miriam Baglioni
7c96e3fd46
removed not useful dir
2021-11-16 13:57:26 +01:00
Miriam Baglioni
c7c0c3187b
[AFFILIATION PROPAGATION] Applied some SonarLint suggestions
2021-11-16 13:56:32 +01:00
Miriam Baglioni
935062edec
[Bypass Action Set] creation of unresolved entities
2021-11-11 16:11:25 +01:00
Miriam Baglioni
c371b23077
-
2021-11-10 17:00:37 +01:00
Miriam Baglioni
9e214ce0eb
[BypassAS] addition of OC relations
2021-11-09 12:07:19 +01:00
Miriam Baglioni
6f7ca539c6
[BypassAS] update of results for bipFinder and FOS
2021-11-09 11:25:41 +01:00
Miriam Baglioni
a7d50c499b
[BypassAS] prepare FOS subject, test and model for FOS and BipFinder scores
2021-11-08 16:44:19 +01:00
Miriam Baglioni
b9d124bb7c
[Enrichment: Propagation through parent-child relationships] Added counters, and changed constraint to verify if filtering out the relation (from classname = harvested to classid != propagation)
2021-11-03 13:55:37 +01:00
Miriam Baglioni
09f36cffb8
[Enrichment: Propagation through parent-child relationships] First implementation, testing, and wf for propagation of result to organization through semantic relation
2021-10-29 11:20:03 +02:00
Miriam Baglioni
d0ef7d91c5
adding test resource
2021-10-26 17:34:11 +02:00
Miriam Baglioni
652114c641
[affiliationPropagation] first try. preparation
2021-10-20 11:44:23 +02:00
Sandro La Bruzzo
5606014b17
code refactor see ticket #7065
2021-10-12 08:11:53 +02:00
Miriam Baglioni
e9ccdf853f
related to D-Net/dnet-hadoop#132
2021-09-15 18:44:54 +02:00
Miriam Baglioni
5f674efb0c
moved dependency version in external pom
2021-08-13 10:07:53 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Claudio Atzori
741077dbca
Merge pull request 'Fix in Affiliation Propagation' ( #113 ) from miriam.baglioni/dnet-hadoop:master into stable_ids
Reviewed-on: D-Net/dnet-hadoop#113
2021-06-09 18:42:42 +02:00
Miriam Baglioni
32b0c27217
Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
fix in SQL query: while writing the blacklist constraint it used d.id to indicate the datasource id, but no alias for the datasource was defined. So I removed the alias
2021-06-09 18:36:11 +02:00
Miriam Baglioni
dc07f1079b
added check in case the author set to be enriched is null
2021-06-08 12:06:10 +02:00
Claudio Atzori
b695932ae4
integrated pull#108
2021-05-20 15:34:04 +02:00
Miriam Baglioni
02b80cf24f
resolved conflicts
2021-05-20 10:59:39 +02:00
Claudio Atzori
23b8883ab1
applied intellij code cleanup
2021-05-14 10:58:12 +02:00