Miriam Baglioni
3329b6ce6b
[EOSC TAG] added fix for NPE on subjects
2022-07-29 10:54:20 +02:00
Miriam Baglioni
35bcd9422d
[EOSC Context Tagging] removed not needed specification in path
2022-07-25 15:45:22 +02:00
Miriam Baglioni
1c82acb168
[EOSC Context Tagging] refactoring: moved EOSC IF tagging in package eosc under bulkTag
2022-07-25 14:26:39 +02:00
Miriam Baglioni
68cb637832
merge with branch beta
2022-07-25 14:24:25 +02:00
Miriam Baglioni
0172bab251
[EOSC Context Tagging] refactoring
2022-07-25 14:16:45 +02:00
Miriam Baglioni
144c103b67
[EOSC Context Tagging] add check to avoid the insertion of the context if already present
2022-07-25 13:52:45 +02:00
Miriam Baglioni
d091866e48
[EOSC Context Tagging] refactoring
2022-07-25 11:12:22 +02:00
Miriam Baglioni
06a95daf60
[EOSC context TAG] refactoring after compilation
2022-07-22 14:57:06 +02:00
Miriam Baglioni
627332526b
[EOSC context TAG] workflow start from reset_outputpath action
2022-07-22 14:55:11 +02:00
Miriam Baglioni
7a1c1b6f53
[EOSC context TAG] Add test class and resources
2022-07-22 14:36:02 +02:00
Miriam Baglioni
317a4a56ef
[EOSC context TAG] first implementation of the logic to tag results imported from datasources registered in the EOSC
2022-07-21 17:37:48 +02:00
Miriam Baglioni
3be036f290
[EOSC TAG] refactoring after compilation
2022-07-21 14:45:43 +02:00
Miriam Baglioni
56d09e6348
[EOSC TAG] before adding the tag added a step to verify the same tag is not already present
2022-07-21 14:36:48 +02:00
Miriam Baglioni
5143a80232
[EOSC TAG] modification of test class to align with new element
2022-07-21 11:56:51 +02:00
Miriam Baglioni
438abdf96f
[EOSC TAG] adding eosc interoperability guidelines in the specific element in the result. Removed from subjects. Removed also the deletion of EOSC Jupyter Notebook from subject since now the criteria are searched for in a different place
2022-07-20 18:07:54 +02:00
Claudio Atzori
1138b2ac8e
code formatting
2022-07-19 14:15:49 +02:00
Miriam Baglioni
fae681fea1
[Country Propagation] add check to avoid NPE on datasource.getDatasourceType().getClassis()
2022-07-03 17:39:58 +02:00
Claudio Atzori
0cb1c70788
code formatting
2022-07-01 10:44:08 +02:00
Miriam Baglioni
5e0b8f9b5f
[CountryPropagation] refactoring
2022-05-20 09:15:53 +02:00
Miriam Baglioni
c298c148cb
[CountryPropagation] fix NPE issue
2022-05-20 09:11:46 +02:00
Miriam Baglioni
f5207885e3
[EOSCTag] changed code to remove EOSC Jupyter Notebook and modified test to exclude galaxy + software from the tagging for Galaxy
2022-05-17 15:09:22 +02:00
Miriam Baglioni
e4eac1d20b
[EOSC TAG] added code to remove EOSC Jupyter Notebook from subjects and put EOSC as classid in the qualifier
2022-05-13 11:01:33 +02:00
Miriam Baglioni
8a72de4011
[EOSCTag] modified workflow to execute all the steps and not only the last one
2022-05-04 10:10:56 +02:00
Miriam Baglioni
3aeedd931a
[EOSCTag] fixed issue in case description is null. Modified test resources and classes
2022-05-04 10:06:38 +02:00
Miriam Baglioni
a21fe310e5
[EOSCTag] last test and change in the implementation to search in title and description
2022-05-02 17:43:20 +02:00
Miriam Baglioni
e342ec93f0
[EOSCTag] prepared resources for test
2022-04-22 18:35:37 +02:00
Miriam Baglioni
88562c0930
[EOSC TAG] added test for galaxy for title and description criteria
2022-04-22 18:35:03 +02:00
Miriam Baglioni
dfbd2bcbea
[EOSC TAG] added logic in case subject is null
2022-04-22 18:34:03 +02:00
Miriam Baglioni
27c85e901a
[EOSCTag] added resources and finalized test for Jupyter Notebook tagging
2022-04-22 17:38:10 +02:00
Miriam Baglioni
bbb77052d3
[EOSCTag] first test
2022-04-22 11:32:57 +02:00
Miriam Baglioni
7cb7066472
[EoscTag] first "rough" implementation
2022-04-22 10:44:17 +02:00
Miriam Baglioni
6dc68c48e0
[EOSCTag] -
2022-04-21 16:19:04 +02:00
Miriam Baglioni
d012d125d7
[EOSCTag] -
2022-04-21 12:02:09 +02:00
Miriam Baglioni
c5a863132c
[BulkTagging] revert it
2022-04-14 14:14:13 +02:00
Miriam Baglioni
8e8933d41a
[BulkTagging] added fix if result.dataInfo is null
2022-04-14 09:04:24 +02:00
Claudio Atzori
48b580b45c
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
2022-04-11 08:52:36 +02:00
Claudio Atzori
21f32b83c6
[graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob
2022-04-11 08:52:12 +02:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Miriam Baglioni
7b8f85692e
[Enrichment country] fixed issues with parameters and workflow args
2022-03-23 17:20:23 +01:00
Claudio Atzori
f10066547b
increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation
2022-03-23 12:22:26 +01:00
Claudio Atzori
f430029596
cleanup
2022-03-11 14:28:28 +01:00
Miriam Baglioni
12de9acb0d
[Country Propagation] left out from previous commit
2022-03-11 14:17:02 +01:00
Miriam Baglioni
4437f9345d
[Country Propagation] left out from previous commit
2022-03-11 13:57:47 +01:00
Miriam Baglioni
2b643059fa
[Country Propagation] changed the logic to get the collectedfrom at the result level. To fix issue when no instance is created for a result that should have the country associated. Change the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself
2022-03-11 13:56:48 +01:00
Miriam Baglioni
f5b0a6f89c
[master to beta] fixed issues in test files
2022-02-25 10:21:57 +01:00
Miriam Baglioni
37784209c9
[dhp-schemas-] updated the version of dhp-schema to 2.10.27 for APC name and id modification
2022-02-02 12:46:31 +01:00
Miriam Baglioni
dce7f5fea8
[BULK TAGGING] changed to fix issue that should have been fixed already
2022-01-31 08:20:28 +01:00
Miriam Baglioni
064f9bbd87
[AFFPropSR] added new parameter for the number of iterations and new code for just one iteration
2022-01-07 18:58:51 +01:00
Sandro La Bruzzo
3920d68992
Fixed workflow generation of delta in datacite
2021-12-21 11:41:49 +01:00
Claudio Atzori
1790fa2d44
Merge branch 'beta' into affiliationPropagation
2021-12-14 15:26:56 +01:00
Miriam Baglioni
2bbece2ca5
merging with branch beta
2021-11-16 16:35:40 +01:00
Sandro La Bruzzo
2d67020c59
added dhp-enrichment maven site template
2021-11-16 16:01:08 +01:00
Miriam Baglioni
28ea532ece
[Affiliation Propagation] moved the selection of graph relation as a preparation step
2021-11-16 15:24:19 +01:00
Miriam Baglioni
7c96e3fd46
removed not useful dir
2021-11-16 13:57:26 +01:00
Miriam Baglioni
c7c0c3187b
[AFFILIATION PROPAGATION] Applied some SonarLint suggestions
2021-11-16 13:56:32 +01:00
Miriam Baglioni
935062edec
[Bypass Action Set] creation of unresolved entities
2021-11-11 16:11:25 +01:00
Miriam Baglioni
c371b23077
-
2021-11-10 17:00:37 +01:00
Miriam Baglioni
9e214ce0eb
[BypassAS] addition of OC relations
2021-11-09 12:07:19 +01:00
Miriam Baglioni
6f7ca539c6
[BypassAS] update of results for bipFinder and FOS
2021-11-09 11:25:41 +01:00
Miriam Baglioni
a7d50c499b
[BypassAS] prepare FOS subject, test and model for FOS and BipFinder scores
2021-11-08 16:44:19 +01:00
Miriam Baglioni
b9d124bb7c
[Enrichment: Propagation through parent-child relationships] Added counters, and changed constraint to verify if filtering out the relation (from classname = harvested to classid != propagation)
2021-11-03 13:55:37 +01:00
Miriam Baglioni
09f36cffb8
[Enrichment: Propagation through parent-child relationships] First implementation, testing, and wf for propagation of result to organization through semantic relation
2021-10-29 11:20:03 +02:00
Miriam Baglioni
d0ef7d91c5
adding test resource
2021-10-26 17:34:11 +02:00
Miriam Baglioni
652114c641
[affiliationPropagation] first try. preparation
2021-10-20 11:44:23 +02:00
Sandro La Bruzzo
5606014b17
code refactor see ticket #7065
2021-10-12 08:11:53 +02:00
Miriam Baglioni
e9ccdf853f
related to D-Net/dnet-hadoop#132
2021-09-15 18:44:54 +02:00
Miriam Baglioni
5f674efb0c
moved dependency version in external pom
2021-08-13 10:07:53 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Claudio Atzori
741077dbca
Merge pull request 'Fix in Affiliation Propagation' ( #113 ) from miriam.baglioni/dnet-hadoop:master into stable_ids
Reviewed-on: D-Net/dnet-hadoop#113
2021-06-09 18:42:42 +02:00
Miriam Baglioni
32b0c27217
Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
fix in SQL query: while writing the blacklist constraint it used d.id to indicate the datasource id, but no alias for the datasource was defined. So I removed the alias
2021-06-09 18:36:11 +02:00
Miriam Baglioni
dc07f1079b
added check in case the author set to be enriched is null
2021-06-08 12:06:10 +02:00
Claudio Atzori
b695932ae4
integrated pull#108
2021-05-20 15:34:04 +02:00
Miriam Baglioni
02b80cf24f
resolved conflicts
2021-05-20 10:59:39 +02:00
Claudio Atzori
23b8883ab1
applied intellij code cleanup
2021-05-14 10:58:12 +02:00
Claudio Atzori
27ab8a704d
adjusted poms to align with the external dhp-schema module
2021-04-27 10:12:27 +02:00
Claudio Atzori
c2bb03c8b5
depending on external dhp-schemas module
2021-04-23 17:57:35 +02:00
Claudio Atzori
7ed107be53
depending on external dhp-schemas module
2021-04-23 17:52:36 +02:00
Miriam Baglioni
72e5aa3b42
refactoring
2021-04-23 12:10:30 +02:00
Miriam Baglioni
fe36895c53
added datasource blacklist for the organization to result propagation through institutional repositories
2021-01-22 11:55:10 +01:00
Claudio Atzori
4766495f5b
[orcid_to_result_from_semrel_propagation] fixed typo in SQL
2020-12-17 09:15:50 +01:00
Claudio Atzori
7d325e2c57
using actual result subclasses instead of their parent class
2020-12-14 14:40:54 +01:00
Claudio Atzori
152916890f
renamed test name
2020-12-14 14:40:05 +01:00
Miriam Baglioni
4c58bd1c93
merge with upstream
2020-12-03 11:24:00 +01:00
Miriam Baglioni
05c452f58d
merge with upstream
2020-12-03 10:26:45 +01:00
Claudio Atzori
cfb55effd9
code formatting
2020-12-02 11:23:49 +01:00
Claudio Atzori
74242e450e
using constants from ModelConstants
2020-12-02 11:23:35 +01:00
Miriam Baglioni
d5efa6963a
using constants in ModelConstants
2020-12-02 11:20:26 +01:00
Miriam Baglioni
cd285e98bc
using the constants defined in the ModelConstants class
2020-12-02 11:13:23 +01:00
Miriam Baglioni
f8468c9c22
added extension for new author pid (orcid_pending)
2020-12-01 20:09:35 +01:00
Miriam Baglioni
55e24c2547
relclass for relation and corresponding values have been put to lower case (isSupplementedBy written as IsSupplementedBy - orcid propagation)
2020-08-18 16:42:08 +02:00
Miriam Baglioni
bc6b5d5b34
removed leftover parameter
2020-08-15 11:22:35 +02:00
Miriam Baglioni
200cd5c730
removed leftover parameter
2020-08-15 11:22:19 +02:00
Miriam Baglioni
de995970ea
try again to solve clash with master
2020-08-14 15:24:36 +02:00
Miriam Baglioni
5040d72d5e
changed to make it equal to master branch
2020-08-14 15:20:17 +02:00
Miriam Baglioni
be8106c339
added space to avoid conflicts with master branch
2020-08-14 15:16:27 +02:00
Miriam Baglioni
b7e49aee8d
removed commented code
2020-08-13 18:44:07 +02:00
Miriam Baglioni
270c89489c
fixed issue created while renaming subject to subjects in community configuration xml
2020-08-13 15:16:04 +02:00
Miriam Baglioni
c3672b162b
merge branch with master
2020-08-11 17:53:04 +02:00
Miriam Baglioni
a16bbf3202
changed test resource to mirror change in the XQuery that produced data to be parsed. The main Zenodo community is no longer provided in a separate element; it is now part of the <zenodocommunities>
2020-08-11 17:48:44 +02:00
Miriam Baglioni
5b651abf82
merge branch with master
2020-08-04 10:14:07 +02:00
Miriam Baglioni
88e4c3b751
added default trust to context bulktagged
2020-08-04 10:13:25 +02:00
Miriam Baglioni
f9342cb484
added constant
2020-08-03 18:32:35 +02:00
Miriam Baglioni
96c3c891f4
added trust
2020-08-03 18:32:17 +02:00
Miriam Baglioni
53656600ad
changed XQuery to select only community and ri with status not hidden
2020-08-03 18:29:30 +02:00
Miriam Baglioni
40bbe94f7c
merge with master fork
2020-07-20 18:10:03 +02:00
Miriam Baglioni
b904e0699a
-
2020-07-20 18:02:53 +02:00
Miriam Baglioni
d7d84c8217
-
2020-07-17 14:03:23 +02:00
Miriam Baglioni
faea30cda0
-
2020-07-09 14:05:21 +02:00
Miriam Baglioni
4a7de07ea2
refactoring
2020-06-25 16:32:40 +02:00
Miriam Baglioni
54a12978d3
fixed issue in xquery
2020-06-25 16:30:20 +02:00
Miriam Baglioni
507f7a94a8
added one of the main zenodo communities to the tagging conf for testing purposes
2020-06-23 08:45:27 +02:00
Miriam Baglioni
af1d40351b
changed XQuery to add also the main Zenodo community among the communities associated to the openaire community
2020-06-22 19:20:54 +02:00
Claudio Atzori
9cd27183b6
[maven-release-plugin] prepare for next development iteration
2020-06-22 11:27:44 +02:00
Claudio Atzori
1e3dab0631
[maven-release-plugin] prepare release dhp-1.2.3
2020-06-22 11:27:39 +02:00
Claudio Atzori
c4d9f1837f
[maven-release-plugin] prepare for next development iteration
2020-06-12 12:21:08 +02:00
Claudio Atzori
f0746a7605
[maven-release-plugin] prepare release dhp-1.2.2
2020-06-12 12:21:03 +02:00
Claudio Atzori
55595d7235
HACK: patch NULL values with defaults found in result.datainfo.deletedbyinference and result.context
2020-05-26 10:28:35 +02:00
Miriam Baglioni
54d869e618
merge upstream
2020-05-26 09:22:04 +02:00
Miriam Baglioni
eea07f4c42
refactoring
2020-05-26 09:21:49 +02:00
Claudio Atzori
7582532e73
[maven-release-plugin] prepare for next development iteration
2020-05-25 19:48:18 +02:00
Claudio Atzori
01c2e93395
[maven-release-plugin] prepare release dhp-1.2.1
2020-05-25 19:48:14 +02:00
Miriam Baglioni
74215f6d9f
refactoring
2020-05-25 10:38:16 +02:00
Miriam Baglioni
f754c424bd
changed logic to compute only once PacePerson for each Author to be enriched
2020-05-25 10:35:02 +02:00
Miriam Baglioni
8f51af4e9b
added PacePerson to get name surname for authors having only fullname set
2020-05-25 10:34:30 +02:00
Miriam Baglioni
b258f99ece
fix for issue that duplicated result
2020-05-25 10:26:48 +02:00
Miriam Baglioni
0d1ec1913f
added fix to avoid duplication of results
2020-05-22 18:42:25 +02:00
Miriam Baglioni
29066a6b46
applied code cleanup
2020-05-22 15:38:50 +02:00
Miriam Baglioni
8610ad5142
added groupby id to fix multiple result with same id at join step
2020-05-22 15:32:55 +02:00
Miriam Baglioni
4308f31165
added fix to make test run
2020-05-22 13:13:01 +02:00
Miriam Baglioni
b71fbb68b1
removed the removeOutputDir command from code. Relations are written in Append mode. Erasing the output dir meant removing all the relations computed in the previous steps
2020-05-18 13:57:20 +02:00
Claudio Atzori
ef9a9a9f1a
remove the output path when starting
2020-05-15 22:34:19 +02:00
Claudio Atzori
a832658296
code formatting
2020-05-15 10:21:09 +02:00
Miriam Baglioni
f25db01664
changed the constant from PropagationConstants to ModelConstants
2020-05-14 18:29:24 +02:00
Miriam Baglioni
d05630d979
removed the constants added in ModelConstants
2020-05-14 18:22:50 +02:00
Miriam Baglioni
e7eb4f377e
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
2020-05-14 10:34:17 +02:00
Miriam Baglioni
8828458acf
minor changes
2020-05-14 10:34:12 +02:00
Claudio Atzori
ab37953332
added global properties in wf definitions to avoid repeating name-node and job-tracker in the (many) distcp actions; reintroduced output directory removal at the beginning of each spark action
2020-05-14 10:25:41 +02:00
Miriam Baglioni
43f127448d
changed the package name from dhp-propagation to dhp-enrichment for the preparation phase of funding propagation
2020-05-12 18:24:26 +02:00
Claudio Atzori
ec0782e582
renamed jar containing the bulktagging and propagation workflows from dhp-[bulktagging|propagation] to dhp-enrichment; adjusted xml formatting
2020-05-12 15:49:28 +02:00
Miriam Baglioni
14979f299e
changed the configuration factory
2020-05-12 11:28:38 +02:00