Miriam Baglioni
7b8f85692e
[Enrichment country] fixed issues with parameters and workflow args
2022-03-23 17:20:23 +01:00
Claudio Atzori
f10066547b
increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation
2022-03-23 12:22:26 +01:00
Claudio Atzori
f430029596
cleanup
2022-03-11 14:28:28 +01:00
Miriam Baglioni
12de9acb0d
[Country Propagation] left out from previous commit
2022-03-11 14:17:02 +01:00
Miriam Baglioni
4437f9345d
[Country Propagation] left out from previous commit
2022-03-11 13:57:47 +01:00
Miriam Baglioni
2b643059fa
[Country Propagation] changed the logic to get the collectedfrom at the result level, to fix the issue where no instance is created for a result that should have the country associated. Changed the code to use Spark instead of Hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself
2022-03-11 13:56:48 +01:00
Miriam Baglioni
f5b0a6f89c
[master to beta] fixed issues in test files
2022-02-25 10:21:57 +01:00
Miriam Baglioni
37784209c9
[dhp-schemas-] updated the version of dhp-schema to 2.10.27 for APC name and id modification
2022-02-02 12:46:31 +01:00
Miriam Baglioni
dce7f5fea8
[BULK TAGGING] changed to fix issue that should have been fixed already
2022-01-31 08:20:28 +01:00
Miriam Baglioni
064f9bbd87
[AFFPropSR] added new parameter for the number of iterations and new code for just one iteration
2022-01-07 18:58:51 +01:00
Sandro La Bruzzo
3920d68992
Fixed workflow generation of delta in datacite
2021-12-21 11:41:49 +01:00
Claudio Atzori
1790fa2d44
Merge branch 'beta' into affiliationPropagation
2021-12-14 15:26:56 +01:00
Miriam Baglioni
2bbece2ca5
merging with branch beta
2021-11-16 16:35:40 +01:00
Sandro La Bruzzo
2d67020c59
added dhp-enrichment maven site template
2021-11-16 16:01:08 +01:00
Miriam Baglioni
28ea532ece
[Affiliation Propagation] moved the selection of graph relation as a preparation step
2021-11-16 15:24:19 +01:00
Miriam Baglioni
c7c0c3187b
[AFFILIATION PROPAGATION] Applied some SonarLint suggestions
2021-11-16 13:56:32 +01:00
Miriam Baglioni
935062edec
[Bypass Action Set] creation of unresolved entities
2021-11-11 16:11:25 +01:00
Miriam Baglioni
c371b23077
-
2021-11-10 17:00:37 +01:00
Miriam Baglioni
9e214ce0eb
[BypassAS] addition of OC relations
2021-11-09 12:07:19 +01:00
Miriam Baglioni
6f7ca539c6
[BypassAS] update of results for bipFinder and FOS
2021-11-09 11:25:41 +01:00
Miriam Baglioni
a7d50c499b
[BypassAS] prepare FOS subject, test and model for FOS and BipFinder scores
2021-11-08 16:44:19 +01:00
Miriam Baglioni
b9d124bb7c
[Enrichment: Propagation through parent-child relationships] Added counters, and changed constraint to verify if filtering out the relation (from classname = harvested to classid != propagation)
2021-11-03 13:55:37 +01:00
Miriam Baglioni
09f36cffb8
[Enrichment: Propagation through parent-child relationships] First implementation, testing, and wf for propagation of result to organization through semantic relation
2021-10-29 11:20:03 +02:00
Miriam Baglioni
d0ef7d91c5
adding test resource
2021-10-26 17:34:11 +02:00
Miriam Baglioni
652114c641
[affiliationPropagation] first try. preparation
2021-10-20 11:44:23 +02:00
Sandro La Bruzzo
5606014b17
code refactor see ticket #7065
2021-10-12 08:11:53 +02:00
Miriam Baglioni
e9ccdf853f
related to #132
2021-09-15 18:44:54 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Claudio Atzori
741077dbca
Merge pull request 'Fix in Affiliation Propagation' ( #113 ) from miriam.baglioni/dnet-hadoop:master into stable_ids
Reviewed-on: #113
2021-06-09 18:42:42 +02:00
Miriam Baglioni
32b0c27217
Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
fix in SQL query: while writing the blacklist constraint it used d.id to indicate the datasource id, but no alias for the datasource was defined. So I removed the alias
2021-06-09 18:36:11 +02:00
Miriam Baglioni
dc07f1079b
added check in case the author set to be enriched is null
2021-06-08 12:06:10 +02:00
Claudio Atzori
b695932ae4
integrated pull#108
2021-05-20 15:34:04 +02:00
Claudio Atzori
23b8883ab1
applied intellij code cleanup
2021-05-14 10:58:12 +02:00
Miriam Baglioni
72e5aa3b42
refactoring
2021-04-23 12:10:30 +02:00
Miriam Baglioni
fe36895c53
added datasource blacklist for the organization to result propagation through institutional repositories
2021-01-22 11:55:10 +01:00
Claudio Atzori
4766495f5b
[orcid_to_result_from_semrel_propagation] fixed typo in SQL
2020-12-17 09:15:50 +01:00
Claudio Atzori
7d325e2c57
using actual result subclasses instead of their parent class
2020-12-14 14:40:54 +01:00
Claudio Atzori
152916890f
renamed test name
2020-12-14 14:40:05 +01:00
Miriam Baglioni
4c58bd1c93
merge with upstream
2020-12-03 11:24:00 +01:00
Miriam Baglioni
05c452f58d
merge with upstream
2020-12-03 10:26:45 +01:00
Claudio Atzori
cfb55effd9
code formatting
2020-12-02 11:23:49 +01:00
Claudio Atzori
74242e450e
using constants from ModelConstants
2020-12-02 11:23:35 +01:00
Miriam Baglioni
d5efa6963a
using constants in ModelConstants
2020-12-02 11:20:26 +01:00
Miriam Baglioni
cd285e98bc
using the constants defined in the ModelConstants class
2020-12-02 11:13:23 +01:00
Miriam Baglioni
f8468c9c22
added extension for new author pid (orcid_pending)
2020-12-01 20:09:35 +01:00
Miriam Baglioni
55e24c2547
relclass for relation and corresponding values have been put to lower case (isSupplementedBy written as IsSupplementedBy - orcid propagation)
2020-08-18 16:42:08 +02:00
Miriam Baglioni
bc6b5d5b34
removed leftover parameter
2020-08-15 11:22:35 +02:00
Miriam Baglioni
200cd5c730
removed leftover parameter
2020-08-15 11:22:19 +02:00
Miriam Baglioni
de995970ea
try again to solve clash with master
2020-08-14 15:24:36 +02:00
Miriam Baglioni
5040d72d5e
changed to make it equal to master branch
2020-08-14 15:20:17 +02:00
Miriam Baglioni
be8106c339
added space to avoid conflicts with master branch
2020-08-14 15:16:27 +02:00
Miriam Baglioni
b7e49aee8d
removed commented code
2020-08-13 18:44:07 +02:00
Miriam Baglioni
270c89489c
fixed issue created while renaming subject to subjects in community configuration xml
2020-08-13 15:16:04 +02:00
Miriam Baglioni
c3672b162b
merge branch with master
2020-08-11 17:53:04 +02:00
Miriam Baglioni
a16bbf3202
changed test resource to mirror change in the XQuery that produced data to be parsed. The main Zenodo community is no longer provided in a separate element; it is now part of the <zenodocommunities>
2020-08-11 17:48:44 +02:00
Miriam Baglioni
5b651abf82
merge branch with master
2020-08-04 10:14:07 +02:00
Miriam Baglioni
88e4c3b751
added default trust to context bulktagged
2020-08-04 10:13:25 +02:00
Miriam Baglioni
f9342cb484
added constant
2020-08-03 18:32:35 +02:00
Miriam Baglioni
96c3c891f4
added trust
2020-08-03 18:32:17 +02:00
Miriam Baglioni
53656600ad
changed XQuery to select only community and ri with status not hidden
2020-08-03 18:29:30 +02:00
Miriam Baglioni
40bbe94f7c
merge with master fork
2020-07-20 18:10:03 +02:00
Miriam Baglioni
b904e0699a
-
2020-07-20 18:02:53 +02:00
Miriam Baglioni
d7d84c8217
-
2020-07-17 14:03:23 +02:00
Miriam Baglioni
faea30cda0
-
2020-07-09 14:05:21 +02:00
Miriam Baglioni
4a7de07ea2
refactoring
2020-06-25 16:32:40 +02:00
Miriam Baglioni
54a12978d3
fixed issue in xquery
2020-06-25 16:30:20 +02:00
Miriam Baglioni
507f7a94a8
added one of the main zenodo communities to the tagging conf for testing purposes
2020-06-23 08:45:27 +02:00
Miriam Baglioni
af1d40351b
changed XQuery to add also the main Zenodo community among the communities associated to the openaire community
2020-06-22 19:20:54 +02:00
Claudio Atzori
55595d7235
HACK: patch NULL values with defaults found in result.datainfo.deletedbyinference and result.context
2020-05-26 10:28:35 +02:00
Miriam Baglioni
eea07f4c42
refactoring
2020-05-26 09:21:49 +02:00
Miriam Baglioni
74215f6d9f
refactoring
2020-05-25 10:38:16 +02:00
Miriam Baglioni
f754c424bd
changed logic to compute PacePerson only once for each Author to be enriched
2020-05-25 10:35:02 +02:00
Miriam Baglioni
8f51af4e9b
added PacePerson to get name surname for authors having only fullname set
2020-05-25 10:34:30 +02:00
Miriam Baglioni
b258f99ece
fix for issue that duplicated result
2020-05-25 10:26:48 +02:00
Miriam Baglioni
0d1ec1913f
added fix to avoid duplication of results
2020-05-22 18:42:25 +02:00
Miriam Baglioni
29066a6b46
applied code cleanup
2020-05-22 15:38:50 +02:00
Miriam Baglioni
8610ad5142
added groupby id to fix multiple results with same id at join step
2020-05-22 15:32:55 +02:00
Miriam Baglioni
4308f31165
added fix to make test run
2020-05-22 13:13:01 +02:00
Miriam Baglioni
b71fbb68b1
removed the removeOutputDir command from code. Relations are written in Append mode. Erasing the output dir meant removing all the relations computed in the previous steps
2020-05-18 13:57:20 +02:00
Claudio Atzori
ef9a9a9f1a
remove the output path when starting
2020-05-15 22:34:19 +02:00
Claudio Atzori
a832658296
code formatting
2020-05-15 10:21:09 +02:00
Miriam Baglioni
f25db01664
changed the constant from propagationconstants to modelconstants
2020-05-14 18:29:24 +02:00
Miriam Baglioni
d05630d979
removed the constants added in ModelConstants
2020-05-14 18:22:50 +02:00
Miriam Baglioni
e7eb4f377e
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
2020-05-14 10:34:17 +02:00
Miriam Baglioni
8828458acf
minor changes
2020-05-14 10:34:12 +02:00
Claudio Atzori
ab37953332
added global properties in wf definitions to avoid repeating name-node and job-tracker in the (many) distcp actions; reintroduced output directory removal at the beginning of each spark action
2020-05-14 10:25:41 +02:00
Miriam Baglioni
43f127448d
changed the package name from dhp-propagation to dhp-enrichment for the preparation phase of funding propagation
2020-05-12 18:24:26 +02:00
Claudio Atzori
ec0782e582
renamed jar containing the bulktagging and propagation workflows from dhp-[bulktagging|propagation] to dhp-enrichment; adjusted xml formatting
2020-05-12 15:49:28 +02:00
Miriam Baglioni
14979f299e
changed the configuration factory
2020-05-12 11:28:38 +02:00
Miriam Baglioni
f8aef6161a
minor modification
2020-05-12 11:28:07 +02:00
Miriam Baglioni
7387f3449a
changed the route to find the verb resolver classes
2020-05-12 11:27:38 +02:00
Miriam Baglioni
7687519f00
merged conflicts with upstream branch
2020-05-12 10:03:44 +02:00
Miriam Baglioni
8ffc050b8a
fixed problem in communityconfigurationfactory test
2020-05-12 10:01:09 +02:00
Claudio Atzori
527e8169a8
adjusted paths pointing to test configurations, cleanup
2020-05-11 18:17:05 +02:00
Claudio Atzori
c6b028f2af
code formatting
2020-05-11 17:38:08 +02:00
Claudio Atzori
6d0b11252e
bulktagging wfs moved into common dhp-enrichment module
2020-05-11 17:32:06 +02:00