Michele Artini
b94a791bc5
unit tests to transform cnr explora
2022-06-09 12:25:34 +02:00
Claudio Atzori
d098ad0d93
[hb patch] updated map
2022-05-16 15:54:04 +02:00
Claudio Atzori
1dda11e031
[hb patch] updated map
2022-05-16 15:53:27 +02:00
Claudio Atzori
8dd5517548
code formatting
2022-05-16 14:35:24 +02:00
Michele Artini
46c07e0724
deleted hierarchical rels from ror action set
2022-05-16 09:39:54 +02:00
Miriam Baglioni
89657a0b78
[UsageCount] refactoring
2022-05-09 14:43:27 +02:00
Miriam Baglioni
a056f59c6e
[UsageCount] make it as an action set as it should be, plus changed the test to make them work as well now
2022-05-09 12:51:35 +02:00
Serafeim Chatzopoulos
623f7be26d
Fix reading files from HDFS in FileCollector & FileGZipCollector plugins
2022-04-28 16:31:11 +03:00
Claudio Atzori
30105f0722
Merge branch 'beta' into 7096-fileGZip-collector-plugin
2022-04-22 11:22:21 +02:00
Miriam Baglioni
20de75ca64
[Measures] removed typo
2022-04-21 12:14:03 +02:00
Miriam Baglioni
b61efd613b
[Measures] addressed comments in the PR
2022-04-21 12:09:37 +02:00
Claudio Atzori
eabb40fccc
Merge branch 'beta' into 7096-fileGZip-collector-plugin
2022-04-21 11:42:43 +02:00
Miriam Baglioni
c304657d91
[Measures] put the logic in common, no need to change the schema
2022-04-21 11:27:26 +02:00
Miriam Baglioni
5295effc96
[Measures] fixed issue
2022-04-20 16:20:40 +02:00
Miriam Baglioni
5feae77937
[Measures] last changes to accomodate tests
2022-04-20 15:13:09 +02:00
Miriam Baglioni
869407c6e2
[Measures] added new measure (usagecounts) as action set. Measure added at the level of the result. Ref #7587
2022-04-20 14:02:05 +02:00
Serafeim Chatzopoulos
d0b84d3297
Add FileCollectorPlugin and respective test
2022-04-07 15:06:38 +03:00
Serafeim Chatzopoulos
bc1bf55507
Add AbstractSplittedRecordPlugin
2022-04-07 14:33:04 +03:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Serafeim Chatzopoulos
e612489670
Add fileGZip collector plugin and respective test
2022-04-06 19:12:44 +03:00
Claudio Atzori
8c457f1b2c
conflicts resolved, merged from beta
2022-04-06 10:27:52 +02:00
Miriam Baglioni
e77d104951
[OC] added / to workflow path
2022-04-05 15:07:11 +02:00
Sandro La Bruzzo
1b11010169
minor fix
2022-03-29 10:59:14 +02:00
Claudio Atzori
a87c070447
conflicts resolved, merged from beta
2022-02-24 12:51:31 +01:00
Claudio Atzori
5226d0a100
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2022-02-18 15:21:07 +01:00
Claudio Atzori
401dd38074
code formatting
2022-02-18 15:19:15 +01:00
Sandro La Bruzzo
891781ee3f
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
2022-02-18 11:11:32 +01:00
Sandro La Bruzzo
d3f03abd51
fixed wrong json path
2022-02-18 11:11:17 +01:00
Claudio Atzori
89c7313fc5
Merge branch 'beta' into hierarchical_orgs_relations
2022-02-17 10:30:04 +01:00
Sandro La Bruzzo
3aa2020b24
added script to regenerate hostedBy Map following instruction defined on ticket #7539
...
updated hosted By Map
2022-02-15 11:05:27 +01:00
Miriam Baglioni
be64055cfe
[OpenCitation] changed the name of destination folders
2022-02-14 15:49:44 +01:00
Miriam Baglioni
1490867cc7
[OpenCitation] cleaning of the COCI model
2022-02-14 14:52:12 +01:00
Miriam Baglioni
5c4043dba8
[OpenCitation] refactoring
2022-02-08 16:23:05 +01:00
Miriam Baglioni
759ed519f2
[OpenCitation] added logic to avoid the genration of self citations relations
2022-02-08 16:15:34 +01:00
Miriam Baglioni
b071f8e415
[OpenCitation] change to extract in json format each folder just onece
2022-02-08 15:37:28 +01:00
Miriam Baglioni
fbc28ee8c3
[OpenCitation] change the integration logic to consider dois with commas inside
2022-02-07 18:32:08 +01:00
Miriam Baglioni
493caef358
[stats-wf]fixed the result_result table related to PR#191
2022-02-04 14:51:25 +01:00
Miriam Baglioni
73eba34d42
[UnresolvedEntities] Changed the way to merge the unresolved because the new merge removed the dataInfo from the merged result. Added also data info for subjects
2022-02-01 08:38:41 +01:00
Claudio Atzori
b37bc277c4
reintroduced the hostedby patching to the datacite records
2022-01-21 09:15:13 +01:00
Miriam Baglioni
a75fb8c47a
[BipFinderInstanceLevel] change pom to align to the dhp-schema release 2.10.24 and refactoring
2022-01-12 18:06:26 +01:00
Miriam Baglioni
e7d5a39c03
[BipFinderInstanceLevel] added tests in test class
2022-01-12 17:25:04 +01:00
Miriam Baglioni
4993666d73
[BipFinderInstanceLevel] changed creation of the instance to allow to enrich existing instances with same pid
2022-01-12 16:53:47 +01:00
Sandro La Bruzzo
57e2c4b749
formatted code
2022-01-12 09:40:28 +01:00
Claudio Atzori
dcd282977c
pulled from beta
2022-01-11 16:59:41 +01:00
Claudio Atzori
4f212652ca
scalafmt: code formatting
2022-01-11 16:57:48 +01:00
Miriam Baglioni
b7e450070b
[SDG-FOS] to import SDG file not considering the header
2022-01-07 12:13:26 +01:00
Miriam Baglioni
639190370a
mergin with branch beta
2022-01-07 11:29:25 +01:00
Miriam Baglioni
adccc2346a
[SDG-FOS] to lower case for the doi
2022-01-07 11:28:50 +01:00
Claudio Atzori
3bd3653be9
OAF-store-graph mdstores: save them in text format
2022-01-04 16:39:39 +01:00
Claudio Atzori
3dc48c7ab5
OAF-store-graph mdstores: save them in text format
2022-01-04 16:39:27 +01:00
Claudio Atzori
f82db765db
OAF-store-graph mdstores: save them in text format
2022-01-04 16:39:15 +01:00
Claudio Atzori
9458ee7938
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
2022-01-04 16:38:09 +01:00
Claudio Atzori
58f8998e3d
OAF-store-graph mdstores: save them in text format
2022-01-04 15:02:09 +01:00
Claudio Atzori
174c3037e1
OAF-store-graph mdstores: save them in text format
2022-01-04 14:40:16 +01:00
Claudio Atzori
045d767013
OAF-store-graph mdstores: save them in text format
2022-01-04 14:23:01 +01:00
Claudio Atzori
a6977197b3
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
2022-01-03 17:25:26 +01:00
Miriam Baglioni
92fd69e25d
[SDG-FOS] alternative way to get input data to avoid OOM error while getting csv
2022-01-03 15:23:06 +01:00
Miriam Baglioni
7a1b440413
[SDG] logic to create unresolved entities out of SDG input. This changes also some classes related to FOS to reuse the same code. The code under createunresolvedentities create results with the merged update of the the inputs provided (bip at the level of the isntance, fos and sdg for subjects)
2021-12-23 13:24:28 +01:00
Miriam Baglioni
2a67ee13ec
[SDG] added model class
2021-12-23 10:37:52 +01:00
Miriam Baglioni
10579c0dd0
[FOS]fixed doi value in test
2021-12-22 23:10:16 +01:00
Miriam Baglioni
6116fc5d40
[FOS]added logic to include only different subjects. Test refactoring and extention
2021-12-22 23:04:22 +01:00
Miriam Baglioni
b81efb6a9d
[FOS]changed the mapping between the csv and the model. Changed Test classes and resources
2021-12-22 21:40:35 +01:00
Miriam Baglioni
de6c4c8968
[FOS]creation of the unresolved entities: remove the split for the doi: no more needed since each row is related to one doi
2021-12-22 16:44:44 +01:00
Miriam Baglioni
34ac56565d
refactoring
2021-12-22 16:28:11 +01:00
Miriam Baglioni
20ef1d657f
refactoring
2021-12-22 16:26:36 +01:00
Miriam Baglioni
813f856d3f
[BipFinder] removing left over parameter in wf
2021-12-22 16:11:12 +01:00
Miriam Baglioni
2c126ed014
[BipFinder] create unresolved entities with measures at the level of the instance
2021-12-22 16:03:41 +01:00
Miriam Baglioni
0807fdb65a
[BipFinder] remove not needed resources
2021-12-22 15:37:00 +01:00
Miriam Baglioni
b5e11a3a0a
[BipFinder] put in common package BipFinder model
2021-12-22 15:33:05 +01:00
Miriam Baglioni
c5739c4266
[BipFinder] create action set for the measures at the level of the result
2021-12-22 15:08:33 +01:00
Miriam Baglioni
e24a7f3496
mergin with branch beta
2021-12-21 13:57:19 +01:00
Miriam Baglioni
d1ae219cb4
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2021-12-21 13:55:53 +01:00
Sandro La Bruzzo
3920d68992
Fixed workflow generation of delta in datacite
2021-12-21 11:41:49 +01:00
Sandro La Bruzzo
b881ee5ef8
[scholexplorer]
...
- implemented generation of scholix of delta update of datacite
2021-12-15 11:25:32 +01:00
Miriam Baglioni
22d4b5619b
[BipFinder Result] last changes to test and resources files
2021-12-14 14:54:13 +01:00
Miriam Baglioni
6fb6236cd4
changed the way to produce the AS for bipFinder.
2021-12-14 14:51:14 +01:00
Miriam Baglioni
4eb8276493
-
2021-12-14 11:12:17 +01:00
Miriam Baglioni
b113586207
resolved conflicts
2021-12-07 10:16:14 +01:00
Miriam Baglioni
d9836f0cf3
[OpenCitations] fixed test when executed one after the other
2021-12-06 15:27:09 +01:00
Sandro La Bruzzo
7af0bbd0b1
[scala-refactor] Module dhp-aggregation:
...
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 11:26:36 +01:00
Miriam Baglioni
f430688ff7
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2021-12-03 12:36:08 +01:00
Sandro La Bruzzo
f7011b90d8
format code
2021-12-03 11:15:09 +01:00
Miriam Baglioni
87eedad898
-
2021-12-02 13:17:19 +01:00
Sandro La Bruzzo
feea154e89
remove working dir after test
2021-11-25 11:02:38 +01:00
Sandro La Bruzzo
028a8acad8
add test resources
2021-11-25 10:54:47 +01:00
Sandro La Bruzzo
2164a2a889
Datacite: Code Refactor generated a general SparkApplication Scala where all the spark scala have to inherit
...
Commented a little the Datacite transformation code
2021-11-25 10:54:13 +01:00
Sandro La Bruzzo
a7cf277d98
Datacite: Removed HostedBy Patch as described on ticket #7219 , Now all the records will have hosted by Unknown Repository
2021-11-22 16:03:17 +01:00
Sandro La Bruzzo
fc03c99805
fixed javadocs url after deploying site
2021-11-19 10:46:33 +01:00
Claudio Atzori
bd9a43cefd
Revert to 4094f2bb9a
2021-11-19 09:20:43 +01:00
Claudio Atzori
3a4d925386
Merge branch 'beta' into hierarchical_orgs_relations
2021-11-18 18:07:08 +01:00
Sandro La Bruzzo
2506d7a679
Merge branch 'mvn_site_documentation' of code-repo.d4science.org:D-Net/dnet-hadoop into mvn_site_documentation
2021-11-17 11:07:24 +01:00
Sandro La Bruzzo
cded363b55
code refactor, created and moved scala code on the correct maven folder under src/main/scala and src/test/scala
2021-11-17 11:06:35 +01:00
Miriam Baglioni
4094f2bb9a
added integration md file
2021-11-17 10:04:52 +01:00
Miriam Baglioni
ec8b0219ff
[Documentation] Added first page for Integration via unresolved entities generation
2021-11-16 17:41:34 +01:00
Sandro La Bruzzo
2d67020c59
added dhp-enrichment maven site template
2021-11-16 16:01:08 +01:00
Claudio Atzori
bafa2990f3
code formatting
2021-11-15 17:07:16 +01:00
Sandro La Bruzzo
efa09057db
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
2021-11-15 14:32:09 +01:00
Sandro La Bruzzo
48923e46a1
added documentation to Pubmed Class and also added mvn site for dhp-aggregations
2021-11-15 14:32:01 +01:00
Miriam Baglioni
4ec88c718c
merge with beta - resolved conflict in pom
2021-11-15 10:52:16 +01:00
Miriam Baglioni
6f1a434e90
[Bypass Action Set] Fixed test to consider the new identifier utils
2021-11-15 09:59:23 +01:00
Miriam Baglioni
157d33ebf9
[Bypass Action Set] Refactoring
2021-11-15 09:58:48 +01:00
Miriam Baglioni
92d0e18b55
[Bypass Action Set] used constant DOI instead of "doi"
2021-11-12 10:56:58 +01:00
Miriam Baglioni
881113743f
[Bypass Action Set] refactoring
2021-11-12 10:55:50 +01:00
Miriam Baglioni
47ccb53c4f
[Bypass Action Set] modification for comment D-Net/dnet-hadoop#157 (comment)
2021-11-12 10:54:09 +01:00
Miriam Baglioni
716021546e
[Bypass Action Set] minor fix
2021-11-12 10:18:01 +01:00
Miriam Baglioni
935062edec
[Bypass Action Set] creation of unresolved entities
2021-11-11 16:11:25 +01:00
Claudio Atzori
d02caef185
Merge branch 'beta' into hierarchical_orgs_relations
2021-10-27 15:36:29 +02:00
Sandro La Bruzzo
4acfa8fa2e
Scholexplorer Datasource Aggregation:
...
- Added collectedfrom in the inverse relation generated
Relation resolution:
- increased number of partitions in workflow.xml
- using classid instead of classname to build the pid-dnetId mapping
2021-10-26 17:51:20 +02:00
Sandro La Bruzzo
034304b33a
conflict resolved on merge
2021-10-26 09:40:47 +02:00
Michele Artini
d66e20e7ac
added hierarchy rel in ROR actionset
2021-10-21 15:51:48 +02:00
Sandro La Bruzzo
aeeebd573b
code refactor renamed datacite package
2021-10-20 17:37:42 +02:00
Sandro La Bruzzo
ab3a99d3e9
removed old datacite oozie workflow
2021-10-20 17:19:47 +02:00
Sandro La Bruzzo
ae4e99a471
Adapted workflow of resolution of PID to work into OpenAIRE data workflow
...
- Added relations in both verse on all Scholexplorer datasources
2021-10-20 17:12:16 +02:00
Miriam Baglioni
1cc09adfaa
Opencitations: chenaged the test class to mirror the creation or not of duplicate dois for .refs oc original plus added optional parameter to duplicate the relation
2021-10-18 14:11:27 +02:00
Sandro La Bruzzo
7b15b88d4c
renamed wrong package, implemented last aggregation workflow for scholexplorer
2021-10-15 15:00:15 +02:00
Sandro La Bruzzo
51a03c0a50
refactor code for EBI from dhp-graph-mapper into dhp-aggregation
2021-10-14 14:23:13 +02:00
Sandro La Bruzzo
7387416e90
added params skip update to direct transform in OAF, this should be set to true in production
2021-10-12 12:36:30 +02:00
Sandro La Bruzzo
511da98d0c
- fixed bug on download pmc Article
...
- removed unused line of code in SparkCreateActionset
2021-10-12 11:47:49 +02:00
Sandro La Bruzzo
5606014b17
code refactor see ticket #7065
2021-10-12 08:11:53 +02:00
Sandro La Bruzzo
66702b1973
Added node to update datacite
2021-09-28 08:59:06 +02:00
Miriam Baglioni
5ec69889db
OpenCitations: creation of AS from OC
2021-09-27 16:02:06 +02:00
Miriam Baglioni
f2118d771a
first steps in the implementation of the integration of opencitations
2021-09-22 15:18:05 +02:00
Claudio Atzori
663b1556d7
manually integrating PR#140 D-Net/dnet-hadoop#140
2021-09-15 16:40:25 +02:00
Sandro La Bruzzo
aed29156c7
changed behavior in transformation job, that doesn't fail at first error
2021-09-07 19:05:46 +02:00
Sandro La Bruzzo
3c6fc2096c
fix bug on oai iterator that skip record cleaned
2021-09-07 10:46:26 +02:00
Sandro La Bruzzo
9f8a80deb7
fixed wrong import of unresolved relation in openaire
2021-09-01 14:16:27 +02:00
Sandro La Bruzzo
e8b3cb9147
Implemented method to download delta updates in EBI Links
2021-08-30 09:32:45 +02:00
Alessia Bardi
931f430129
Merge branch 'beta' into datasource_model_eosc_beta
2021-08-23 11:57:21 +02:00
Claudio Atzori
baed5e3337
test classes moved in specific components
2021-08-13 12:14:47 +02:00
Claudio Atzori
3359f73fcf
cleanup & best practices
2021-08-13 12:00:42 +02:00
Miriam Baglioni
32fd75691f
refactoring
2021-08-13 10:15:42 +02:00
Miriam Baglioni
5cd5714530
GetCSV refactoring - added ignore annotation for fields not in input csv
2021-08-13 10:06:49 +02:00
Miriam Baglioni
ed183d878e
GetCSV refactoring - modified test classes due to change in the model of projects and programme
2021-08-13 09:28:51 +02:00
Miriam Baglioni
8769dd8eef
GetCSV refactoring - refactoring due to movement of classes
2021-08-12 18:20:56 +02:00
Miriam Baglioni
6b9e1bf2e3
GetCSV refactoring - removing not needed dependency
2021-08-12 18:17:50 +02:00
Miriam Baglioni
d57b2bb927
GetCSV refactoring - removing not needed dependency
2021-08-12 18:12:51 +02:00
Miriam Baglioni
9da74b544a
GetCSV refactoring - refactoring due to movement of classes
2021-08-12 18:12:15 +02:00
Miriam Baglioni
ab8abd61bb
GetCSV refactoring - refactoring due to movement of classes
2021-08-12 18:11:07 +02:00
Miriam Baglioni
335a824e34
GetCSV refactoring - fixed issue
2021-08-12 18:10:10 +02:00
Miriam Baglioni
f0845e9865
GetCSV refactoring - refactoring due to movement of classes
2021-08-12 18:04:58 +02:00