Claudio Atzori
|
f9fb2fef6e
|
Merge pull request 'Modification of Microsoft Academic Graph Mapping' (#435) from mag_only_doi into beta
Reviewed-on: #435
|
2024-05-21 13:48:42 +02:00 |
Claudio Atzori
|
834461ba26
|
[graph provision] fixed wf definition, revised serialization of the usage counts measures
|
2024-05-21 13:48:06 +02:00 |
Sandro La Bruzzo
|
e8a61d5dd5
|
removed plugin, use only FileGZip plugin
|
2024-05-21 13:45:29 +02:00 |
Sandro La Bruzzo
|
ca9414b737
|
Implement multiple node name splitter on GZipCollectorPlugin and all nodes that use XMLIterator. If the splitter name is a comma-separated list of values, it splits on all of the values
|
2024-05-21 09:11:13 +02:00 |
Sandro La Bruzzo
|
032bcc8279
|
Since the last beta workflow we decided to introduce into the graph only MAG items with a DOI and to set them invisible (this should be the same behaviour as the previous DOIBoost mapping).
This commit applies this type of mapping
|
2024-05-20 09:24:15 +02:00 |
Sandro La Bruzzo
|
103e2652b3
|
merged beta
|
2024-05-17 14:43:07 +02:00 |
Sandro La Bruzzo
|
a87f9ea643
|
fixed scholexplorer bug
|
2024-05-17 14:16:43 +02:00 |
Sandro La Bruzzo
|
6efab4d88e
|
fixed scholexplorer bug
|
2024-05-16 16:19:18 +02:00 |
Claudio Atzori
|
92f018d196
|
[graph provision] fixed path pointing to an intermediate data store in the working directory
|
2024-05-15 15:39:18 +02:00 |
Claudio Atzori
|
0611c81a2f
|
[graph provision] using Qualifier.classNames to populate the corresponding fields in the JSON payload
|
2024-05-15 15:33:10 +02:00 |
Michele Artini
|
2b3b5fe9a1
|
oai finalization and test
|
2024-05-15 14:13:16 +02:00 |
Claudio Atzori
|
1efe7f7e39
|
[graph provision] upgrade to dhp-schema:6.1.2, included project.oamandatepublications in the JSON payload mapping, fixed serialisation of the usageCounts measures
|
2024-05-14 12:39:31 +02:00 |
Claudio Atzori
|
53e7bb4336
|
Merge pull request 'rest-collector-plugin-with-retry' (#432) from rest-collector-plugin-with-retry into beta
Reviewed-on: #432
|
2024-05-10 09:02:33 +02:00 |
Claudio Atzori
|
f7d56e2ef2
|
Merge branch 'beta' into rest-collector-plugin-with-retry
|
2024-05-10 09:02:21 +02:00 |
Claudio Atzori
|
c1237ab39e
|
Merge pull request 'Fixes in Graph Provision' (#434) from beta_provision_relation into beta
Reviewed-on: #434
|
2024-05-09 14:15:05 +02:00 |
Claudio Atzori
|
dc3a5858f7
|
Merge branch 'beta' into beta_provision_relation
|
2024-05-09 14:14:43 +02:00 |
Claudio Atzori
|
55f39f7850
|
[graph provision] adds the possibility to validate the XML records before storing them via the validateXML parameter
|
2024-05-09 14:06:04 +02:00 |
Claudio Atzori
|
39a2afe8b5
|
[graph provision] fixed XML serialization of the usage counts measures, renamed workflow actions to better reflect their role
|
2024-05-09 13:54:42 +02:00 |
Claudio Atzori
|
908ed9da7a
|
Merge pull request 'Various fixes in the stats wf' (#430) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #430
|
2024-05-08 13:41:02 +02:00 |
Antonis Lempesis
|
0cada3cc8f
|
every step is run in the analytics queue. Hardcoded for now, will make a parameter later
|
2024-05-08 13:42:53 +03:00 |
Antonis Lempesis
|
90a4fb3547
|
fixed typos
|
2024-05-08 13:17:58 +03:00 |
Claudio Atzori
|
18aa323ee9
|
cleanup unused classes, adjustments in the oozie wf definition
|
2024-05-08 11:36:46 +02:00 |
Michele Artini
|
c9a327bc50
|
refactoring of gzip method
|
2024-05-08 11:34:08 +02:00 |
Michele Artini
|
e234848af8
|
oaf record: xpath for root
|
2024-05-08 10:00:53 +02:00 |
Claudio Atzori
|
b4e3389432
|
fixed property mapping creating the RelatedEntity transient objects. spark cores & memory adjustments. Code formatting
|
2024-05-07 16:25:17 +02:00 |
Giambattista Bloisi
|
711048ceed
|
PrepareRelationsJob rewritten to use Spark Dataframe API and Windowing functions
|
2024-05-07 15:44:33 +02:00 |
Michele Artini
|
70bf6ac415
|
oai exporter tests
|
2024-05-07 09:36:26 +02:00 |
Michele Artini
|
aa40e53c19
|
oai exporter parameters
|
2024-05-07 08:01:19 +02:00 |
Michele Artini
|
ed052a3476
|
job for the population of the oai database
|
2024-05-06 16:08:33 +02:00 |
Claudio Atzori
|
26363060ed
|
fixed id prefix creation for the fosnodoi records, again
|
2024-05-03 15:53:52 +02:00 |
Claudio Atzori
|
0486227185
|
[cleaning] deactivating the cleaning of FOS subjects found in the metadata provided by repositories
|
2024-05-03 14:31:12 +02:00 |
Claudio Atzori
|
a5d13d5d27
|
code formatting
|
2024-05-03 14:14:34 +02:00 |
Claudio Atzori
|
e1a0fb8933
|
fixed id prefix creation for the fosnodoi records
|
2024-05-03 14:14:18 +02:00 |
Giambattista Bloisi
|
69c5efbd8b
|
Fix: when applying enrichments with no instance information, the resulting merged entity was generated with no instance instead of keeping the original information
|
2024-05-03 13:57:56 +02:00 |
Sandro La Bruzzo
|
db358ad0d2
|
code formatted
|
2024-05-02 15:25:57 +02:00 |
Sandro La Bruzzo
|
26bf8e763a
|
merged from beta
|
2024-05-02 15:20:23 +02:00 |
Sandro La Bruzzo
|
a860c57bbc
|
updated .gitignore
|
2024-05-02 15:16:00 +02:00 |
Sandro La Bruzzo
|
0646d0d064
|
Updated main sparkApplication to avoid requiring the master variable
|
2024-05-02 15:15:03 +02:00 |
Claudio Atzori
|
00ad21d814
|
Merge pull request 'preparations for dhp-common beta release 1.2.5' (#433) from beta-release-1.2.5 into beta
Reviewed-on: #433
|
2024-05-02 11:28:19 +02:00 |
Claudio Atzori
|
4355f64810
|
reverted to version 1.2.5-SNAPSHOT
|
2024-05-02 11:23:53 +02:00 |
Claudio Atzori
|
66680b8b9a
|
refactoring of common utilities
|
2024-05-02 11:16:58 +02:00 |
Claudio Atzori
|
dcf23b3d06
|
Merge branch 'beta' into beta-release-1.2.5
|
2024-05-02 10:01:49 +02:00 |
Michele Artini
|
f4068de298
|
code reindent + tests
|
2024-05-02 09:51:33 +02:00 |
Claudio Atzori
|
11bd89e132
|
[enrichment] use sparkExecutorMemory to define also the memoryOverhead
|
2024-05-01 08:32:59 +02:00 |
Claudio Atzori
|
e96c2c1606
|
[ranking wf] set spark.executor.memoryOverhead to fine tune the resource consumption
|
2024-04-30 16:23:25 +02:00 |
Claudio Atzori
|
50c18f7a0b
|
[dedup wf] revised memory settings to address the increased volume of input contents
|
2024-04-30 12:34:16 +02:00 |
Michele Artini
|
2615136efc
|
added a retry mechanism
|
2024-04-30 11:58:42 +02:00 |
Sandro La Bruzzo
|
133ead1e3e
|
updated new version of scholexplorer Generation
|
2024-04-29 09:00:30 +02:00 |
Sandro La Bruzzo
|
052c6aac9d
|
formatted code
|
2024-04-26 16:03:04 +02:00 |
Sandro La Bruzzo
|
9cd3bc0f10
|
Added a new generation of the dump for scholexplorer, tested with the latest version of Spark and heavily refactored
|
2024-04-26 16:02:07 +02:00 |