Miriam Baglioni
0794e0667b
Merge branch 'doidoost_dismiss' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doidoost_dismiss
2024-04-04 09:16:18 +02:00
Miriam Baglioni
4b1de076ac
[DataciteHostedByMap] added entry for EBRAINS
2024-04-04 09:16:14 +02:00
Miriam Baglioni
c8a88b2187
[DataciteHostedByMap] added entry for EBRAINS
2024-04-04 09:14:58 +02:00
Sandro La Bruzzo
31e152d2bb
Merge remote-tracking branch 'origin/doidoost_dismiss' into doidoost_dismiss
2024-04-03 17:08:35 +02:00
Sandro La Bruzzo
6f3e925cae
Implemented first part of the new MAG mapping
2024-04-03 17:07:14 +02:00
Miriam Baglioni
f0f6abf892
[MapToFunderLink] added references for HFRI and Erasmus+ for the creation of links for funders
2024-04-03 14:59:09 +02:00
Claudio Atzori
26b97aa5ed
Merge pull request '[BETA] fixed the result_country definition and updated the stats DB copy procedure' (#416) from antonis.lempesis/dnet-hadoop:beta into beta
...
Reviewed-on: D-Net/dnet-hadoop#416
2024-04-03 12:36:03 +02:00
Lampros Smyrnaios
b7c8acc563
- Update the code that acquires the "IMPALA_HDFS_NODE" to test the "tmp"-dir instead of the base-dir, and introduce retries to overcome potential file-system failures. This change was suggested by "Sebastian Tymkow" and "Grzegorz Bakalarski".
...
- Fix typos.
2024-04-03 13:15:37 +03:00
Miriam Baglioni
50fbebf186
[NOAMI] removed entry for Health and Social Care Board from the list of funders. Modified IRC putting 1596 and 1597 as synonyms, as required in ticket 9635
2024-04-03 11:45:40 +02:00
Michele Artini
71d6e02886
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
2024-04-03 09:50:41 +02:00
Michele Artini
02c9a311c8
base datainfo with trust=0.89
2024-04-03 09:50:21 +02:00
Miriam Baglioni
42846d3b91
[OpenCitation] add compression option when writing the sequence file
2024-04-03 09:25:00 +02:00
Miriam Baglioni
4f0a044245
Merge pull request 'Add action set creation for Datacite affiliations' (#413) from 9647_datacite_affiliations into beta
...
Reviewed-on: D-Net/dnet-hadoop#413
2024-04-02 17:33:38 +02:00
Serafeim Chatzopoulos
cbe13a5c61
Fix datacite input path in properties file
2024-04-02 18:00:35 +03:00
Miriam Baglioni
9c9a9562ae
[UsageCount] fixed error
2024-04-02 16:56:37 +02:00
Miriam Baglioni
b42bdd5fb3
[UsageCount] add check in case the datasource is not matched against those present in the graph
2024-04-02 16:28:27 +02:00
Miriam Baglioni
64cbd8abe9
Merge pull request '[UsageCount] Usage count per result split by datasource' (#318) from UsageStatsRecordDS into beta
...
Reviewed-on: D-Net/dnet-hadoop#318
2024-04-02 10:21:39 +02:00
Antonis Lempesis
df6e3bda04
added new orgs in monitor
2024-04-01 22:45:29 +03:00
Antonis Lempesis
573b081f1d
added new orgs in monitor
2024-04-01 22:24:46 +03:00
Serafeim Chatzopoulos
0eb0701b26
Add action set creation for Datacite affiliations
2024-04-01 17:23:26 +03:00
Antonis Lempesis
0bf2a7a359
fixed the result_country definition
2024-04-01 15:23:22 +03:00
Claudio Atzori
24227ab598
Merge pull request '[BETA] fixed typo in indicator query' (#411) from antonis.lempesis/dnet-hadoop:beta into beta
...
Reviewed-on: D-Net/dnet-hadoop#411
2024-03-27 13:56:43 +01:00
Antonis Lempesis
9ff44eed96
fixed typo in indicator query
...
added more institutions
2024-03-27 14:39:01 +02:00
Claudio Atzori
cff6040424
Merge pull request '[BETA] added missing EOS, Generate tables with parquet-files, instead of csv in the contexts.sh script' (#409) from antonis.lempesis/dnet-hadoop:beta into beta
...
Reviewed-on: D-Net/dnet-hadoop#409
2024-03-27 12:04:04 +01:00
Antonis Lempesis
1fee4124e0
added missing EOS
2024-03-27 12:58:25 +02:00
Sandro La Bruzzo
73a67c0e4a
Improved Crossref mapping to also include Unpaywall (tested)
2024-03-26 17:26:47 +01:00
Claudio Atzori
75551ad4ec
code formatting
2024-03-26 14:53:16 +01:00
Miriam Baglioni
94b931f7bd
[BulkTagging - tag datasource and projects] merging with branch beta
2024-03-26 14:25:19 +01:00
Miriam Baglioni
3b209261f2
[BulkTagging - tag datasource and projects] merging with branch beta
2024-03-26 14:21:27 +01:00
Lampros Smyrnaios
036ba03fcd
Generate tables with parquet-files, instead of csv, in "dhp-stats-update/.../contexts.sh" script.
2024-03-26 13:29:04 +02:00
Claudio Atzori
730eaffc85
Merge pull request 'correctly selecting the active hdfs node for the impala cluster' (#405) from antonis.lempesis/dnet-hadoop:beta into beta
...
Reviewed-on: D-Net/dnet-hadoop#405
2024-03-26 12:07:46 +01:00
Lampros Smyrnaios
bc8c97182d
Automatically select the ACTIVE HDFS NODE for Impala cluster, in all "copyDataToImpalaCluster.sh" scripts.
2024-03-26 13:01:12 +02:00
Lampros Smyrnaios
92cc27e7eb
Use the ACTIVE HDFS NODE for Impala cluster, in "copyDataToImpalaCluster.sh" script.
2024-03-26 12:34:11 +02:00
Claudio Atzori
ef52128c55
included new stats* workflows in parent pom list of modules, code formatting
2024-03-26 10:42:10 +01:00
Claudio Atzori
bfba71a95c
further follow up changes from integrating the mergeutils branch
2024-03-26 09:01:18 +01:00
Claudio Atzori
d72e7b7487
Merge pull request 'Changes to indicators and funders definition' (#372) from antonis.lempesis/dnet-hadoop:beta into beta
...
Reviewed-on: D-Net/dnet-hadoop#372
2024-03-26 08:46:20 +01:00
Sandro La Bruzzo
ece56f0178
update crossref mapping to be transformed together with UnpayWall
2024-03-25 18:18:10 +01:00
Claudio Atzori
538b180fe0
Merge branch 'beta' into oaf_country_beta
2024-03-25 16:13:20 +01:00
Claudio Atzori
82fc609c4f
Merge branch 'beta' into index_records
2024-03-25 16:12:49 +01:00
Claudio Atzori
74e5d05577
Merge branch 'beta' into ocnew
2024-03-25 16:10:31 +01:00
Claudio Atzori
6c3b692f60
integrated minor change from beta branch
2024-03-25 16:10:23 +01:00
Claudio Atzori
9a5b134ddf
Merge branch 'beta' into FOSNew
2024-03-25 16:07:37 +01:00
Claudio Atzori
71c1f81b54
Merge branch 'beta' into exception_on_invalid_transofmation_rule
2024-03-25 16:05:11 +01:00
Claudio Atzori
91b61687fa
Merge branch 'beta' into bulkTaggingPathMapExtention
2024-03-25 15:50:18 +01:00
Claudio Atzori
54936b7f42
Merge branch 'beta' into transformativeagreement
2024-03-25 15:42:22 +01:00
Michele Artini
e1149eb5c4
xslt rules and tests
2024-03-25 15:01:42 +01:00
Michele Artini
3f174ad90f
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
2024-03-25 12:16:02 +01:00
Michele Artini
6ffb1faf09
fixed a problem with multiple nodes
2024-03-25 12:15:51 +01:00
Giambattista Bloisi
3f22c101d9
Merge pull request 'Enrich authors with ORCID info using new matching algorithm' (#398) from new_orcid_enhancement into beta
...
Reviewed-on: D-Net/dnet-hadoop#398
2024-03-22 17:29:20 +01:00
Giambattista Bloisi
0ff7faad72
Fix conditions that prevented ORCID Enrichment
2024-03-22 16:24:49 +01:00
Michele Artini
7faa115ba0
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta
2024-03-22 11:08:59 +01:00
Michele Artini
f9c74c98fa
fixed an identifier xpath
2024-03-22 11:08:45 +01:00
Antonis Lempesis
4c40c96e30
code cleanup
2024-03-22 10:16:49 +02:00
Antonis Lempesis
459167ac2f
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
2024-03-21 12:44:58 +02:00
Antonis Lempesis
07f634a46d
code cleanup
2024-03-21 12:44:30 +02:00
Antonis Lempesis
9521625a07
code cleanup
2024-03-21 11:45:08 +02:00
Sandro La Bruzzo
58dbe71d39
update crossref mapping to be runnable separately as a single datasource outside doiboost
2024-03-20 17:04:52 +01:00
Antonis Lempesis
67a5aa0a38
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
2024-03-19 11:24:54 +02:00
dimitrispie
a3a570e9a0
Commit monitor-updates-wf
2024-03-19 09:42:21 +02:00
Giambattista Bloisi
664a381d31
Unify merge logic of entities in MergeUtils.class
2024-03-18 16:04:49 +01:00
Michele Artini
cb29b9773c
xslt rules
2024-03-18 15:31:34 +01:00
Michele Artini
85b844d57e
updated BASE filter param
2024-03-15 15:03:27 +01:00
Michele Artini
455f2e1e07
apply commits from master
2024-03-15 14:56:39 +01:00
Michele Artini
30167aa882
mapped oaf:country from results
2024-03-15 11:24:16 +01:00
Michele Artini
88fef367b9
new plugin to collect from a dump of BASE
2024-03-15 10:47:52 +01:00
Claudio Atzori
078169b922
cleanup
2024-03-15 09:56:04 +01:00
Claudio Atzori
af154d4456
implemented changes from #9497: sort abstracts by string length, included author fullnames in the related results, expanded instance details within each children/result XML element
2024-03-14 16:21:23 +01:00
Claudio Atzori
7863c92466
expanded paper abstract in the result/children XML element (ticket #9497)
2024-03-13 16:25:31 +01:00
Claudio Atzori
eb5887cb9a
including related organization url in the XML record serialization (ticket #9498)
2024-03-13 14:46:00 +01:00
Sandro La Bruzzo
5281f010a5
applied cherry pick
2024-03-13 09:59:20 +01:00
Sandro La Bruzzo
ee1fcb672b
code refactor
2024-03-13 09:46:31 +01:00
Miriam Baglioni
5a32bb9578
[OC New] last fix
2024-03-13 09:36:18 +01:00
Sandro La Bruzzo
c532831718
Moved Crossref Mapping to dhp-aggregations,
...
refactored code to avoid using the utilities defined in DOIBoostMappingUtils for creating parts of the OAF, using the utilities in OafMappingUtils instead
2024-03-13 06:56:10 +01:00
Miriam Baglioni
48c052215c
[OC New] last fix
2024-03-12 23:12:32 +01:00
Claudio Atzori
db66555ebb
WIP: updated provision workflow to create a JSON based representation of the payload
2024-03-12 09:56:09 +01:00
Antonis Lempesis
f74c7e8689
selecting distinct peer_reviewed
2024-03-12 02:13:04 +02:00
Giambattista Bloisi
9092075760
Enrich authors with ORCID info using new matching algorithm
2024-03-11 13:23:59 +01:00
Sandro La Bruzzo
cbd4e5e4bb
update mag mapping
2024-03-08 16:31:40 +01:00
Claudio Atzori
d4871b31e8
WIP: extended provision workflow to create the JSON based payload
2024-03-08 11:43:20 +01:00
Antonis Lempesis
3c79720342
fixed the irish result subset
2024-03-07 14:08:57 +02:00
Antonis Lempesis
5ae4b4286c
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
2024-03-07 12:15:19 +02:00
Miriam Baglioni
5180b6ec8a
[FOSNEW] removed test class
2024-03-07 10:47:13 +01:00
Miriam Baglioni
7827a2d66b
[OCNEW] added creation of the actionset for the results classified with FoS based on the OpenAIRE identifier
2024-03-07 10:36:30 +01:00
Antonis Lempesis
316d585c8a
using distinct apcs per publication to avoid huge sums
2024-03-07 02:07:59 +02:00
Miriam Baglioni
fd34372c40
[OCNEW] first implementation
2024-03-06 13:42:00 +01:00
Sandro La Bruzzo
d34cef3f8d
Merge remote-tracking branch 'origin/beta' into doidoost_dismiss
2024-03-05 11:45:31 +01:00
Sandro La Bruzzo
3b837d38ce
added oozie workflow
2024-03-05 11:44:59 +01:00
Sandro La Bruzzo
f417515e43
Implemented class that generates a normalized table of MAG, which is the starting point for the creation of the mag source
2024-03-04 17:15:13 +01:00
Sandro La Bruzzo
ad0e9aa80c
added first part of refactoring of the code generating MAG,
...
making it more readable using Spark SQL queries
2024-02-29 18:16:15 +01:00
Sandro La Bruzzo
9d94648f3b
code formatted
2024-02-29 18:15:20 +01:00
Giambattista Bloisi
3cd5590f3b
When converting json to XML, remove characters that are not allowed in the XML 1.0 specs, as they will cause xpath failures even if escaped
2024-02-28 15:14:18 +01:00
Giambattista Bloisi
56dd05f85c
Merge pull request 'Revised procedure when converting json data into xml' (#395) from restiterator_xmlcleanup into beta
...
Reviewed-on: D-Net/dnet-hadoop#395
2024-02-28 10:38:54 +01:00
Claudio Atzori
6fcf872daa
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into index_records
2024-02-28 10:27:28 +01:00
Claudio Atzori
3f07390a58
WIP
2024-02-28 10:10:10 +01:00
Sandro La Bruzzo
7d806a434c
formatted code
2024-02-28 09:31:58 +01:00
Sandro La Bruzzo
b63994dcc4
Merge remote-tracking branch 'origin/beta' into orcid_update
2024-02-28 09:11:18 +01:00
Sandro La Bruzzo
915a76a796
following the comment on the pull requests:
...
- Added #NUM_OF_THREADS "complete" jobs to the queue at the end of the main loop to avoid deadlock
2024-02-28 09:10:55 +01:00
Giambattista Bloisi
773e856550
Revised procedure when converting json data into xml:
...
- json object keys are renamed to be conformant to xml tag elements, special characters are substituted or removed
- json string values are no longer post-processed as they are already escaped by the org.json.XML.toString method
2024-02-24 16:54:30 +01:00
Sandro La Bruzzo
a712df1e1d
Merge remote-tracking branch 'origin/beta' into orcid_update
2024-02-23 10:12:25 +01:00
Sandro La Bruzzo
b32a9d1994
Implemented workflow for updating the table, added a step to check whether the newly generated table is valid
2024-02-23 10:04:28 +01:00