Sandro La Bruzzo
c8ddb527b3
Added unit test to verify the generation, in the OriginalID, of the old OpenAIRE identifier generated by OAI
2022-07-20 15:33:37 +02:00
Sandro La Bruzzo
27fbc9b385
[DHP Schema refactoring]
...
- move the business logic from the model class to dhp-common
2022-07-04 18:22:38 +02:00
Sandro La Bruzzo
e517f52e30
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into doiboost_refactor
2022-06-27 16:24:04 +02:00
Sandro La Bruzzo
8b9f70d977
[DOIBoost Refactor]
...
- Testing:
Added common method to retrieve mock vocabulary in test class
Fixed test
- Mapping Crossref:
Using the IS vocabulary to map the Crossref type instead of a map in the code
2022-06-27 16:23:28 +02:00
Claudio Atzori
b295a40d9c
restored use of name_particles when parsing author names
2022-06-16 12:20:43 +02:00
Claudio Atzori
da611cfbbd
[eosc_services] resolved merge conflicts
2022-05-03 13:37:15 +02:00
Claudio Atzori
f5f532d134
EOSC Services - ongoing update
2022-04-29 12:25:24 +02:00
Miriam Baglioni
b61efd613b
[Measures] addressed comments in the PR
2022-04-21 12:09:37 +02:00
Miriam Baglioni
c304657d91
[Measures] put the logic in common, no need to change the schema
2022-04-21 11:27:26 +02:00
Miriam Baglioni
b7c2340952
[HostedByMap - DOIBoost] changed to use the code moved to common, since it is now also used by hostedbymap
2022-03-04 11:05:23 +01:00
Claudio Atzori
db299dd8ab
fixed typo
2022-01-27 16:24:06 +01:00
Claudio Atzori
c42623f006
added NPE checks
2022-01-21 14:30:09 +01:00
Claudio Atzori
391aa1373b
added unit test
2022-01-19 17:13:21 +01:00
Claudio Atzori
62f135262e
code formatting
2022-01-19 12:30:52 +01:00
Claudio Atzori
44a937f4ed
factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources
2022-01-19 12:24:52 +01:00
Miriam Baglioni
42e8f76778
[GraphCleaning] changed the return value of the filtering function to avoid losing the APC entities
2022-01-13 16:06:43 +01:00
Miriam Baglioni
56409d1281
[Dump] resolved conflicts with beta and merging
2021-12-14 15:03:45 +01:00
Miriam Baglioni
a3592b463a
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2021-12-14 14:58:26 +01:00
Claudio Atzori
aff3ddc8d2
added cleaning for the format field, removing carriage return and tab characters
2021-12-14 11:41:46 +01:00
Miriam Baglioni
936578aaf1
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
2021-12-13 15:01:47 +01:00
Claudio Atzori
41c70c607d
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
2021-12-09 16:44:28 +01:00
Claudio Atzori
e6e177dda0
vocabulary-based cleaning also considers the term label when looking up a synonym
2021-12-09 13:57:53 +01:00
Miriam Baglioni
b113586207
resolved conflicts
2021-12-07 10:16:14 +01:00
Miriam Baglioni
96a7d46278
[Graph Dump] fixed tests
2021-12-06 15:06:32 +01:00
Sandro La Bruzzo
81bf604059
[scala-refactor] Module dhp-common:
...
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 11:29:24 +01:00
Claudio Atzori
863a2f9db3
avoid filtering OAF records defined as invisible = true
2021-12-03 09:08:12 +01:00
Miriam Baglioni
8905a39bf3
merging with branch beta
2021-12-02 13:17:29 +01:00
Sandro La Bruzzo
1e1f5e4fe0
minor fix
2021-11-25 13:03:17 +01:00
Sandro La Bruzzo
2164a2a889
Datacite: code refactor, created a general Scala SparkApplication from which all the Spark Scala applications have to inherit
...
Added some comments to the Datacite transformation code
2021-11-25 10:54:13 +01:00
Miriam Baglioni
9fae872181
[Graph Dump] changed to mirror the changes in the model
2021-11-19 11:25:50 +01:00
Claudio Atzori
82a4e4efae
[cleaning wf] fixed methodology to rule out invalid result titles, based on https://support.openaire.eu/issues/7206
2021-11-17 14:17:22 +01:00
Claudio Atzori
49f897ef29
[cleaning wf] fixed regex used to spot garbage in result titles; adjusted threshold for filtering titles
2021-11-16 15:24:23 +01:00
Sandro La Bruzzo
aafdffa6b3
resolved conflict
2021-10-26 09:45:46 +02:00
Sandro La Bruzzo
034304b33a
conflict resolved on merge
2021-10-26 09:40:47 +02:00
Claudio Atzori
6b34ba737e
minor
2021-10-21 14:16:18 +02:00
Sandro La Bruzzo
ae4e99a471
Adapted the PID resolution workflow to work within the OpenAIRE data workflow
...
- Added relations in both directions on all Scholexplorer datasources
2021-10-20 17:12:16 +02:00
Miriam Baglioni
c8321ad31a
merge with branch beta
2021-10-01 12:59:08 +02:00
Claudio Atzori
663b1556d7
manually integrating PR#140
2021-09-15 16:40:25 +02:00
Claudio Atzori
3359f73fcf
cleanup & best practices
2021-08-13 12:00:42 +02:00
Miriam Baglioni
6e84b3951f
GetCSV refactoring - moving to dhp-common the classes that have a dependency with the GetCSV class (which was located in graph-mapper)
2021-08-12 17:57:41 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Miriam Baglioni
6bd1eca7e0
merge branch with beta
2021-08-05 15:23:32 +02:00
Miriam Baglioni
ee13da9258
merge branch with master
2021-08-05 11:34:20 +02:00
Claudio Atzori
a9961a1835
[cleaning] title cleaning based on the me.xuender:unidecode library
2021-07-28 16:36:33 +02:00
Claudio Atzori
6dddad86ee
[cleaning] title cleaning based on the me.xuender:unidecode library
2021-07-28 16:21:29 +02:00
Miriam Baglioni
35e395eae8
merge with master
2021-07-27 12:34:59 +02:00
Claudio Atzori
bc835d2024
[cleaning] fixed filtering function for missing titles
2021-07-23 11:56:13 +02:00
Claudio Atzori
ffdb2a3ea3
[cleaning] fixed filtering function for missing titles
2021-07-23 11:55:55 +02:00
Sandro La Bruzzo
62ae36a3d2
fixed NPE
2021-07-22 15:41:38 +02:00
Sandro La Bruzzo
d94565862a
fixed NPE
2021-07-21 21:23:11 +02:00