Giambattista Bloisi
664a381d31
Unify merge logic of entities in MergeUtils.class
2024-03-18 16:04:49 +01:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Sandro La Bruzzo
2b9a20a4a3
Changed the way Scholexplorer filters relationships: filtering out all relations coming from OpenCitation is wrong, because we lose many relations that intersect OpenCitation but do not come only from there
2022-10-24 12:53:47 +02:00
Claudio Atzori
2aa16d0432
[scholix] fixed OpenCitation dump procedure
2022-08-10 17:39:29 +02:00
Claudio Atzori
51ad93e545
[scholix] fixed OpenCitation dump procedure
2022-08-10 11:57:56 +02:00
Sandro La Bruzzo
5f651f2316
changed filter relation on SubRelType
2022-07-21 10:11:48 +02:00
Sandro La Bruzzo
5b76321d9c
implemented Oozie workflow to generate the Scholix dump, filtering on relClass semantics
2022-07-20 16:34:32 +02:00
Miriam Baglioni
e4eac1d20b
[EOSC TAG] added code to remove EOSC Jupyter Notebook from subjects and put EOSC as classid in the qualifier
2022-05-13 11:01:33 +02:00
Sandro La Bruzzo
ca8d26bcb4
added better filter for openCitations
2022-05-11 15:29:57 +02:00
Sandro La Bruzzo
57e2c4b749
formatted code
2022-01-12 09:40:28 +01:00
Claudio Atzori
4f212652ca
scalafmt: code formatting
2022-01-11 16:57:48 +01:00
Sandro La Bruzzo
3920d68992
Fixed workflow generation of delta in datacite
2021-12-21 11:41:49 +01:00
Sandro La Bruzzo
b881ee5ef8
[scholexplorer]
- implemented generation of scholix of delta update of datacite
2021-12-15 11:25:32 +01:00
Sandro La Bruzzo
63952018c0
[scholexplorer]
- moved SparkRetrieveDataciteDelta into the scala folder
2021-12-15 11:25:32 +01:00
Sandro La Bruzzo
e5bff64f2e
[scholexplorer]
- Minor fix on SparkConvertRDDtoDataset
- first implementation of retrieve datacite dump
2021-12-15 11:25:32 +01:00
Sandro La Bruzzo
bf880e2508
[scala-refactor] Module dhp-graph-mapper:
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 13:57:41 +01:00