miconis
f446580e9f
code refactoring (useless classes and wf removed), implementation of the test for the openorgs dedup
2021-03-29 16:10:46 +02:00
miconis
2355cc4e9b
minor changes and bug fix
2021-03-29 10:07:12 +02:00
miconis
28c1cdd132
merged stable_ids into openorgswf
2021-03-25 10:44:49 +01:00
miconis
5dfb66b0fa
minor changes
2021-03-25 10:29:34 +01:00
miconis
348b0ef921
bug fix, implementation of the workflow for the creation of raw_organizations (openorgs dedup), addition of the pid lists to the openorgs postgres db
2021-03-24 15:51:27 +01:00
miconis
0fe40b08e4
addition of deduplication profiles for the results, double check on pids and the title with a lower threshold
2021-03-19 17:12:05 +01:00
miconis
98854b0124
minor changes
2021-03-19 16:57:40 +01:00
Claudio Atzori
5a043e95ea
code formatting
2021-03-19 11:37:27 +01:00
Claudio Atzori
a4e82a65aa
integrated filter applied when merging BETA & PROD graphs to rule our records from Datacite
2021-03-19 11:34:44 +01:00
Claudio Atzori
3256b9c836
code formatting
2021-03-19 09:36:12 +01:00
Claudio Atzori
75144dacb3
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-03-19 09:07:40 +01:00
Claudio Atzori
9588bfba81
[cleaning] entries avaialbe as PIDs must not appear as alternateIdentifier
2021-03-19 09:07:30 +01:00
Claudio Atzori
972d5a3d98
[dedup] Datacite should be authoritative for datasets
2021-03-19 09:04:20 +01:00
Sandro La Bruzzo
25d5663d97
added filter
2021-03-18 10:24:42 +01:00
Sandro La Bruzzo
5f98ea74a9
Added fix for pid generation in stableIds
2021-03-17 15:53:24 +01:00
Sandro La Bruzzo
b4805b989d
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-03-17 15:14:59 +01:00
Claudio Atzori
734232d3b9
identifier factory doesn't depend on pre-existing entity.id
2021-03-17 15:14:53 +01:00
Sandro La Bruzzo
76b10090fc
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-03-17 15:06:46 +01:00
Claudio Atzori
a3dac32f16
pidFilter a bit more permissive
2021-03-17 15:06:05 +01:00
Sandro La Bruzzo
2be0428047
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-03-17 14:54:28 +01:00
Claudio Atzori
8257f9a2bc
result.pid: adjusted the mapping applied to the contents from the aggregator
2021-03-17 12:45:38 +01:00
Sandro La Bruzzo
7c97a4d900
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-03-17 12:13:03 +01:00
Sandro La Bruzzo
cc5bbafa5d
some fix to make workflows runs
2021-03-17 12:12:56 +01:00
Claudio Atzori
3b2da86f0a
added precondition on IdentifierFactory to check the presence of entity.id
2021-03-16 17:05:38 +01:00
Claudio Atzori
640b885706
added instance.alternativeIdentifiers to the graph model, adjusted the mapping applied to the contents from the aggregator
2021-03-16 14:19:32 +01:00
Claudio Atzori
9cac6da9bd
added AccessRight and OpenAccessRoute classes to ModelSupport.getOafModelClasses()
2021-03-12 16:31:17 +01:00
Claudio Atzori
d3cb923f24
introduced java8-based date parsing
2021-03-11 12:37:33 +01:00
Sandro La Bruzzo
4bb3bcafa5
add author sequence number
2021-03-11 11:32:32 +01:00
Sandro La Bruzzo
a8e5d0ea0d
updated test and fixed assign of access right
2021-03-11 10:41:24 +01:00
Sandro La Bruzzo
f5e7c57654
Fixed ticket 6282
2021-03-11 10:32:45 +01:00
Claudio Atzori
f74e464942
create bestaccessright as Qualifier
2021-03-10 15:40:05 +01:00
Claudio Atzori
c801ab6c1d
minor
2021-03-09 17:22:31 +01:00
Claudio Atzori
9917d7e01c
PID authorities include ArXiv
2021-03-09 17:12:52 +01:00
Claudio Atzori
01630f638d
IdentifierFactory implementation based on the list of datasources authoritative for a given pid type
2021-03-09 17:11:50 +01:00
Claudio Atzori
b3f3b895e5
[ #6282 open access status in the Graph] OAStatus renamed as openAccessRoute
2021-03-09 11:41:11 +01:00
Claudio Atzori
765f9bdee7
merged from dhp_oaf_model
2021-03-09 11:37:41 +01:00
Claudio Atzori
59532b0919
[ #6281 Provenance of product PIDs] Added PIDs to the Instance type; extended mapping for OAF/ODF records
2021-03-09 11:14:45 +01:00
Claudio Atzori
d525785497
[ #6282 open access status in the Graph] Result.Instance.accessRight defined with dedicated data type that includes the open access color.
2021-03-09 11:12:55 +01:00
Sandro La Bruzzo
bbe1a7c69a
[ #6281 Provenance of product PIDs] Added PIDs to the Instance type in Scholexplorer Export
2021-03-09 10:46:36 +01:00
Sandro La Bruzzo
a2169ccf07
// implemented Ticket #6281 added pid to Instance in doiBoost
2021-03-09 10:46:36 +01:00
Claudio Atzori
f468c7f0d7
merged from master
2021-03-09 09:12:41 +01:00
Claudio Atzori
76441f4edd
code formatting
2021-03-09 09:12:26 +01:00
Claudio Atzori
8d2bb24512
merged from master
2021-03-08 15:44:34 +01:00
Claudio Atzori
e8789b0cdb
Merge pull request 'stats DB for monitor' ( #99 ) from antonis.lempesis/dnet-hadoop:master into master
...
Looks good to me, just a note on the parsing of the citations: since the last version, IIS produces citations as proper relationships among results. This is what we got already in the BETA graph
```
count r.reltype r.subreltype r.relclass
62.129.254 resultResult citation cites
62.043.309 resultResult citation isCitedBy
```
Thus, I suggest to move away from the current property based implementation for the extraction of the citation links and start relying on the relationships instead.
2021-03-03 10:29:09 +01:00
Antonis Lempesis
27796343ca
crude sleep. hardcoded value
2021-03-03 01:37:47 +02:00
miconis
1a85020572
bug fix in graph-mapper, changes in the implementation of the openorgs wf to create relations and populate openorgs db
2021-02-26 10:19:28 +01:00
Antonis Lempesis
d90767c733
correctly invalidating metadata
2021-02-19 03:18:47 +02:00
Antonis Lempesis
3681afbe04
typo
2021-02-19 03:04:27 +02:00
Antonis Lempesis
c5502eba8f
actually moved stats computation in impala instead of hive...
2021-02-19 02:54:39 +02:00
Antonis Lempesis
33c85d4e66
moved stats computation in impala instead of hive
2021-02-18 17:23:34 +02:00