Commit Graph

948 Commits

Author SHA1 Message Date
Claudio Atzori c42850328e fixed semantic (subreltype) for ServiceOrganization relations 2022-09-22 16:23:25 +02:00
Claudio Atzori 26e1badded added instance.url syntactical validation, avoid creating multiple duplicated URLs 2022-09-19 11:19:10 +02:00
Claudio Atzori 192215a18e merged from branch discard-non-wellformed 2022-09-19 10:17:10 +02:00
Claudio Atzori e370e940d8 [aggregator graph] save invalid records aside for further inspection 2022-09-16 14:06:28 +02:00
Claudio Atzori 1e42d984e1 [aggregator graph] save invalid records aside for further inspection 2022-09-15 10:49:42 +02:00
Claudio Atzori c48f6e9c57 [aggregator graph] save invalid records aside for further inspection 2022-09-14 17:11:26 +02:00
Claudio Atzori a0919ed495 [aggregator graph] save invalid records aside for further inspection 2022-09-14 13:27:39 +02:00
Alessia Bardi b99a011345 return empty Oaf list if record cannot be parsed 2022-09-13 11:51:55 +02:00
Alessia Bardi 27af5122d2 logs for non well formed XML files 2022-09-12 14:25:23 +02:00
Claudio Atzori ff6f789b6d code formatting 2022-09-09 15:16:31 +02:00
Claudio Atzori b5f7bd30be Merge branch 'beta' into clean_subjects 2022-09-09 12:20:04 +02:00
Alessia Bardi a539c6ccaf https for handle URLs 2022-09-09 12:16:28 +02:00
Alessia Bardi 9ef063d502 #7861#note-8 instance url from handle 2022-09-07 17:29:54 +03:00
Claudio Atzori adb526b0e1 Merge branch 'beta' into clean_subjects 2022-08-12 10:51:17 +02:00
Claudio Atzori cb7c07c54e [scholix] added step to create tar archive 2022-08-11 11:25:24 +02:00
Claudio Atzori 2aa16d0432 [scholix] fixed OpenCitation dump procedure 2022-08-10 17:39:29 +02:00
Claudio Atzori 51ad93e545 [scholix] fixed OpenCitation dump procedure 2022-08-10 11:57:56 +02:00
Claudio Atzori 3418ce50ac cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary 2022-08-08 12:48:47 +02:00
Claudio Atzori 4eaa063b1f cleaning of subjects 2022-08-05 16:56:09 +02:00
Claudio Atzori 32cee1f619 WIP: cleaning of subjects 2022-08-05 12:32:08 +02:00
Claudio Atzori 6c0fd9284b merge from beta 2022-08-05 10:42:53 +02:00
Claudio Atzori b78889a0ce WIP: cleaning of subjects 2022-08-05 09:11:37 +02:00
Miriam Baglioni a7a18d7630 [Graph Dump] removed code for the dump from the project. Fixed issues in tests when possible 2022-08-04 17:40:40 +02:00
Claudio Atzori 27a91841e7 WIP: cleaning of subjects 2022-08-04 11:39:39 +02:00
Claudio Atzori 1dd1e4fe3a extended test for mapping project_organization relations 2022-07-28 11:27:08 +02:00
Claudio Atzori 09ccc7b472 Merge branch 'beta' into project_organization_contribution 2022-07-28 09:49:59 +02:00
Sandro La Bruzzo ddc414b258 fixed wrong json param 2022-07-22 09:43:15 +02:00
Sandro La Bruzzo 5f651f2316 changed filter relation on SubRelType 2022-07-21 10:11:48 +02:00
Sandro La Bruzzo 5b76321d9c implemented oozie workflow to generate scholix dump filtering relclass semantic 2022-07-20 16:34:32 +02:00
Claudio Atzori 1138b2ac8e code formatting 2022-07-19 14:15:49 +02:00
Claudio Atzori 0c1cfee396 mapping oaf:fulltext elements in the result.fulltext field 2022-07-11 17:34:59 +02:00
Claudio Atzori 0cb1c70788 code formatting 2022-07-01 10:44:08 +02:00
Claudio Atzori 4ec13e2b66 Merge branch 'master' into dump_new_funded_products 2022-07-01 10:30:28 +02:00
Claudio Atzori 7da24c1dec added more logging 2022-06-28 13:47:49 +02:00
Miriam Baglioni 71744a1f52 [DUMP DELTA PROJECTS] refactoring 2022-06-27 18:07:58 +02:00
Claudio Atzori a8773af0cb Merge branch 'beta' into project_organization_contribution 2022-06-27 09:37:40 +02:00
Claudio Atzori 5130eac247 mapping by participant project contribution 2022-06-24 17:16:42 +02:00
Miriam Baglioni f561f13dd9 [Funder Products Dump] fixed names of parameters in workflow 2022-06-21 18:18:17 +02:00
Miriam Baglioni b98f904d48 [Funder Products Dump] new way to avoid using hive 2022-06-21 17:52:27 +02:00
Miriam Baglioni 7423577a08 [Graph DUMP] add code to produce the delta of new projects with respect to the previous delta/dump 2022-06-21 14:51:38 +02:00
Claudio Atzori b295a40d9c restored use of name_particles when parsing author names 2022-06-16 12:20:43 +02:00
Claudio Atzori 4c8e820ff0 mapping relationship from transformed records based on oaf:relation 2022-06-14 08:49:02 +02:00
Claudio Atzori 116902c028 mapping relationship from transformed records based on oaf:relation 2022-06-13 14:31:48 +02:00
Claudio Atzori 52cb086506 [graph grouping] drop relation target path before copying from source 2022-05-16 12:08:36 +02:00
Claudio Atzori 997c50078e [graph grouping] drop relation target path before copying from source 2022-05-16 12:07:40 +02:00
Claudio Atzori 6031acb2e3 [openorgs] fixed parent/child query, using the correct semantic labels 2022-05-16 09:20:48 +02:00
Claudio Atzori 0dc33ea391 [openorgs] fixed parent/child query, using the correct semantic labels 2022-05-16 09:20:30 +02:00
Miriam Baglioni e4eac1d20b [EOSC TAG] added code to remove EOSC Jupyter Notebook from subjects and put EOSC as classid in the qualifier 2022-05-13 11:01:33 +02:00
Sandro La Bruzzo 22f65680b9 Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2022-05-11 15:30:12 +02:00
Sandro La Bruzzo ca8d26bcb4 added better filter for openCitations 2022-05-11 15:29:57 +02:00