dump of the results related to at least one project #61
Reference: D-Net/dnet-hadoop#61
This PR is related to the dump of results linked to funders. Following the model of the ResearchCommunity result dump, it dumps the results funded by at least one project, splits them by funder, and associates to each funder the set of results it has funded.
Main modifications:
the semantics used to extract the link to projects has changed: "isProducedBy" is used instead of "produces", because relationships with relType "outcome" are not guaranteed to be bidirectional
the type of dump to be made on the result has changed from a boolean to one of {"complete", "community", "funder"}
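The boolean-to-enum change described above could look like the following sketch; the `DumpType` name and the parsing helper are illustrative only, not the actual class names in the PR:

```java
// Illustrative sketch only: replaces a two-valued boolean flag
// ("complete dump or not") with the three dump variants mentioned
// in the PR description. Real class/parameter names may differ.
public class DumpTypeSketch {

    public enum DumpType {
        COMPLETE, COMMUNITY, FUNDER;

        // Parse the workflow parameter, e.g. "funder" -> FUNDER
        public static DumpType fromString(String s) {
            return DumpType.valueOf(s.trim().toUpperCase());
        }
    }

    public static void main(String[] args) {
        // usage: branch on the requested dump variant
        DumpType type = DumpType.fromString("funder");
        System.out.println(type);
    }
}
```

An enum makes the third variant impossible to miss at the call sites, whereas a boolean would silently conflate "community" and "funder".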
@ -93,3 +93,1 @@
if (name.trim().equalsIgnoreCase("communities_infrastructures")) {
name = "communities_infrastructures.json";
}
// if (name.trim().equalsIgnoreCase("communities_infrastructures")) {
please clean up unused code
done
@ -26,6 +26,8 @@ public class Constants {
public static String ORCID = "orcid";
public static String RESULT_PROJECT_IS_PRODUCED_BY = "isProducedBy";
Please avoid duplicating the constants. I think you can refer to https://code-repo.d4science.org/D-Net/dnet-hadoop/src/branch/master/dhp-schemas/src/main/java/eu/dnetlib/dhp/schema/common/ModelConstants.java#L54
ModelConstants does not contain ORCID. I can add it there instead. Ok for IS_PRODUCED_BY
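The reviewer's point is to reference one shared constants class rather than re-declaring the same strings per module. A minimal sketch, assuming a shared constants class (the names below mirror the discussion but are not verified against the actual ModelConstants file):

```java
// Sketch: centralizing shared vocabulary terms in one final class
// instead of duplicating them in each module's Constants class.
// The exact field names in ModelConstants may differ.
public final class SharedConstants {

    private SharedConstants() {
        // no instances: constants holder only
    }

    public static final String IS_PRODUCED_BY = "isProducedBy";
    public static final String ORCID = "orcid"; // added here per the review reply
}
```

Call sites would then use `SharedConstants.IS_PRODUCED_BY` instead of a local copy, so a vocabulary change needs to be made in exactly one place.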
@ -0,0 +274,4 @@
--conf spark.sql.warehouse.dir=${sparkSqlWarehouseDir}
</spark-opts>
<arg>--sourcePath</arg><arg>${workingDir}/result/publication</arg>
<!-- <arg>--sourcePath</arg><arg>${sourcePath}/publication</arg>-->
cleanup commented definitions, please
done
@ -0,0 +302,4 @@
--conf spark.sql.warehouse.dir=${sparkSqlWarehouseDir}
</spark-opts>
<arg>--sourcePath</arg><arg>${workingDir}/result/dataset</arg>
<!-- <arg>--sourcePath</arg><arg>${sourcePath}/dataset</arg>-->
cleanup comments, please
@ -0,0 +330,4 @@
--conf spark.sql.warehouse.dir=${sparkSqlWarehouseDir}
</spark-opts>
<arg>--sourcePath</arg><arg>${workingDir}/result/otherresearchproduct</arg>
<!-- <arg>--sourcePath</arg><arg>${sourcePath}/otherresearchproduct</arg>-->
yet another comment to clean up
@ -0,0 +358,4 @@
--conf spark.sql.warehouse.dir=${sparkSqlWarehouseDir}
</spark-opts>
<arg>--sourcePath</arg><arg>${workingDir}/result/software</arg>
<!-- <arg>--sourcePath</arg><arg>${sourcePath}/software</arg>-->
remove, please!
@ -0,0 +532,4 @@
<main-class>eu.dnetlib.dhp.oa.graph.dump.MakeTar</main-class>
<arg>--hdfsPath</arg><arg>${outputPath}</arg>
<arg>--nameNode</arg><arg>${nameNode}</arg>
<!-- <arg>--sourcePath</arg><arg>${workingDir}/resultperfunder</arg>-->
cleanup, please
@ -126,0 +164,4 @@
final Consumer<ContextInfo> consumer = ci -> cInfoList.add(ci);
queryInformationSystem.getContextInformation(consumer);
//List<ResearchInitiative> riList = new ArrayList<>();
cleanup
done
@ -126,0 +169,4 @@
try {
writer.write(new Gson().toJson(Process.getEntity(cInfo)));
} catch (IOException e) {
e.printStackTrace();
why should an exception raised here not interrupt the execution?
Because it is inside a lambda expression. Anyway, we can remove this test; it was only needed to verify that the file was written compressed.
I have kept the test and removed the lambda.
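The underlying issue in this exchange is that `java.util.function.Consumer` cannot throw checked exceptions, so an `IOException` inside the lambda must either be swallowed (as in the reviewed code's `printStackTrace`) or wrapped in an unchecked exception so it still interrupts the execution. A sketch of the wrapping approach, with illustrative names:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Consumer;

// Sketch: propagating an IOException out of a Consumer lambda by
// wrapping it in UncheckedIOException, instead of printStackTrace().
// The write() helper is a stand-in for the real writer in the PR.
public class LambdaIoSketch {

    static void write(String record) throws IOException {
        if (record == null) {
            throw new IOException("nothing to write");
        }
        // real code would append the record to an output stream
    }

    public static void processAll(List<String> records) {
        Consumer<String> consumer = r -> {
            try {
                write(r);
            } catch (IOException e) {
                // rethrow unchecked so the failure surfaces to the caller
                throw new UncheckedIOException(e);
            }
        };
        records.forEach(consumer);
    }
}
```

With this pattern a failed write aborts the whole loop, which is usually what a batch job wants; silently logging the stack trace would let the job finish with an incomplete output file.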
@ -31,2 +31,3 @@
"paramRequired": true
}
},{
"paramName":"dt",
you can probably indent this json record in a more uniform way
@ -0,0 +23,4 @@
"paramDescription": "the name of the result table we are currently working on",
"paramRequired": true
}, {
"paramName":"rp",
you can probably indent this json record in a more uniform way
Overall, it looks pretty good; just some minor changes requested.
The requested changes are done. There is also another change to check: two classes have been added to common to allow the mapping of the doiBoost result to the public format.
changed title from "WIP: dump of the results related to at least one project" to "dump of the results related to at least one project"