[ENRICHMENT][PROD] Use of community API in enrichment process AND addition to tagging result for communities through projects #361
Closed
claudio.atzori wants to merge 0 commits from propagationapi into master
Reference: D-Net/dnet-hadoop#361
This PR replaces the use of the IS with the community APIs in the bulktagging and resulttocommunityfromorganization modules. The code is also refactored.
It also adds a new module, resulttocommunityfromproject, which associates a result with a community if the result is linked to a project relevant for that community. This will replace the IIS document_referencedProjects in the graph processing. We need to add a new step to the workflow (it can be added at any point in the enrichment pipeline). The parameters are:
sourcePath = the source path
outputPath = the output path
production = true/false. If true, the step queries the production community APIs; if false, the beta ones.
Note: the production parameter must also be added to the bulktag and resulttocommunityfromorganization steps, with the same semantics.
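To make the intended behaviour concrete, here is a minimal, language-neutral sketch in Python of what the new step is described as doing. It is not the code in this branch: the function names, the input shapes, and the two base URLs are hypothetical placeholders chosen for illustration, and the real module presumably runs as a Spark job over the graph at sourcePath.

```python
from collections import defaultdict

# Hypothetical base URLs: the real community API endpoints are not given in this PR.
COMMUNITY_API_BASE = {
    True: "https://production.example.org/community-api/",
    False: "https://beta.example.org/community-api/",
}


def community_api_base(production: bool) -> str:
    """Pick the API to query from the `production` parameter (true = production, false = beta)."""
    return COMMUNITY_API_BASE[production]


def results_to_communities(result_project_links, project_communities):
    """Associate each result with the communities of the projects it is linked to.

    result_project_links: iterable of (result_id, project_id) pairs (assumed shape).
    project_communities: dict mapping project_id -> set of community ids, as it
        might be built from the community API responses (assumed shape).
    Returns a dict result_id -> set of community ids; results whose projects are
    not relevant for any community are left out.
    """
    tagged = defaultdict(set)
    for result_id, project_id in result_project_links:
        tagged[result_id] |= set(project_communities.get(project_id, ()))
    return {result_id: comms for result_id, comms in tagged.items() if comms}
```

In this sketch a result linked to several projects collects the union of their communities, which is one plausible reading of "linked to a project relevant for the community"; the actual module may differ.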
It turned out that the branch propagationapi includes more changes that should not land on master yet.
Pull request closed