Beta stats wf updated #332

Merged
claudio.atzori merged 7 commits from antonis.lempesis/dnet-hadoop:beta into beta 2023-10-10 09:35:33 +02:00

Changes in indicators step, monitor step:

  1. Table graduatedoctorates for observatory
  2. Table result_apc_affiliations
  3. New indicators
    • indi_is_funder_plan_s
    • indi_funder_fairness
    • indi_ris_fairness
    • indi_funder_openess
    • indi_ris_openess
    • indi_funder_findable
    • indi_ris_findable
    • indi_is_project_result_after
  4. Casting year to int for composite indicators
  5. New institutions
    • Universidade Católica Portuguesa
    • Iscte - Instituto Universitário de Lisboa
    • Munster Technological University
    • Cardiff University
    • Leibniz Institute of Ecological Urban and Regional Development
  6. New stats-monitor-update wf
    • executed when a new org needs to be added to monitor DBs
    • Parameters:
      • stats_db_name (e.g. openaire_beta_stats_20230822)
      • monitor_db_name (e.g. openaire_beta_stats_monitor_upd_20230822)
      • monitor_db_prod_name (e.g. openaire_beta_stats_monitor)
      • monitor_db_shadow_name (e.g. openaire_beta_stats_monitor_shadow)
      • hive_timeout=150000
      • hadoop_user_name (e.g. dnet.beta)
      • resumeFrom
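
For illustration only, the parameter list above could be turned into key=value lines along these lines. The parameter names and example values come from this description; the properties-style output, the helper name `render_properties`, and the split into required and optional parameters (everything except resumeFrom treated as required) are assumptions of this sketch, since the PR does not state which parameters are mandatory.

```python
# Parameter names and example values are copied from the PR description.
# ASSUMPTION: every parameter except resumeFrom is treated as required;
# the PR itself does not say which ones are mandatory.
EXAMPLE_PARAMS = {
    "stats_db_name": "openaire_beta_stats_20230822",
    "monitor_db_name": "openaire_beta_stats_monitor_upd_20230822",
    "monitor_db_prod_name": "openaire_beta_stats_monitor",
    "monitor_db_shadow_name": "openaire_beta_stats_monitor_shadow",
    "hive_timeout": "150000",
    "hadoop_user_name": "dnet.beta",
}
ASSUMED_OPTIONAL = {"resumeFrom"}


def render_properties(params):
    """Return key=value lines, failing early on unknown or missing keys."""
    known = set(EXAMPLE_PARAMS) | ASSUMED_OPTIONAL
    unknown = sorted(set(params) - known)
    if unknown:
        raise ValueError(f"unknown parameter(s): {unknown}")
    missing = sorted(k for k in EXAMPLE_PARAMS if not params.get(k))
    if missing:
        raise ValueError(f"missing parameter(s): {missing}")
    return "\n".join(f"{key}={value}" for key, value in params.items())


if __name__ == "__main__":
    print(render_properties(EXAMPLE_PARAMS))
```

Checking the names before submission is meant to catch a misspelled or forgotten database name early, rather than partway through a run against the monitor DBs.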
dimitris.pierrakos added 3 commits 2023-09-01 10:34:35 +02:00
163b2ee2a8 Changes
1. Monitor updates
2. Bug fixes during copy to Impala cluster
964c2f553e Changes in indicators step, monitor step
- graduatedoctorates for observatory
- result_apc_affiliations table
- new indicators
	indi_is_funder_plan_s
	indi_funder_fairness
	indi_ris_fairness
	indi_funder_openess
	indi_ris_openess
	indi_funder_findable
	indi_ris_findable
	indi_is_project_result_after
- cast year to int in composite indicators
- new institutions
     -- Universidade Católica Portuguesa
     -- Iscte - Instituto Universitário de Lisboa
     -- Munster Technological University
     -- Cardiff University
     -- Leibniz Institute of Ecological Urban and Regional Development
dimitris.pierrakos added 1 commit 2023-09-06 13:14:45 +02:00
dimitris.pierrakos added 1 commit 2023-09-19 13:25:47 +02:00
9ef971a146 Update step16-createIndicatorsTables.sql
Fix int year for:
indi_org_openess_year
indi_org_fairness_year
indi_org_findable_year
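
As a rough illustration of the kind of change described here (not the actual contents of step16-createIndicatorsTables.sql), the sketch below casts a string year column to an integer while aggregating. The table `result_years` and its columns are hypothetical, and SQLite is used only as a stand-in so the example runs anywhere; the real scripts target Hive/Impala.

```python
import sqlite3

# Hypothetical table and columns; SQLite stands in for Hive/Impala here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result_years (id TEXT, year TEXT)")
conn.executemany(
    "INSERT INTO result_years VALUES (?, ?)",
    [("r1", "2021"), ("r2", "2021"), ("r3", "2022")],
)


def results_per_year(connection):
    """Count results per year, exposing the year as an integer column."""
    rows = connection.execute(
        "SELECT CAST(year AS INTEGER) AS year, COUNT(*) AS total "
        "FROM result_years "
        "GROUP BY CAST(year AS INTEGER) "
        "ORDER BY 1"
    )
    return rows.fetchall()


print(results_per_year(conn))  # [(2021, 2), (2022, 1)]
```

With the cast in place the year comes back as a number rather than a string, so downstream consumers can sort and compare it numerically.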

Looking at the changeset I notice a new workflow pops up: dhp-workflows/dhp-stats-monitor-update.

I assume it needs to be run in the context of the graph provisioning pipeline, but I haven't seen any operational requirement/specification about it yet. In case I must start to operate it, as a follow-up action I need to include it in the automatic workflow deployment procedures.

So, can you please update the PR description? Please include

  • when it needs to be run (e.g. right after the stats db update and before the stats publishing workflow)
  • the set of mandatory / optional parameters
dimitris.pierrakos added 1 commit 2023-10-09 13:00:57 +02:00
489a082f04 Update step16-createIndicatorsTables.sql
Change scripts for gold, hybrid, bronze indicators
dimitris.pierrakos added 1 commit 2023-10-09 13:21:36 +02:00
17586f0ff8 Update step20-createMonitorDB.sql
Add result_orcid table to monitor dbs
claudio.atzori merged commit 4e6fccf4f6 into beta 2023-10-10 09:35:33 +02:00

After discussing with @dimitris.pierrakos, the module implementing the stats-monitor-update workflow will be proposed in a future PR, so for the time being it was removed.

Reference: D-Net/dnet-hadoop#332