Restructure output datasets

Fabian Kaelin requested to merge outputs into main

Generalize the output formats. Metrics are now generated at four aggregation levels:

  • metrics_by_category: metrics for content gap categories (e.g. female category of the gender gap)
  • metrics_by_content_gap: metrics for content gaps (i.e. aggregated across all categories of a gap)
  • metrics_by_category_all_wikis: metrics across all wikis per category
  • metrics_by_content_gap_all_wikis: metrics across all wikis per content gap
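
As a rough illustration of how the four levels relate, the sketch below groups the same toy rows by progressively coarser keys. The row layout, the `article_count` metric, the wiki names, and the `aggregate` helper are all assumptions made for this example; they are not taken from the pipeline in this merge request, which may compute and store its metrics quite differently.

```python
from collections import defaultdict

# Hypothetical input rows: (wiki, content_gap, category, article_count).
rows = [
    ("enwiki", "gender", "female", 10),
    ("enwiki", "gender", "male", 30),
    ("dewiki", "gender", "female", 5),
    ("dewiki", "gender", "male", 15),
]


def aggregate(rows, key_fn):
    """Sum article_count over all rows that share the key returned by key_fn."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row[3]
    return dict(totals)


# Per wiki, per content gap, per category.
metrics_by_category = aggregate(rows, lambda r: (r[0], r[1], r[2]))
# Per wiki, per content gap (categories collapsed).
metrics_by_content_gap = aggregate(rows, lambda r: (r[0], r[1]))
# Per content gap and category (wikis collapsed).
metrics_by_category_all_wikis = aggregate(rows, lambda r: (r[1], r[2]))
# Per content gap only (wikis and categories collapsed).
metrics_by_content_gap_all_wikis = aggregate(rows, lambda r: (r[1],))
```

Each dataset is just a different grouping key over the same underlying rows, which is why the two all_wikis variants can be read as the per-wiki outputs with the wiki dimension dropped.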

Additional changes

  • improved configuration, including a new sub_content_gaps arg that selects which gaps to compute
  • updated validation notebook to use new output files
  • bump version to 0.3.0
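
For readers unfamiliar with this kind of option, here is a minimal sketch of how a sub_content_gaps argument might select a subset of gaps. Only the argument name comes from this merge request; the `--sub_content_gaps` flag spelling, the `ALL_CONTENT_GAPS` list, and both helper functions are hypothetical, and the actual configuration code may work differently.

```python
import argparse

# Hypothetical list of known gaps; the real pipeline defines its own.
ALL_CONTENT_GAPS = ["gender", "geography", "time"]


def parse_args(argv):
    """Parse a hypothetical command line with an optional list of gaps."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--sub_content_gaps", nargs="*", default=None)
    return parser.parse_args(argv)


def select_content_gaps(sub_content_gaps):
    """Return the gaps to compute; fall back to all gaps when none are given."""
    if not sub_content_gaps:
        return list(ALL_CONTENT_GAPS)
    unknown = [gap for gap in sub_content_gaps if gap not in ALL_CONTENT_GAPS]
    if unknown:
        raise ValueError(f"unknown content gaps: {unknown}")
    return list(sub_content_gaps)


args = parse_args(["--sub_content_gaps", "gender", "time"])
selected = select_content_gaps(args.sub_content_gaps)
default_selection = select_content_gaps(parse_args([]).sub_content_gaps)
```

Rejecting unknown names early, rather than silently skipping them, is one possible design choice here: a typo in a gap name then fails fast instead of producing an incomplete set of outputs.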
