
Draft: Import canonical_data.countries through Airflow

Created by: aquwikimedia

We would like to consolidate the static data from analytics-refinery (https://github.com/wikimedia/analytics-refinery/blob/master/static_data/mediawiki/geoeditors/blacklist/country_codes.tsv) with the data in country/countries.tsv.

This patch contains the HQL script that creates the table and the Python script that Airflow triggers.

The Airflow job is in this patch: repos/data-engineering/airflow-dags!428 (merged)

Because analytics-refinery will use the countries data from this repo, this change alters which countries' data may be released:

  • 6 countries whose data could not be released will now be released: Burundi, Equatorial Guinea, Libya, Singapore, Somalia, Tajikistan
  • 5 countries whose data could be released will no longer be released: Bangladesh, Honduras, Kuwait, Nicaragua, Oman
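
The two lists above amount to a set difference between the old and the new list of countries whose data must not be released. A minimal sketch of that comparison follows, assuming each source can be reduced to a plain collection of country names. The function name `diff_protected` is hypothetical, and the sample lists contain only the entries that differ (countries present in both sources are omitted), so they are not the real file contents.

```python
def diff_protected(old_protected, new_protected):
    """Compare two collections of countries whose data must not be released.

    Returns the countries that become releasable (only in the old list)
    and the countries that stop being releasable (only in the new list).
    """
    old, new = set(old_protected), set(new_protected)
    return {
        "newly_released": sorted(old - new),
        "newly_disallowed": sorted(new - old),
    }


# Illustrative input: only the entries that differ between the two sources.
old_list = ["Burundi", "Equatorial Guinea", "Libya", "Singapore", "Somalia", "Tajikistan"]
new_list = ["Bangladesh", "Honduras", "Kuwait", "Nicaragua", "Oman"]

changes = diff_protected(old_list, new_list)
for status, countries in changes.items():
    print(f"{status}: {', '.join(countries)}")
```

Running the same comparison on the full contents of both files would be one way to confirm the counts given above before merging.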

The corresponding consolidation change in analytics-refinery: https://gerrit.wikimedia.org/r/c/analytics/refinery/+/929723

Bug: T338033

Edited by Neil Shah-Quinn (WMF)
