T369117 DAGs for distinct and by-wiki Wikidata coeditors
Contributor checklist
- I have written tests for this DAG that will be merged into data-engineering/airflow-dags/tests/wmde
- I have run the above tests and code quality checks locally, as outlined in the tests section of the Airflow DAGs project readme
- I have tested the jobs for this DAG in my local database using the process defined in wmde/analytics/hql/airflow_jobs/wd_coeditors/_test_monthly
- I have tested the included DAGs in my local database using the process outlined in TEST_AIRFLOW_DAGS.md and the test variable files provided for each DAG
- All Hive tables that are needed by the included DAG jobs have been created and are accessible by the analytics-wmde Airflow user
  - wmde.tmp_mw_history_editor_activity_flags_by_wiki
  - wmde.wd_coeditors_distinct_monthly
  - wmde.wd_coeditors_by_wiki_monthly
Description
- T369117: Adds a metrics DAG pipeline that counts distinct Wikidata coeditors both across all wikis and per wiki. The jobs for the respective DAGs can be found in wmde/analytics/hql/airflow_jobs/wd_coeditors.
- The counts are broken down into casual, active and very active users, and each level is further split by whether the users edit content or non-content pages.
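
For reviewers who want a quick mental model of the breakdown, here is a minimal Python sketch of the bucketing. It is not the HQL from this MR. The thresholds (`ACTIVE_MIN_EDITS`, `VERY_ACTIVE_MIN_EDITS`), the input row shape, the helper names, and the choice of mutually exclusive buckets are all assumptions made for illustration; only the output column names are taken from the tables in this description.

```python
from collections import defaultdict

# Hypothetical cutoffs: this MR description does not state the real ones.
ACTIVE_MIN_EDITS = 5
VERY_ACTIVE_MIN_EDITS = 100


def activity_level(edit_count: int) -> str:
    """Map a monthly edit count to an illustrative activity level."""
    if edit_count >= VERY_ACTIVE_MIN_EDITS:
        return "very_active"
    if edit_count >= ACTIVE_MIN_EDITS:
        return "active"
    return "casual"


def count_coeditors(rows):
    """Count distinct users per level: overall, content and non-content.

    rows: iterable of (user, content_edits, non_content_edits) for one month.
    Assumes each user falls into exactly one level per column family.
    """
    users = defaultdict(set)
    for user, content_edits, non_content_edits in rows:
        total = content_edits + non_content_edits
        if total > 0:
            users[f"total_{activity_level(total)}_wd_coeditors"].add(user)
        if content_edits > 0:
            level = activity_level(content_edits)
            users[f"total_{level}_content_wd_coeditors"].add(user)
        if non_content_edits > 0:
            level = activity_level(non_content_edits)
            users[f"total_{level}_non_content_wd_coeditors"].add(user)
    return {column: len(members) for column, members in sorted(users.items())}


# Made-up example rows, not real data.
example_rows = [("a", 3, 0), ("b", 4, 2), ("c", 120, 1)]
example_counts = count_coeditors(example_rows)
```

Running the same function once over all rows and once per wiki would mirror the distinct and by-wiki split, again under the assumptions above.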
Test outputs
wd_coeditors_distinct_monthly
month | total_casual_wd_coeditors | total_casual_content_wd_coeditors | total_casual_non_content_wd_coeditors | total_active_wd_coeditors | total_active_content_wd_coeditors | total_active_non_content_wd_coeditors | total_very_active_wd_coeditors | total_very_active_content_wd_coeditors | total_very_active_non_content_wd_coeditors |
---|---|---|---|---|---|---|---|---|---|
2024-10-01 | 16275 | 15883 | 9046 | 5542 | 5458 | 301 | 2047 | 1905 | <25 |
wd_coeditors_by_wiki_monthly
month | wiki | total_casual_wd_coeditors | total_casual_content_wd_coeditors | total_casual_non_content_wd_coeditors | total_active_wd_coeditors | total_active_content_wd_coeditors | total_active_non_content_wd_coeditors | total_very_active_wd_coeditors | total_very_active_content_wd_coeditors | total_very_active_non_content_wd_coeditors |
---|---|---|---|---|---|---|---|---|---|---|
2024-10-01 | commonswiki | 4510 | 4477 | 2637 | 2049 | 1822 | 123 | 786 | 679 | <25 |
2024-10-01 | enwiki | 5726 | 5619 | 2625 | 1440 | 1373 | 77 | 275 | 236 | <25 |
2024-10-01 | frwiki | 1750 | 1719 | 813 | 443 | 427 | 29 | 146 | 134 | <25 |
2024-10-01 | eswiki | 1497 | 1460 | 637 | 399 | 401 | <25 | 106 | 97 | <25 |
2024-10-01 | dewiki | 1438 | 1413 | 580 | 327 | 340 | 26 | 117 | 106 | <25 |
Edited by Andrew McAllister (WMDE)