Add DAGs for video metric aggregations + update test var JSONs
Contributor checklist
- I have written tests for this DAG that will be merged into data-engineering/airflow-dags/tests/wmde
- I have locally run the above tests and code quality checks as outlined in the tests section of the Airflow DAGs project readme
- I have tested the jobs for this DAG in my local database using the processes defined in the following directories:
- I have tested the included DAGs using the process outlined in TEST_AIRFLOW_DAGS.md and the test variable files provided for each DAG
- All Hive tables that are needed by the included DAG jobs have been created and are accessible by the analytics-wmde Airflow user
- All changes from the main branch have been rebased into this branch
Description
- Tentatively T198628 (T386916 was merged in): The Wiki Loves Broadcast project has been looking for metrics on video plays for some time. These DAGs and the corresponding Hive queries attempt to solve this as well as possible within the context of the current data stack.
A summary is:
- We'd like monthly metrics of video views
- Until now we have only provided metrics on video thumbnails being loaded or entering the viewport
- A video play routinely fires multiple times, for different quality levels, each time the user clicks the thumbnail
- The best possible solution with the data currently available is to count unique video plays per day, where uniqueness is defined by the user agent and IP combination
- These daily aggregates for videos within a category links category are then summed per month to give "monthly unique viewers on a daily basis"
- The resulting number more closely reflects a monthly view count, because a user agent x IP combination is counted more than once only if its views fall on different days
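
The aggregation logic summarized above can be sketched in plain Python. This is only an illustration, not the Hive queries in this MR: the function names, the tuple layout, and the sample rows are all hypothetical, and the category filter is left out for brevity.

```python
from collections import defaultdict


def daily_unique_viewers(plays):
    """Count unique (user_agent, ip) combinations per day and video.

    `plays` is an iterable of (day, video_filename, user_agent, ip) tuples,
    with `day` formatted as 'YYYY-MM-DD'. This layout is an assumption made
    for the sketch, not the schema of the source table.
    """
    # Repeated fires of the same play (e.g. one per quality level)
    # collapse to a single entry in the set.
    seen = set(plays)
    counts = defaultdict(int)
    for day, video, _user_agent, _ip in seen:
        counts[(day, video)] += 1
    return dict(counts)


def monthly_sum_of_daily_uniques(daily_counts):
    """Sum the daily unique viewer counts into (month, video) buckets."""
    monthly = defaultdict(int)
    for (day, video), viewers in daily_counts.items():
        monthly[(day[:7], video)] += viewers  # 'YYYY-MM-DD' -> 'YYYY-MM'
    return dict(monthly)


# Hypothetical sample data: one viewer triggers two fires on the same day
# and comes back the next day; a second viewer watches once.
plays = [
    ("2025-04-01", "Example.webm", "UA-1", "10.0.0.1"),
    ("2025-04-01", "Example.webm", "UA-1", "10.0.0.1"),
    ("2025-04-01", "Example.webm", "UA-2", "10.0.0.2"),
    ("2025-04-02", "Example.webm", "UA-1", "10.0.0.1"),
]

daily = daily_unique_viewers(plays)
monthly = monthly_sum_of_daily_uniques(daily)
```

In this sample the duplicate fire on 2025-04-01 is dropped, while the returning viewer on 2025-04-02 is counted again, so the monthly value for the video is 3 rather than 2 (unique viewers) or 4 (raw fires).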
DAGs and destination tables:
- DAG ID: wlb_commons_video_metrics_daily
  - Destination: wmde.wlb_commons_video_metrics_daily
- DAG ID: wlb_commons_video_metrics_monthly
  - Destination: wmde.wlb_commons_video_metrics_monthly
Test outputs
Destination table summary
andrewtavis_wmde.wlb_commons_video_metrics_monthly
| month | video_filename | video_category | sum_daily_unique_viewers |
|---|---|---|---|
| 2025-04 | string | string | bigint |

