Change the DAGs to run on /tmp/wmde in production instead of /wmf/tmp
Contributor checklist
- I have written tests for this DAG that will be merged into data-engineering/airflow-dags/tests/wmde
- I have run the above tests and code quality checks locally, as outlined in the tests section of the Airflow DAGs project readme
- I have tested the jobs for this DAG in my local database using the process defined in wmde/analytics/hql/airflow-jobs/wd_item_sitelink_segments/_test_weekly
- I have tested the included DAGs in my local database using the process outlined in TEST_AIRFLOW_DAGS.md and the test variable files provided for each DAG
- All Hive tables and HDFS directories that are needed by the included DAG jobs have been created and are accessible by the analytics-wmde Airflow user
  - Hive: wmde.wd_item_sitelink_segments_weekly
  - HDFS:
    - /wmf/data/published/datasets/wmde/analytics/wd_item_sitelink_segments_weekly
    - /wmf/data/published/datasets/wmde/analytics/wd_rest_api_metrics_monthly
Description
See #700 for the original work on these DAGs. Following a discussion on Slack, it was decided that temporary-directory work for analytics-wmde should be done in /tmp/wmde rather than /wmf/tmp/wmde, for both production and testing.
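
For reviewers, the intent of the path change can be sketched roughly as follows. This is an illustration only, not the code in this change: the constant `WMDE_TMP_BASE` and the helper `wmde_tmp_path` are hypothetical names, and the actual DAGs may configure their temporary directories differently (for example through variable files).

```python
import posixpath

# Assumption: base directory for analytics-wmde temporary work,
# taken from the description above (previously /wmf/tmp/wmde).
WMDE_TMP_BASE = "/tmp/wmde"


def wmde_tmp_path(*parts: str) -> str:
    """Hypothetical helper: build a path under the WMDE tmp base directory."""
    path = posixpath.normpath(posixpath.join(WMDE_TMP_BASE, *parts))
    # Reject anything that would escape the base directory,
    # e.g. via ".." segments or an absolute path component.
    if path != WMDE_TMP_BASE and not path.startswith(WMDE_TMP_BASE + "/"):
        raise ValueError(f"{path!r} is outside {WMDE_TMP_BASE!r}")
    return path
```

Keeping the base directory in one place like this would mean a future move touches a single constant instead of each DAG.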
Test outputs
See #700 for test outputs.