Wait longer for wikitext history for dumps backfill.
Currently, when the set of DAGs for Dumps 2.0 fail, they do not send a message to data-engineering-alerts
, since this code is still in progress. Instead, they send an email directly to @xcollazo (example).
I received the following messsage:
Airflow alert: <TaskInstance: dumps_merge_backfill_to_wikitext_raw.wait_for_data_in_mw_wikitext_history scheduled__2024-03-01T00:00:00+00:00 [failed]>
Try 1 out of 6 Exception: Sensor has timed out; run duration of 606100.482329 seconds exceeds the specified timeout of 604800.0.
This timed out because mediawiki_wikitext_history
, which we use for backfilling, takes a long while to be populated for the month.
In this MR, we fix the sensor to wait up to 24 days, just like in the DAG that generates mediawiki_wikitext_history
.