Use a `URLSensor` to wait for the whole wikitext snapshot (!425) · Merge requests · repos / data-engineering / Airflow DAGs

Marco Fossati requested to merge section-alignment-wikitext-sensor-fix into main Jun 08, 2023

By default, Spark tasks process all Wikipedias. A `NamedHivePartitionSensor` therefore won't work when we have to wait for the whole monthly wikitext snapshot, although it is fine when specific Wikipedias are passed.

Separate the wikitext sensor building logic accordingly
Set the poke interval of all sensors to 1 hour, as we do in the section topics and image suggestions DAGs
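
The split described above could be sketched roughly as follows. This is only an illustration and not the code in this merge request: it describes the chosen sensor as plain data instead of instantiating real Airflow sensors, and `SensorSpec`, `build_wikitext_sensor_spec`, `WIKITEXT_TABLE`, `SNAPSHOT_BASE_URL`, the partition layout, and the `_SUCCESS` marker are all assumed names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# 1 hour, the poke interval mentioned in the description.
POKE_INTERVAL_SECONDS = 60 * 60

# Hypothetical values, for illustration only.
WIKITEXT_TABLE = "wmf.mediawiki_wikitext_current"
SNAPSHOT_BASE_URL = "hdfs:///path/to/wikitext"


@dataclass
class SensorSpec:
    """Plain description of the sensor a DAG would create."""
    kind: str
    targets: List[str]
    poke_interval: int = POKE_INTERVAL_SECONDS


def build_wikitext_sensor_spec(
    snapshot: str, wikis: Optional[List[str]] = None
) -> SensorSpec:
    """Pick the sensor kind depending on whether specific wikis are passed."""
    if wikis:
        # Specific Wikipedias: one named partition per wiki can be listed.
        partitions = [
            f"{WIKITEXT_TABLE}/snapshot={snapshot}/wiki_db={wiki}"
            for wiki in wikis
        ]
        return SensorSpec(kind="NamedHivePartitionSensor", targets=partitions)
    # All Wikipedias: wait on a single URL for the whole snapshot
    # (the marker file name is an assumption).
    url = f"{SNAPSHOT_BASE_URL}/snapshot={snapshot}/_SUCCESS"
    return SensorSpec(kind="URLSensor", targets=[url])
```

For instance, `build_wikitext_sensor_spec("2023-05")` would describe a single `URLSensor`, while `build_wikitext_sensor_spec("2023-05", ["enwiki", "itwiki"])` would describe a `NamedHivePartitionSensor` with two partitions. Both keep the 1 hour poke interval.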


