
Don't retry convert_history_xml_to_parquet.

Xcollazo requested to merge dont-retry-mediawiki-wikitext-history into main

We have found multiple instances where Airflow decided that the MediawikiXMLDumpsConverter job had failed and retried it. Unfortunately, the underlying Spark job keeps running alongside the retry, and we end up with duplicated data.

We'd rather have a failed job than bad data, so in this commit we set retries=0.
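
For illustration only, here is a minimal sketch of how a per-task `retries=0` takes precedence over DAG-level defaults. It assumes the DAG sets a nonzero retry count in its `default_args`; the default values and the helper `effective_task_args` are hypothetical and not taken from the actual DAG. The helper only mimics Airflow's rule that arguments set on a task override `default_args`.

```python
from datetime import timedelta

# Hypothetical DAG-level defaults; the real DAG's values are not shown here.
dag_default_args = {"retries": 3, "retry_delay": timedelta(minutes=5)}


def effective_task_args(default_args, **task_overrides):
    """Mimic Airflow's precedence: arguments set on a task win over default_args."""
    return {**default_args, **task_overrides}


# Disable retries for the conversion task only; other tasks keep the defaults.
convert_args = effective_task_args(dag_default_args, retries=0)
print(convert_args["retries"])
```

In a real DAG the same effect would come from passing `retries=0` directly to the operator that runs convert_history_xml_to_parquet, so a failure is reported once instead of relaunching the job.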

Bug: T342911
