Adding support for "produced_by" configuration of datasets
Implementation of the proposal outlined in https://phabricator.wikimedia.org/T372647
Airflow producer dataset annotation
NOTE: Automatic configuration of execution_delta based on the target DAG's schedule is not yet implemented.
Example:
produced_by:
  airflow:
    instance: search
    dag_id: dummy_dag
    task_group_id: dummy_grouped_tasks
- If produced_by configuration is present for any Dataset implementation, get_sensor_for returns a configured external task sensor.
- Depending on whether the produced_by configuration refers to the Airflow instance the DAG code is running on, the producer get_sensor_for returns either the basic ExternalTaskSensor or RestExternalTaskSensor.
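
The sensor selection described above can be sketched roughly as follows. This is a minimal illustration and not the code from this change: `AirflowProducer`, `parse_produced_by`, and `sensor_class_name` are hypothetical names, the config is a plain dict rather than a parsed dataset file, and the sketch assumes that a producer on the same instance maps to ExternalTaskSensor while a producer on another instance maps to RestExternalTaskSensor. It returns class names as strings so it runs without Airflow installed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AirflowProducer:
    """Hypothetical in-memory form of the produced_by.airflow block."""
    instance: str
    dag_id: str
    task_group_id: Optional[str] = None


def parse_produced_by(dataset_config: dict) -> Optional[AirflowProducer]:
    """Return the Airflow producer annotation, or None when it is absent."""
    airflow_cfg = (dataset_config.get("produced_by") or {}).get("airflow")
    if airflow_cfg is None:
        return None
    return AirflowProducer(**airflow_cfg)


def sensor_class_name(producer: AirflowProducer, current_instance: str) -> str:
    """Pick a sensor type by comparing the producer's instance to ours.

    Assumption: same instance -> ExternalTaskSensor,
    different instance -> RestExternalTaskSensor.
    """
    if producer.instance == current_instance:
        return "ExternalTaskSensor"
    return "RestExternalTaskSensor"


# Dataset config mirroring the example annotation above.
config = {
    "produced_by": {
        "airflow": {
            "instance": "search",
            "dag_id": "dummy_dag",
            "task_group_id": "dummy_grouped_tasks",
        }
    }
}
producer = parse_produced_by(config)
```

With this sketch, a DAG running on the `search` instance would get the basic sensor for the dataset above, while a DAG on any other instance would get the REST-based one. A dataset without a `produced_by` block yields no producer, so the sensor choice does not apply to it.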