CX Translators daily DAG
Bug: T382706
- Fetches the `cx_translators` data (a MariaDB extension table) and loads it to `wmf_product.cx_translators` in the Data Lake.
- The data is intended to be reused by other pipelines to calculate aggregate metrics.
- The DAG uses a Python script (from a job repo) to fetch from the MariaDB extension replicas and load into the Data Lake.
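
The fetch-and-load step described above can be illustrated with a minimal, self-contained sketch. This is not the actual job script: it uses in-memory SQLite connections as stand-ins for both the MariaDB replica and the Data Lake table, the function names are hypothetical, the `translator_translation_id` column is assumed, and the full-overwrite behaviour is an assumption about how a daily snapshot might be loaded.

```python
import sqlite3

# Hypothetical schema: translator_user_id is the column used in the
# verification queries; translator_translation_id is an assumed second column.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cx_translators (
    translator_user_id INTEGER,
    translator_translation_id INTEGER
)
"""


def copy_cx_translators(source, dest, batch_size=2):
    """Copy all rows from source to dest in batches; return rows copied.

    Assumes full-snapshot (overwrite) semantics, so re-running is idempotent.
    """
    dest.execute(SCHEMA)
    dest.execute("DELETE FROM cx_translators")  # clear the previous snapshot
    cursor = source.execute(
        "SELECT translator_user_id, translator_translation_id "
        "FROM cx_translators"
    )
    copied = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        dest.executemany("INSERT INTO cx_translators VALUES (?, ?)", rows)
        copied += len(rows)
    dest.commit()
    return copied


def distinct_translators(conn):
    """Same check as the verification query: count distinct translators."""
    query = "SELECT COUNT(DISTINCT translator_user_id) FROM cx_translators"
    return conn.execute(query).fetchone()[0]


# Stand-in for the MariaDB replica, populated with made-up rows.
source = sqlite3.connect(":memory:")
source.execute(SCHEMA)
source.executemany(
    "INSERT INTO cx_translators VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 12), (3, 13), (3, 14)],
)

# Stand-in for the Data Lake destination table.
dest = sqlite3.connect(":memory:")
copied = copy_cx_translators(source, dest)
print(copied, distinct_translators(dest))
```

Comparing the distinct count on the source and the destination after a load is a cheap sanity check that mirrors the verification queries in this description.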
Testing results
Verified the results using:
sudo -u analytics-privatedata spark3-sql -e "select * from kcvelaga.af_test_cx_translators limit 5"
sudo -u analytics-privatedata spark3-sql -e "select count(DISTINCT translator_user_id) from kcvelaga.af_test_cx_translators"
Spark master: local[*], Application Id: local-1746605135187
count(DISTINCT translator_user_id)
202283
Time taken: 4.498 seconds, Fetched 1 row(s)
