Add a column for edit rollbacks

Jenniferwang requested to merge jiawang/data-pipelines:tsp_T371404_2 into main

The updates include:

  1. Adding a column for the edit rollbacks metric to the trust_safety_admin_action_monthly table
  2. Making a few minor formatting adjustments

Test Approach

spark3-sql -f create_admin_action_monthly_table.hql \
    -d table_name=jiawang_airflow_test.trust_safety_admin_action_monthly \
    -d base_directory=/user/hive/warehouse/jiawang_airflow_test

spark3-sql --master yarn --executor-memory 16G --executor-cores 8 --driver-memory 4G \
    --conf spark.dynamicAllocation.maxExecutors=64 \
    -f generate_admin_action_monthly.hql \
    -d source_history_table=wmf.mediawiki_history \
    -d source_logging_table=wmf_raw.mediawiki_logging \
    -d destination_table=jiawang_airflow_test.trust_safety_admin_action_monthly \
    -d canonical_table=canonical_data.wikis \
    -d snapshot=2024-09 \
    -d coalesce_partitions=1
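
A possible follow-up check, not part of the run above: after both commands finish, the test table's schema could be inspected to confirm the new column is present. This is only a sketch. It assumes the `spark3-sql` wrapper passes the standard `-e` (inline query) option through to the Spark SQL CLI, and it does not assume any particular name for the new column. If `spark3-sql` is not on the PATH, it only prints the query it would have run.

```shell
# Hypothetical verification step (an assumption, not from the original test run).
TABLE=jiawang_airflow_test.trust_safety_admin_action_monthly
QUERY="DESCRIBE ${TABLE}"

if command -v spark3-sql >/dev/null 2>&1; then
    # Lists the table's columns; look for the edit rollbacks column in the output.
    spark3-sql -e "${QUERY}"
else
    # Dry-run fallback so the snippet is safe to execute off-cluster.
    echo "spark3-sql not found; would run: ${QUERY}"
fi
```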