Aqu authored
* Bump refinery source for Refine to Hive hourly DAG
* Fix tests following change in app name
* Cleanup config store mock
* Revert change on Yarn Spark app name
* Cleaner code
* Rename RefineConfiguration class
* Remove commented lines
* Linting
* Remove type test remnants
* Remove deprecation warning
* Remove deprecation
* More pyspark-extension on HDFS
* Set a static name to Refine Spark job for Datahub
* Add comments about Spark conf job target
* Rename xcom key var to store Refine job params
* Linting
* Fixes following review
* Convert evolve table to SparkSubmitOperator
* Rename a method
* Clean up
* Remove Refine to Iceberg (keep it for later)
* Unclutter the test fixture
* Better logging in Refine to hive hourly
* Rename a variable
* Linting
* Customize yarn app names in Refine
* Remove useless mocker contextManager
* Refine eventlogging_NavigationTiming in test cluster
* Add stating warning about new Refine dag in production
* Linting
* Add meaningful task index to canary events dag
* Fix catchup in delayed hourly timetable
* Fix ordering in diff script
* Add templated names for Refine tasks
* Propagate email for variables to the Hive refine factory
* Fix plugin initialization
* Revert "Proposition to fix plugins initialization"
  This reverts commit f66bf1b3.
* Proposition to fix plugins initialization
* Add failing test about DAG serialization with DelayedHourlyTimetable
* Linting
* Add 2 hours delay for Refine to Hive DAG
* Order by more columns before diffing DFs
  As meta.dt may not be enough (truncated to ms).
* Fix call following the jar removal from the files param
* Bundle the jar into the archive
  To avoid putting the jar both in the --jar param and in the --files param.
* Update spark logger in output_diff
* Switch output_diff script to Python logger
* Perform diffing on alphabetically rearranged DFs
* Increase Refine driver memory size for small jobs
* Update evolve table parameter
* Update RefineHiveDataset module path
* Fix configuration
* Update RefineHiveDataset CLI params following refactoring in scala code
* Linting
* Add refine dag for analytics_test with a factory
* mypy fix
* linting fix
* Fix unit tests
* Force skein launch of pyspark app
* Add map_index to yarn app name
* Add logger to output_diff
* Adjustment with refinery-source
* Output mypy version in GitLab CI
* Formatting
* black check
* Isort
* Read refine conf from ESC + Add output diff
* Evolve and refine Hive & Iceberg tables
* Add auto evolve Iceberg tables in Refine Iceberg DAG.
* Add a new DAG to Refine to Hive tables
  - evolve Hive tables according to json schemas
  - Refine to Hive tables from HDFS Gobblin output

Bug: T356762
7357f779