bot-detection pipeline fix

When fetching release artifact through api v4 endpoint an extra directory is added wihch names includes the commit sha1. It currently forced us to include it in the path. e.g:

{
  "refinery_private_archive": "hdfs:///wmf/cache/artifacts/airflow/analytics/refinery-private-v0.0.1.zip#refinery-private",
  "compute_actor_label_hql": "refinery-private/refinery-private-v0.0.1-f8c4e0aeea8d6e8fb6b5ba769a41482abc4407e8/hql/webrequest/actor/compute_webrequest_actor_label_hourly.hql",
}

Now I've added a repackaging step within refinery-private: https://gitlab.wikimedia.org/repos/data-engineering/refinery-private/-/blob/main/scripts/repack-archive.sh?ref_type=heads And the new archive does not contain a variable-named directory.

Also it reverts previous revert around replacing archives by skein_container_files which is makes more sense here as the hql only needs to be unpacked from zip once, in driver.

➡️ Delete the 3 matching Variables when deploying.

Bug: T415874

Edited by Aqu

Merge request reports

Loading