Import fetch_cirrussearch_namespace_map

Ebernhardson requested to merge work/ebernhardson/namespace-map into main

Brings the fetch_cirrussearch_namespace_map script over from the wikimedia/discovery/analytics repo in Gerrit. It was chosen because it is a simple script that uses Spark and some additional dependencies (requests). As part of this, the wmf_spark.py script was also brought over and renamed to discolytics.hive, since it consists almost entirely of methods for reading from or writing to Hive. The limit_top_n function was removed because it is unrelated to Hive; it can be brought in as appropriate when needed.
