This is a work-in-progress fork of https://github.com/banzaicloud/spark-metrics/, modified to meet WMF conventions.
PoC for consuming revision and Android interaction streams
Differentially private analytics with Apache Spark (demo)
Experimental configurable generic Flink pipelines
Experiments with PrometheusSink and Pushgateway
Builds a packed conda environment as an artifact to be deployed across the data engineering cluster. Currently contains PySpark 3 and JupyterHub.