Builds a packed conda environment as an artifact to be deployed across the data engineering cluster. Currently contains PySpark 3 and JupyterHub.
Experiments with PrometheusSink and Pushgateway
Experimental configurable generic Flink pipelines
Differentially private analytics with Apache Spark (demo)
PoC for consuming revision and Android interaction streams
This is a work-in-progress fork of https://github.com/banzaicloud/spark-metrics/, modified to meet WMF conventions.