Stats about Wikimedia train deploys
A repo for creating model cards and datasheets for algorithms and datasets currently in production at WMF.
Machine learning pipelines for research
A tutorial and talk about Gerrit for product analytics (yes, I see the irony of it being on GitLab)
Differentially private analytics with Apache Spark (demo)
A simple script to deploy and run Spark 3 on WMF stat machines
Collection of hacks to run DIY Spark 3 binaries on Analytics infra.
Image recommendation for unillustrated Wikipedia articles
Experiments with Airflow DAGs and jobs running on WMF analytics infrastructure.
A JVMTI agent that attaches to your JVM and kills it when things go sideways
API for detecting and surfacing copyedits for Wikipedia articles
Research on copyedits as a Structured Task for newcomers
Code for research on link recommendation for orphan articles