This MR adds a PySpark Python package (differential_privacy).
The MR also adds a GitLab CI pipeline for the repo (.gitlab-ci.yml). CI makes it possible to:
- Automatically run unit tests on push
- Automatically run linting (flake8) on push
- Manually run a build job that produces a conda-dist archive of dependencies, compatible with WMF airflow deployments.
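
For orientation, a pipeline with this shape could look roughly like the sketch below. It is not the .gitlab-ci.yml from this MR: the job names, base image, and commands are assumptions chosen for illustration, and the build step is left as a placeholder rather than guessing the actual conda-dist invocation.

```yaml
# Illustrative sketch only; job names, image, and commands are assumptions.
stages:
  - test
  - build

default:
  image: python:3.10  # assumed base image

unit-tests:
  stage: test
  script:
    - pip install pytest
    - pytest

lint:
  stage: test
  script:
    - pip install flake8
    - flake8 .

build-conda-dist:
  stage: build
  when: manual  # runs only when triggered by hand from the pipeline UI
  script:
    - echo "placeholder for the conda-dist archive build"
```

Jobs without `when: manual` run on every push, which matches the test and lint behaviour described above.
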
This MR has been tested by running the existing tmlt pipelines notebook with the conda environment published at https://gitlab.wikimedia.org/repos/security/differential-privacy/-/packages/158
The following will be tackled in follow-up MRs:
- Try to build python-flint from source.
- Try to reduce the conda-dist size by removing pyspark deps (assuming they are available on stat/airflow nodes).
- Simplify package management by using pyproject and/or poetry. This might break compatibility with our internal tooling and needs testing.