Explore GitLab
Discover projects, groups, and snippets. Share your projects with others.
-
Documentation for non-technical users of WikiFunctions
-
GitLab settings and tooling for applying them via the API.
-
Utilities and libraries for working with data pipelines at WMF, e.g. distributing conda envs and building and syncing artifacts.
-
Creates an artifact, a packed conda environment, to be deployed across the data engineering cluster. Currently contains PySpark 3.
-
Collection of data engineering DAGs to be executed by the WMF Airflow instances.
-
A simple tool that merges various externally hosted Semgrep rules and policies for packaging and consumption by the Semgrep CLI. It is largely needed as an alternative to the semgrep.dev/r rules/policies registry.
-
Stats about Wikimedia train deploys
-
A repository of GitLab CI security templates, designed as an à la carte selection of security tools that users can add to their repositories as needed.
-
Data jobs owned by the Global Data and Insights team.
These should use conda-dist from workflow_utils to generate conda env artifacts, which are then deployed to the Analytics Data Lake for scheduling and running by Airflow.
-
Quickly spins up a local Dockerized DDD / Datasette instance from scratch.
-
Ansible playbook that builds the Elasticsearch plugins package for WMF
-
A simple script to deploy and run Spark 3 on WMF stat machines
-
Test repository for moving Quarry
-
A Python library for working with the HTML dumps
-
A fork of MediaWiki's BoilerPlate template showing an example security CI template setup.