Explore GitLab
-
Data jobs owned by the Global Data and Insights team.
These jobs should use conda-dist from workflow_utils to generate conda environment artifacts, which are then deployed to the Analytics Data Lake, where Airflow schedules and runs them.
-
Documentation for non-technical users of WikiFunctions
-
Ansible playbook that builds the Elasticsearch plugins package for WMF
-
Documentation for non-technical users of WikiFunctions
-
Throwaway project for testing the use of mediawiki-docker-make as a Git submodule
-
Collection of data engineering DAGs to be executed by the WMF Airflow instances.
-
This is a simple tool that merges various externally hosted semgrep rules and policies for packaging and consumption by the semgrep CLI. It is mainly needed as an alternative to the semgrep.dev//r rules/policies repository.
-
Web2Cit integration into Wikipedia's visual editor.
Adds the citation returned by Web2Cit along with the citation returned by Citoid in Wikipedia's automatic citation generator.
-
Jupyter Notebook for https://phabricator.wikimedia.org/T301128
-
Test of moving Quarry
-
Based on the "Language-agnostic quality model for articles", we compute the predicted quality for every revision in an article's history, measuring changes in quality over time
-
A Python library for working with the HTML dumps
-
Creates an artifact (a packed conda environment) to be deployed across the data engineering cluster. Currently contains PySpark 3.
-
Wikimedia mobile apps data and analytics
-
Scripts and tools for testing the SimilarEditors extension