
Packaging ImageMatching as a Python wheel

Merged: Gmodena requested to merge refactor-packaging into main, Sep 29, 2021

This MR refactors ImageMatching to bring it in line with Python's packaging tooling and practices (setuptools).

The problem I’m trying to solve is drawing a clear boundary between the upstream (research) code and how we use it downstream in product features.

How to test

We can install algorunner in an environment (e.g. on the stats machines) and launch the pyspark/papermill job with:

(venv) $ pip install algorunner --extra-index-url https://gitlab.wikimedia.org/api/v4/projects/40/packages/pypi/simple
(venv) $ algorunner.py 2021-07-26 hywiki Output
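
For reviewers who want a feel for what such a runner does, here is a minimal sketch of a papermill launcher that accepts the same three positional arguments. It is not the actual algorunner.py: the meaning of each argument (snapshot, wiki, output directory), the notebook name `algorithm.ipynb`, the parameter names and the output file naming are all assumptions made for illustration. Only `papermill.execute_notebook(input_path, output_path, parameters=...)` is a real API call.

```python
import argparse
from pathlib import Path


def parse_args(argv):
    """Parse three positional arguments, mirroring the invocation above."""
    parser = argparse.ArgumentParser(prog="algorunner.py")
    # The meaning of each positional argument is an assumption.
    parser.add_argument("snapshot", help="snapshot date, e.g. 2021-07-26")
    parser.add_argument("wiki", help="wiki name, e.g. hywiki")
    parser.add_argument("output_dir", help="output directory, e.g. Output")
    return parser.parse_args(argv)


def build_papermill_call(args, notebook_dir):
    """Return (input_path, output_path, parameters) for papermill."""
    # Hypothetical notebook name; the real package may use a different one.
    notebook = Path(notebook_dir) / "algorithm.ipynb"
    # Hypothetical naming scheme for the executed notebook.
    executed = Path(args.output_dir) / f"{args.wiki}_{args.snapshot}.ipynb"
    # Hypothetical parameter names injected into the notebook.
    parameters = {
        "snapshot": args.snapshot,
        "wiki": args.wiki,
        "output_dir": args.output_dir,
    }
    return str(notebook), str(executed), parameters


def run(argv, notebook_dir):
    """Execute the notebook with papermill (requires papermill installed)."""
    import papermill  # imported lazily so the helpers above work without it

    args = parse_args(argv)
    input_path, output_path, parameters = build_papermill_call(args, notebook_dir)
    Path(args.output_dir).mkdir(parents=True, exist_ok=True)
    return papermill.execute_notebook(input_path, output_path, parameters=parameters)
```

In this sketch `notebook_dir` would point at wherever the wheel installed the notebooks, for example the site-packages/ima directory of the active environment.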

Changes

  • The ImageMatching repo now contains only notebooks, an nbconverted script & papermill runners.
  • I moved all ETL and test infrastructure to the platform-airflow-dags repo.
  • I created an ima package. Right now it contains notebooks. In the future it could host a library (that the notebooks can import).
  • setuptools (setup.py) is configured to package notebooks and scripts in a wheel. Installing the wheel places them under the environment's site-packages and bin directories (e.g. ./venv/lib/python3.7/site-packages/ima and ./venv/bin/), so the ima package is importable and the scripts are on PATH.
  • CI builds the wheel and deploys it to PyPI (https://gitlab.wikimedia.org/gmodena/ImageMatching/-/pipelines/670).
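
The packaging setup described in the list above can be sketched roughly as follows. This is not the setup.py from this MR, only an illustration of the setuptools keywords that give the described behaviour: `package_data` ships non-Python files inside the package, and `scripts` copies files into the environment's bin directory. The version, the `notebooks/*.ipynb` layout, the script path, the `papermill` dependency and the Python version bound are assumptions.

```python
def build_setup_kwargs():
    """Keyword arguments for setuptools.setup(); values are illustrative."""
    return {
        # Distribution name as used in `pip install algorunner`.
        "name": "algorunner",
        "version": "0.0.0",  # placeholder version (assumption)
        # The ima package described above.
        "packages": ["ima"],
        # Ship notebooks inside the package so they are installed under
        # site-packages/ima. The notebooks/ subdirectory is hypothetical.
        "package_data": {"ima": ["notebooks/*.ipynb"]},
        # Copy the runner into the environment's bin/ directory, which is
        # on PATH when a virtualenv is active. The path is hypothetical.
        "scripts": ["algorunner.py"],
        # Assumed from the python3.7 site-packages path mentioned above.
        "python_requires": ">=3.7",
        # Assumed runtime dependency for the papermill runners.
        "install_requires": ["papermill"],
    }


def main():
    """Entry point a setup.py would call at module level."""
    from setuptools import setup

    setup(**build_setup_kwargs())
```

A setup.py built this way would call `main()` at module level. The wheel can then be built with `python -m build --wheel` (using the `build` package) or, on older toolchains, `python setup.py bdist_wheel`.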
Edited Oct 14, 2021 by Gmodena
Source branch: refactor-packaging