
T349512: Add notebooks and HTML files to document query sampling

Andrew McAllister (WMDE) requested to merge T349512-wdqs-sampling into main

Related task: T349512

Main edits:

  • Adds notebooks with all outputs removed
  • Adds HTML versions of the notebooks with PySpark outputs removed
  • Adds a .py utils file with the needed functions
  • Adds a readme for the subdirectory to explain the project
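
For reviewers who want to reproduce the "outputs removed" state locally, here is a minimal sketch of one way to clear outputs from a notebook. It is not necessarily how the notebooks in this merge request were prepared. It assumes the standard nbformat 4 JSON layout, and `strip_outputs` and `strip_file` are hypothetical helper names, not functions from the utils file added here.

```python
import json
from copy import deepcopy


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with all code cell outputs cleared.

    Assumes the nbformat 4 layout: a top-level "cells" list in which
    code cells carry "outputs" and "execution_count" keys.
    """
    cleaned = deepcopy(notebook)  # leave the caller's dict untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


def strip_file(src_path: str, dst_path: str) -> None:
    """Read a .ipynb file, clear its outputs, and write the result."""
    with open(src_path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(strip_outputs(notebook), f, indent=1)
        f.write("\n")
```

Running `jupyter nbconvert --clear-output --inplace` on a notebook should give a comparable result without custom code.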

Other edits:

  • Adds a TOC to the main directory readme (by all means remove if it's not helpful!)
  • Adds a .gitignore file that ignores .ipynb_checkpoints
