Production data ETL
Created by: gmodena
Transform raw data to production (PoC) schema.
This PR adds a PySpark ETL job that generates PoC data from the notebook's raw output.