Entropy-based URI normalization
- adds transformations that apply entropy-based URI normalization, based on https://gitlab.wikimedia.org/xiaoxiao/web_scraping/-/tree/experimental?ref_type=heads
- adds notebooks to explore the patterns used for URI normalization and to tune its parameters (the patterns.ipynb notebook will be removed before merging)
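
For reviewers unfamiliar with the idea, here is a minimal sketch of one way entropy-based normalization can work. It is an illustration under stated assumptions, not the code in this change or in the linked repository: the names `shannon_entropy` and `normalize_uris`, the `threshold` parameter, and the `{var}` placeholder are all hypothetical. The sketch groups URI paths by segment count, measures the Shannon entropy of the values seen at each segment position, and replaces positions whose entropy exceeds the threshold.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalize_uris(uris, threshold=1.0, placeholder="{var}"):
    """Replace high-entropy path segments with a placeholder.

    Assumption: paths with the same number of segments are compared
    position by position; the real transformations may group differently.
    """
    paths = [urlsplit(u).path.strip("/").split("/") for u in uris]

    # Group paths by segment count so positions are comparable.
    by_depth = {}
    for segments in paths:
        by_depth.setdefault(len(segments), []).append(segments)

    # For each depth, find the positions whose values vary a lot.
    variable_positions = {
        depth: {
            i
            for i in range(depth)
            if shannon_entropy([s[i] for s in group]) > threshold
        }
        for depth, group in by_depth.items()
    }

    return [
        "/" + "/".join(
            placeholder if i in variable_positions[len(segments)] else seg
            for i, seg in enumerate(segments)
        )
        for segments in paths
    ]


if __name__ == "__main__":
    sample = [
        "https://example.org/wiki/A",
        "https://example.org/wiki/B",
        "https://example.org/wiki/C",
        "https://example.org/wiki/D",
    ]
    print(normalize_uris(sample))
```

In this sketch the `threshold` value plays the role of the kind of parameter the notebooks are meant to tune: a lower value collapses more segments into the placeholder, a higher value keeps more literal segments.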