Implement ndjson event source

Ebernhardson requested to merge work/ebernhardson/ndjson-source into main

We need to generate a set of rerender events from the index dumps stored in Hadoop. Ingesting ndjson-formatted text files seems like a reasonable way to convey the set of pages to be rerendered from that system to this script.
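For reviewers unfamiliar with the format, here is a minimal sketch of what reading ndjson involves: one JSON object per line, decoded independently. The function name `iter_ndjson` and the `wiki` / `page_id` fields are hypothetical and chosen for illustration only; they are not taken from this merge request.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded JSON object per non-blank line of the stream."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the offending line so a bad dump is easy to locate.
            raise ValueError(f"line {line_no}: invalid JSON: {exc}") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        yield record


# Hypothetical input: the field names are assumptions, not the real schema.
sample = io.StringIO(
    '{"wiki": "enwiki", "page_id": 1}\n'
    "\n"
    '{"wiki": "dewiki", "page_id": 2}\n'
)
records = list(iter_ndjson(sample))
```

Because each line is decoded on its own, the file can be streamed without loading the whole dump into memory.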

Because this rerender will cover a few hundred thousand pages, we also introduce rate limiting. The default limits spread a hundred thousand rerenders over ~17 minutes (about 100 per second), which helps ensure we don't backlog the updater while these rerenders are processed.
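As a rough illustration of the pacing described here, the sketch below spaces items evenly at a fixed rate. It is not the limiter implemented in this merge request: the names `rate_limited` and `estimated_minutes` are hypothetical, and the rate of 100 per second is only inferred from the stated figures (100,000 events in about 17 minutes).

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def rate_limited(
    items: Iterable[T],
    per_second: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than per_second, spacing them evenly."""
    if per_second <= 0:
        raise ValueError("per_second must be positive")
    interval = 1.0 / per_second
    next_at = clock()
    for item in items:
        delay = next_at - clock()
        if delay > 0:
            sleep(delay)
        yield item
        # Schedule from whichever is later, so a slow consumer
        # does not earn a burst of back-to-back items afterwards.
        next_at = max(next_at, clock()) + interval


def estimated_minutes(count: int, per_second: float) -> float:
    """Approximate wall-clock minutes needed to emit count items."""
    return count / per_second / 60.0


# Assumed rate of 100/s: 100,000 events take roughly 16.7 minutes.
minutes = estimated_minutes(100_000, 100.0)
```

The `clock` and `sleep` parameters exist only so the pacing can be tested without real waiting; production use would leave them at their defaults.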

Bug: T372446
