producer: Split private events into their own stream
We currently allow the consumer to attempt to update all indexes on all clusters, but some of those clusters should only have private wiki indexes. This happens to work today because we disallow automatic index creation, but that seems like a weak guarantee. Split events into two output topics, one public and one private, as we already do in MediaWiki, to provide a stronger guarantee.
On the producer side, adds a step prior to sinking the events that splits them into public/private streams by looking up their wikiId in the sitematrix. When this component sees an unknown wiki it refreshes the sitematrix, but at most once per minute. If a wiki cannot be conclusively identified as either public or private it defaults to private, as that seems to be the safest option. This means cloudelastic may miss the first few events of a new wiki (depending on the order of operations between creating the wiki and adding it to the sitematrix).
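A minimal sketch of the routing rule described above, written in Python for illustration only; it is not the code in this patch. The class name, the `fetch_sitematrix` callable, and the `"public"`/`"private"` labels are hypothetical, and counting the initial load against the once-per-minute limit is an assumption.

```python
import time
from typing import Callable, Dict


class WikiVisibilityRouter:
    """Hypothetical router that picks an output stream for each event."""

    def __init__(
        self,
        fetch_sitematrix: Callable[[], Dict[str, bool]],
        refresh_interval_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_sitematrix is assumed to return {wiki_id: is_private}.
        self._fetch = fetch_sitematrix
        self._interval = refresh_interval_s
        self._clock = clock
        self._is_private = dict(fetch_sitematrix())
        # Assumption: the initial load counts as a refresh.
        self._last_refresh = clock()

    def route(self, wiki_id: str) -> str:
        if wiki_id not in self._is_private:
            self._maybe_refresh()
        # A wiki that is still unknown defaults to private, the safe option.
        return "private" if self._is_private.get(wiki_id, True) else "public"

    def _maybe_refresh(self) -> None:
        # Refresh only if a full interval has passed since the last one.
        now = self._clock()
        if now - self._last_refresh >= self._interval:
            self._is_private = dict(self._fetch())
            self._last_refresh = now
```

With this rule, an event for a wiki created between refreshes is routed to the private stream until the next permitted refresh picks it up, which is the window in which cloudelastic can miss events.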
On the consumer side, adds a new config flag, consume-private-updates, which tells the consumer to read from both streams. It defaults to false and has to be explicitly enabled. When enabled, the private events are unioned with the public ones at the start of the streaming operation. This new flag also replaces the --saneitize-private-wikis flag, as it seems appropriate to always saneitize private wikis when they are enabled.
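A rough sketch of how the flag could drive the consumer's inputs, again in Python for illustration only. `ConsumerConfig`, `source_topics`, and the topic arguments are hypothetical names, not the identifiers used in this patch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConsumerConfig:
    # Mirrors consume-private-updates: off unless explicitly enabled.
    consume_private_updates: bool = False

    @property
    def saneitize_private_wikis(self) -> bool:
        # No separate flag: private wikis are saneitized whenever
        # private updates are consumed.
        return self.consume_private_updates


def source_topics(
    config: ConsumerConfig, public_topic: str, private_topic: str
) -> List[str]:
    """Return the topics to union at the start of the streaming job."""
    topics = [public_topic]
    if config.consume_private_updates:
        topics.append(private_topic)
    return topics
```

A cluster that must not hold private indexes keeps the default and never subscribes to the private topic, so the guarantee no longer depends on automatic index creation being disabled.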
Bug: T374335