
Refactor the fetch component and various fixes

DCausse requested to merge fetch-refactor-and-various-cleanup into main

Mostly prep work for adding support for page rerenders. The branch was starting to get big, so I preferred to push some parts early, also because we are not yet sure which strategy we are going to use regarding Kafka usage.

This branch contains:

  • refactor the fetch component to make it re-usable in other places
  • add minimal support for page_rerenders (only internal components, stream is not connected)
    • Add InputConverters support
    • Add merge support (for now we might only deduplicate them; actually merging them might be tricky)
  • change the way we pass events to the fetcher to avoid buffering too many events in the reordering and merge windows
  • use Row to store the updated fields instead of a json byte array
  • disable generic types to ensure that we never rely on kryo or default java serialization
    • had to introduce a FetchResult POJO in addition to RetryContext
    • had to give up on some object construction constraints (stop using final fields, add a default constructor and stop using Set in the data model) to make them compatible with Flink's PojoTypeInfo
  • Introduce a small CirrusDocJsonHelper class to manipulate the cirrus doc json API response
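
To illustrate the POJO constraints mentioned above, here is a minimal sketch of a class shaped to satisfy Flink's POJO rules: a public no-argument constructor, non-final fields reachable through getters and setters, and a List where a Set would otherwise be natural. The class name FetchResultSketch and its fields (pageTitle, attempts, missingFields) are hypothetical and are not the FetchResult introduced in this branch. In a real job the class would also have to be declared public; it is left package-private here only so the snippet compiles from any file name. With generic types disabled (for instance via `env.getConfig().disableGenericTypes()`), Flink is expected to fail on a type that would need Kryo rather than silently fall back to it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical POJO-shaped result type; names are illustrative only. */
class FetchResultSketch {
    // Non-final fields with getters/setters, as the POJO rules require.
    private String pageTitle;
    private int attempts;
    // A List instead of a Set: uniqueness is enforced when the list is
    // built, not by the field's declared type.
    private List<String> missingFields = new ArrayList<>();

    /** Public no-argument constructor, needed for POJO instantiation. */
    public FetchResultSketch() {
    }

    /** Convenience factory that deduplicates while keeping insertion order. */
    public static FetchResultSketch of(String pageTitle, int attempts, Iterable<String> missingFields) {
        LinkedHashSet<String> unique = new LinkedHashSet<>();
        for (String field : missingFields) {
            unique.add(field);
        }
        FetchResultSketch result = new FetchResultSketch();
        result.setPageTitle(pageTitle);
        result.setAttempts(attempts);
        result.setMissingFields(new ArrayList<>(unique));
        return result;
    }

    public String getPageTitle() {
        return pageTitle;
    }

    public void setPageTitle(String pageTitle) {
        this.pageTitle = pageTitle;
    }

    public int getAttempts() {
        return attempts;
    }

    public void setAttempts(int attempts) {
        this.attempts = attempts;
    }

    public List<String> getMissingFields() {
        return missingFields;
    }

    public void setMissingFields(List<String> missingFields) {
        this.missingFields = missingFields;
    }
}
```

The trade-off is the one described in the list: immutability and set semantics are no longer guaranteed by the type itself, so any invariant has to be maintained by the code that builds the object.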
