Init catalog tables with a kafka watermark.
Bug: https://phabricator.wikimedia.org/T324144
This MR initialises TableDescriptor
s with Kafka watermarks.
The change adds a kafka_timestamp
virtual column to schemas accessed
through the catalog. kafka_timestamp
is a watermark field with a delay of 10 seconds. Both settings are eventutilities defaults. We might want to tune this in the future (or expose a property to let users tune column name and delay) but for now they seem a sensible starting point.
kafka_timestamp
carries "event time" Flink semantics (rowtime
) as described in https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/concepts/time_attributes/.
An example of tubmle window query on a watermaked table created with this patch can be found at: https://gitlab.wikimedia.org/-/snippets/47/edit.
Note that while the virtual column can be projected from the schema, it will not be persisted with INSERT
statements, thus not affecting existing sink
functionality.
See https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/ for details.