Eventutilities Python merge requestshttps://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/merge_requests2024-01-03T17:37:35Zhttps://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/merge_requests/85Finish conversions for pyflink and python native types2024-01-03T17:37:35ZOttomataaotto@wikimedia.orgFinish conversions for pyflink and python native typesWe previously only implemented conversion between PyFlink Rows
and python dicts. We'd prefer not to expose any of the PyFlink
types if we can help it. This change implements:
- Conversion of PyFlink Instants to python datetimes
- Conv...We previously only implemented conversion between PyFlink Rows
and python dicts. We'd prefer not to expose any of the PyFlink
types if we can help it. This change implements:
- Conversion of PyFlink Instants to python datetimes
- Conversion of any container type (Row, Array, Tuple) that might
contain a Row or datetime
Bug: T349640https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python/-/merge_requests/83Draft: manager: store proecessed events in a list state2023-10-18T15:36:44ZGmodenaDraft: manager: store proecessed events in a list state_Marking as Draft because I'd like to validate if this approach makes sense_
Keep a list of events that have already been processed in the current window,
to mitigate duplicate event processing (e.g. calls to api) in case of application..._Marking as Draft because I'd like to validate if this approach makes sense_
Keep a list of events that have already been processed in the current window,
to mitigate duplicate event processing (e.g. calls to api) in case of application
restarts. This is guaranteed within a single window firing
and not between windows firing: when resuming from a checkpoint (e.g. recovery) source offsets
(and list state) will rewind to the latest known checkpoint.
The list state is check pointed consistently by the system as part of the distributed snapshots.
Bug: T347282
cc / @dcausse @otto