Pull weightedTags out of rawFields on transcoder decode

Ebernhardson requested to merge work/ebernhardson/transcoder-weighted-tags into main

When the transcoder encodes an event, it moves the weighted tags from a specialized class property into the generic fields Row instance. On decode it copied the weighted tags back into the class property, but also left them in the fields Row. A second transcode after a fetch failure then failed in the encoder: the event carried two sets of tags, and we explicitly do not merge weighted tags at this stage.

The primary fix is to remove the weighted tags from the fields Row after decoding. Combined with a check that weighted tags are not part of the raw fields, this guarantees that weighted-tag values are the same before and after transcoding.
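The encode/decode contract described above can be sketched roughly as follows. This is not the code from this merge request: a plain `Map` stands in for the Flink `Row`, and the `Event` class, the `weighted_tags` key, and the method names are hypothetical, chosen only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WeightedTagsDecodeSketch {
    // Hypothetical field name; the key used by the real transcoder may differ.
    static final String WEIGHTED_TAGS = "weighted_tags";

    /** Hypothetical stand-in for the update event. */
    static final class Event {
        Map<String, List<String>> weightedTags; // specialized property
        Map<String, Object> rawFields;          // stand-in for the fields Row
    }

    /** Moves the weighted tags into a copy of the generic fields. */
    static Map<String, Object> encode(Event event) {
        // The check: raw fields must never already contain weighted tags,
        // because two sets of tags are not merged at this stage.
        if (event.rawFields.containsKey(WEIGHTED_TAGS)) {
            throw new IllegalStateException("weighted tags must not be part of raw fields");
        }
        Map<String, Object> encoded = new HashMap<>(event.rawFields);
        if (event.weightedTags != null) {
            encoded.put(WEIGHTED_TAGS, event.weightedTags);
        }
        return encoded;
    }

    /** Pulls the weighted tags back out, removing them from the generic fields. */
    @SuppressWarnings("unchecked")
    static Event decode(Map<String, Object> encoded) {
        Map<String, Object> fields = new HashMap<>(encoded);
        Event event = new Event();
        // remove() rather than get(): the tags must not stay behind in the fields.
        event.weightedTags = (Map<String, List<String>>) fields.remove(WEIGHTED_TAGS);
        event.rawFields = fields;
        return event;
    }
}
```

With `get` in place of `remove` in `decode`, the second `encode` of a decoded event would trip the check, which mirrors the failure seen on re-transcoding after a fetch failure.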

While updating this I found that the test case comparing the round-tripped events was incorrect, because the transcoding process mutated values inside the source update. Row.copy is now used to ensure we work with new Row objects and do not mutate the source events. Flink doesn't require this of us, but it seems easier to reason about.
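The mutation pitfall can be illustrated with a small sketch. Again a plain `Map` stands in for the Flink `Row`, with a `HashMap` copy playing the role of `Row.copy`; the class, method names, and `weighted_tags` key are hypothetical and not taken from this change.

```java
import java.util.HashMap;
import java.util.Map;

public class CopyBeforeEncodeSketch {
    // Hypothetical field name, for illustration only.
    static final String WEIGHTED_TAGS = "weighted_tags";

    /** Mutating variant: writes into the caller's map. */
    static Map<String, Object> encodeInPlace(Map<String, Object> fields, Object tags) {
        fields.put(WEIGHTED_TAGS, tags);
        return fields;
    }

    /** Copying variant: the caller's map is left untouched (shallow copy). */
    static Map<String, Object> encodeWithCopy(Map<String, Object> fields, Object tags) {
        Map<String, Object> copy = new HashMap<>(fields);
        copy.put(WEIGHTED_TAGS, tags);
        return copy;
    }
}
```

With the mutating variant, a test that compares the source event against the round-tripped one is comparing against an already modified source, so the comparison no longer says anything about the original input. The copying variant keeps the source as a stable baseline.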