Generalize revert risk pipelines

  • add a ModelVariant type to generalize model inference across the various revert risk models
  • add a transformation to run batch inference for all revert risk models simultaneously
  • untangle the columns used for "features" (JSON vs. maps vs. pyspark.ml.Vector)
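
As a rough illustration of the idea (not the code in this merge request), a variant type can pair a model name with its scoring function, so that one transformation can produce predictions for every model in a single pass. Everything here is hypothetical: the field names, the `predict_all` helper, and the stub scorers are assumptions, and plain Python dicts stand in for Spark rows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping, Sequence

# Hypothetical feature representation: a flat mapping of feature name to value.
Features = Mapping[str, float]


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical sketch: a named model plus the function that scores one row."""
    name: str
    predict: Callable[[Features], float]


def predict_all(
    variants: Sequence[ModelVariant], rows: Sequence[Features]
) -> List[Dict[str, float]]:
    """Score every row with every variant; one output dict per input row."""
    return [{v.name: v.predict(row) for v in variants} for row in rows]


# Stub scorers stand in for real models (assumption, for illustration only).
variants = [
    ModelVariant("language_agnostic", lambda f: 0.25),
    ModelVariant("multilingual", lambda f: f["x"] * 2),
]
predictions = predict_all(variants, [{"x": 0.5}])
```

Keeping the per-model logic behind a single callable means the batch step does not need to know which model it is running.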

Caveat: the production multilingual revert risk model dates from 2023 and was serialized with outdated versions of pytorch/transformers. To use the generalized "produce predictions for all models" pipeline, that model needs to be retrained (as it should be for quality reasons anyway).
