spark3 / gitlab migration

Ebernhardson requested to merge work/ebernhardson/conda-spark3 into main
  • Move requirements handling to conda to work with the data-engineering workflows for deploying Python in analytics.

  • Update version number to 2.0.0.dev. The major version bump reflects significant changes in how the project is built, not changes to the underlying functionality.

  • Update pyspark from 2.1.0 to 3.1.2.

  • Keep xgboost at 0.90 for now, which limits the supported Python version to <=3.7.

  • Remove custom mypy stubs; they don't seem worth maintaining.

  • Update the elasticsearch library to 7.10.1, matching the Elasticsearch version deployed in production and gaining library-provided mypy types.

  • Minor updates to make the code pass mypy analysis against the updated libraries and types. These are mostly things like converting a generator to a list, updating the ltr elasticsearch client for the new headers parameter passed by @query_params, and using queue.Queue generics directly.

  • Update SCM information in pom.xml, along with updating scala and spark dependencies for spark 3.1.2.
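
As a rough illustration of the kind of mypy-driven adjustments mentioned above (materializing a generator as a list and annotating queue.Queue with its element type), here is a minimal sketch. It is not taken from this change: doc_ids and drain are hypothetical names, and the queue.Queue annotations are written as strings on the assumption that the code must still run on Python 3.7, where queue.Queue cannot be subscripted at runtime.

```python
import queue
from typing import Iterator, List


def doc_ids(n: int) -> Iterator[int]:
    """Hypothetical generator standing in for a lazily produced result."""
    for i in range(n):
        yield i


def drain(q: "queue.Queue[int]") -> List[int]:
    """Collect everything currently in the queue without blocking."""
    items: List[int] = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


# Materialize the generator so code typed against List[int]
# (len(), indexing, repeated iteration) passes type checking.
ids: List[int] = list(doc_ids(3))

# Annotate the queue with its element type directly; the string form
# keeps the annotation from being evaluated at runtime.
work: "queue.Queue[int]" = queue.Queue()
for doc_id in ids:
    work.put(doc_id)
```

With the element type on the queue, mypy can check both the put calls and the list returned by drain without a custom stub.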

Change-Id: I16c9550091e22eacccec8f2f215472b200b79966