
Resolve "Update: Migrate to Spark3"

Appledora requested to merge 28-spark-upgrade into main

Create a new environment with conda-analytics-clone [ENV_NAME], then initialize the Spark session as shown below:

import wmfdata as wmf
spark = wmf.spark.create_custom_session(
    master="yarn",
    spark_config={
        "spark.driver.memory": "2g",
        "spark.dynamicAllocation.maxExecutors": 64,
        "spark.executor.memory": "8g",
        "spark.executor.cores": 4,
        "spark.sql.shuffle.partitions": 256
    },
    ship_python_env=True,
)

Unlike before, the spark object is no longer created with getOrCreate(); create_custom_session() returns it directly.
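If the same settings are needed in several notebooks, the arguments could be assembled in one place. The following is only a sketch: build_session_kwargs and DEFAULT_SPARK_CONFIG are hypothetical names that are not part of wmfdata, and the defaults are copied from the call above. It only builds the keyword arguments, so it runs without a cluster.

```python
# Defaults copied from the session setup in this merge request.
DEFAULT_SPARK_CONFIG = {
    "spark.driver.memory": "2g",
    "spark.dynamicAllocation.maxExecutors": 64,
    "spark.executor.memory": "8g",
    "spark.executor.cores": 4,
    "spark.sql.shuffle.partitions": 256,
}


def build_session_kwargs(overrides=None):
    """Hypothetical helper: build keyword arguments for create_custom_session.

    `overrides` is an optional dict of Spark settings that replace or extend
    the defaults; the defaults themselves are never mutated.
    """
    spark_config = dict(DEFAULT_SPARK_CONFIG)  # shallow copy of the defaults
    spark_config.update(overrides or {})
    return {
        "master": "yarn",
        "spark_config": spark_config,
        "ship_python_env": True,
    }


# Example: lower executor memory for a small job.
kwargs = build_session_kwargs({"spark.executor.memory": "4g"})

# On a host where wmfdata is installed, the session would then be created with:
#   import wmfdata as wmf
#   spark = wmf.spark.create_custom_session(**kwargs)
```

Keeping the defaults in a single dict means a per-notebook change only has to name the settings that differ.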

Closes #28 (closed)
