Resolve "Update: Migrate to Spark3"
Create a new environment with conda-analytics-clone [ENV_NAME],
then initialize the Spark session as follows:
import wmfdata as wmf

spark = wmf.spark.create_custom_session(
    master="yarn",
    spark_config={
        "spark.driver.memory": "2g",
        "spark.dynamicAllocation.maxExecutors": 64,
        "spark.executor.memory": "8g",
        "spark.executor.cores": 4,
        "spark.sql.shuffle.partitions": 256
    },
    ship_python_env=True,
)
Note: the spark object is no longer created with getOrCreate() as it was before.
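If several notebooks need the same settings, one possible way to avoid copying the config dict is sketched below. The names DEFAULT_SPARK_CONFIG, build_spark_config, and start_session are hypothetical and not part of wmfdata; only the create_custom_session call mirrors the one in this MR. wmfdata is imported lazily so the config helper can be used on its own.

```python
# Hypothetical helpers (not part of wmfdata): keep this MR's defaults in one place.
DEFAULT_SPARK_CONFIG = {
    "spark.driver.memory": "2g",
    "spark.dynamicAllocation.maxExecutors": 64,
    "spark.executor.memory": "8g",
    "spark.executor.cores": 4,
    "spark.sql.shuffle.partitions": 256,
}


def build_spark_config(overrides=None):
    """Return a copy of the defaults with optional per-notebook overrides."""
    config = dict(DEFAULT_SPARK_CONFIG)  # copy so the defaults are never mutated
    config.update(overrides or {})
    return config


def start_session(overrides=None):
    """Create the Spark session the same way as in this MR (assumes a YARN cluster)."""
    # Imported here so build_spark_config works without wmfdata installed.
    import wmfdata as wmf

    return wmf.spark.create_custom_session(
        master="yarn",
        spark_config=build_spark_config(overrides),
        ship_python_env=True,
    )
```

For example, start_session({"spark.executor.memory": "16g"}) would request larger executors while leaving the other settings unchanged.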
Closes #28 (closed)