
search: specify analytics-hadoop in urls

Ebernhardson requested to merge work/ebernhardson/hdfs-default-cluster-url into main

While both Spark and the typical HDFS tooling accept the hdfs:/// shorthand to request the default cluster, pyarrow refuses to use those URLs. As of discolytics 0.18.0, convert_to_esbulk.py uses pyarrow, so these URLs now need to name the analytics-hadoop cluster explicitly instead of relying on the default.
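To illustrate the kind of rewrite involved, here is a minimal sketch in plain Python. It is not code from this merge request: the helper name `with_explicit_cluster` and the example path are hypothetical, and only the cluster name `analytics-hadoop` comes from the title above.

```python
from urllib.parse import urlsplit, urlunsplit

# Cluster name taken from the merge request title; everything else is illustrative.
DEFAULT_CLUSTER = "analytics-hadoop"


def with_explicit_cluster(url: str, cluster: str = DEFAULT_CLUSTER) -> str:
    """Turn the hdfs:/// shorthand into a URL that names the cluster.

    URLs that already name a cluster, or that use another scheme,
    are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme != "hdfs" or parts.netloc:
        return url
    return urlunsplit(parts._replace(netloc=cluster))


# Hypothetical path, used only to show the shape of the change.
print(with_explicit_cluster("hdfs:///tmp/example"))
```

The helper only touches URLs with an empty authority, so applying it to a URL that already names a cluster is a no-op.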

This should have no downsides. Relying on the default-cluster shorthand wasn't buying us anything: we operate on a single cluster, and its name has never changed.
