Automate usage of fsspec hdfs URLs via new pyarrow HDFS API

Ottomata requested to merge fsspec_pyarrow into main
  • fsspec_use_new_pyarrow_api - call this to make fsspec use the new pyarrow HDFS API for all hdfs:// URLs. This workaround is only needed until https://github.com/fsspec/filesystem_spec/issues/874 is resolved.

  • set_hadoop_env_vars - sets the environment variables that the new pyarrow HDFS API needs. By default, fsspec_use_new_pyarrow_api() also calls this.
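
For readers unfamiliar with the approach, here is a rough sketch of what the two helpers might look like. It is not the code in this merge request: the `_sketch` function names, their parameters, and the choice to set only `CLASSPATH` (from `hadoop classpath --glob`) are assumptions made for illustration. The fsspec part assumes that `fsspec.implementations.arrow.ArrowFSWrapper` and `pyarrow.fs.HadoopFileSystem` are available in the installed versions.

```python
import os
import shutil
import subprocess


def set_hadoop_env_vars_sketch(hadoop_cmd="hadoop", force=False, environ=None):
    """Set CLASSPATH for libhdfs. Hypothetical helper, not the MR's code.

    Returns True if CLASSPATH was set, False if it was left untouched.
    """
    env = os.environ if environ is None else environ
    if "CLASSPATH" in env and not force:
        return False  # respect a value the user already configured
    exe = shutil.which(hadoop_cmd)
    if exe is None:
        return False  # no hadoop CLI on PATH, nothing to derive
    # `hadoop classpath --glob` prints the expanded jar list that libhdfs needs.
    result = subprocess.run(
        [exe, "classpath", "--glob"], capture_output=True, text=True, check=True
    )
    env["CLASSPATH"] = result.stdout.strip()
    return True


def fsspec_use_new_pyarrow_api_sketch(set_env=True):
    """Route hdfs:// URLs in fsspec to pyarrow.fs.HadoopFileSystem.

    Hypothetical helper, not the MR's code. Imports are deferred so that the
    module can be imported without fsspec or pyarrow installed.
    """
    import fsspec
    from fsspec.implementations.arrow import ArrowFSWrapper
    from pyarrow.fs import HadoopFileSystem

    if set_env:
        set_hadoop_env_vars_sketch()

    class NewApiHdfs(ArrowFSWrapper):
        protocol = "hdfs"

        def __init__(self, host="default", port=0, **kwargs):
            # host="default" asks libhdfs to use fs.defaultFS from the
            # Hadoop configuration (an assumption of this sketch).
            super().__init__(HadoopFileSystem(host=host, port=port), **kwargs)

    # clobber=True replaces fsspec's existing "hdfs" registration.
    fsspec.register_implementation("hdfs", NewApiHdfs, clobber=True)
```

With something like this in place, a caller would run `fsspec_use_new_pyarrow_api_sketch()` once at startup, after which `fsspec.open("hdfs:///some/path")` would go through the new API.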

https://phabricator.wikimedia.org/T300876