WMF Data Workflow Utils
workflow_utils / CHANGELOG.md
    Automate usage of fsspec hdfs URLs via new pyarrow HDFS API · e09ca0a4
    Ottomata authored Mar 01, 2022
    - fsspec_use_new_pyarrow_api - call this to make fsspec always use the
      new pyarrow HDFS API for all hdfs:// URLs.
      This is only needed until
      https://github.com/fsspec/filesystem_spec/issues/874 is resolved.
    
    - set_hadoop_env_vars - sets the env vars needed to work with the new pyarrow HDFS API.
      By default, fsspec_use_new_pyarrow_api() also calls this.
    
    https://phabricator.wikimedia.org/T300876
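
The two helpers described in the commit message above could look roughly like the following. This is a minimal sketch, not the code from this commit: the function names with a `_sketch` suffix, the `classpath_provider` and `force` parameters, and the choice to set only `CLASSPATH` are all assumptions made for illustration. It relies on the pyarrow documentation's recommendation to populate `CLASSPATH` from `hadoop classpath --glob`, and it assumes an fsspec version that ships `fsspec.implementations.arrow.HadoopFileSystem`.

```python
import os
import subprocess


def _hadoop_classpath(hadoop_bin="hadoop"):
    """Return the expanded Hadoop classpath via `hadoop classpath --glob`."""
    return subprocess.check_output(
        [hadoop_bin, "classpath", "--glob"], text=True
    ).strip()


def set_hadoop_env_vars_sketch(env=None, classpath_provider=_hadoop_classpath, force=False):
    """Hypothetical stand-in for set_hadoop_env_vars.

    Sets CLASSPATH, which the new pyarrow HDFS API needs in order to find
    the Hadoop jars. An existing value is kept unless force=True.
    """
    env = os.environ if env is None else env
    if force or not env.get("CLASSPATH"):
        env["CLASSPATH"] = classpath_provider()
    return env


def fsspec_use_new_pyarrow_api_sketch(should_set_env=True):
    """Hypothetical stand-in for fsspec_use_new_pyarrow_api.

    Re-registers the 'hdfs' protocol so that fsspec resolves hdfs:// URLs
    with the pyarrow.fs-based implementation.
    """
    if should_set_env:
        set_hadoop_env_vars_sketch()

    # Imported lazily so the env var helper stays usable without fsspec.
    import fsspec
    from fsspec.implementations.arrow import HadoopFileSystem

    # clobber=True replaces whatever is currently registered for 'hdfs'.
    fsspec.register_implementation("hdfs", HadoopFileSystem, clobber=True)
```

With something like this in place, a caller would invoke `fsspec_use_new_pyarrow_api_sketch()` once at startup, before opening any `hdfs://` URL through fsspec. Passing a custom `classpath_provider` makes the env var logic testable on machines without a Hadoop client installed.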