Packaging: set up repo for packaging
This might feel premature, but I think it will make a lot of things easier for us in the long term.
Motivation for prioritizing:
- Loading the abbreviation JSON: we'll want to use the built-in importlib.resources module. It seems to be the go-to for handling data files within Python packages, where file paths get complicated. The docs recommend importlib.resources.files() as the function to use, but that's fairly recent (3.9), so I think we should go with importlib.resources.open_text() for backwards-compatibility to Python 3.7. The guide for the library that provides this functionality to Python 3.6 and earlier (we can ignore that) has a good explanation of how it works. By making the repo a package, we can use the import system to load the files consistently.
- Running tests: right now we don't have a good non-hacky way of running tests. Because the package is nested under a src folder, running pytest won't automatically discover it, and pytest recommends doing a local install of the package for testing via pip install --editable . (note the trailing dot). So packaging means we can then start writing and running tests.
What we'd want to do:
- Mainly this is about creating a reasonable setup.py, like the one in mwparserfromhtml.
- We could remove src/__init__.py, which I don't think would be necessary for this.
- We should then update the relative imports to absolute ones, e.g., from .utils import capture_trailing_space -> from wikinlptools.utils import capture_trailing_space
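As a starting point for discussion, a setup.py for a src layout might look roughly like the sketch below. Everything other than the package name wikinlptools is an assumption: the version number, the "*.json" glob for data files, and the 3.7 floor (taken from the compatibility target above) would all need checking against the actual repo and against what mwparserfromhtml does.

```python
# setup.py (sketch; values other than the package name are placeholders)
from setuptools import find_packages, setup

setup(
    name="wikinlptools",
    version="0.0.1",  # placeholder version
    # Tell setuptools that packages live under src/ rather than the repo root.
    package_dir={"": "src"},
    packages=find_packages(where="src"),
    # Ship JSON data files with the package so importlib.resources can find them.
    package_data={"wikinlptools": ["*.json"]},
    python_requires=">=3.7",
)
```

With something like this in place, pip install --editable . from the repo root should make from wikinlptools.utils import capture_trailing_space resolve both in tests and in normal use.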