Create list of misspellings from enwiktionary per language
enwiktionary contains a template for misspellings (misspelling_of). We extract all entries in enwiktionary that are tagged with this template. For each misspelling, we want to capture:
- language
- misspelled word
- correctly spelled word
- top-level section the template appeared under, e.g., English
- number of subsections in that section, e.g., 2 (like noun and verb)
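The fields above could be pulled from a single page's wikitext roughly as follows. This is a minimal sketch, not a settled implementation, and it rests on several assumptions: the template's first two positional parameters are the language code and the correct spelling, the page title is the misspelled word, top-level sections are level-2 headings (`==English==`), and "subsections" means the level-3 headings directly inside that section. The function name `extract_misspellings` and the field names are hypothetical.

```python
import re

# Wikitext headings such as "==English==" (level 2) or "===Noun===" (level 3).
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)
# The template name is accepted with a space or an underscore.
TEMPLATE_RE = re.compile(r"\{\{\s*misspelling[ _]of\s*\|([^{}]*)\}\}", re.IGNORECASE)


def extract_misspellings(title, wikitext):
    """Return one dict per misspelling template found on the page."""
    headings = [(m.start(), len(m.group(1)), m.group(2))
                for m in HEADING_RE.finditer(wikitext)]
    top_level = [(pos, name) for pos, level, name in headings if level == 2]
    rows = []
    for i, (start, name) in enumerate(top_level):
        # A top-level section runs until the next level-2 heading (or end of page).
        end = top_level[i + 1][0] if i + 1 < len(top_level) else len(wikitext)
        # Assumption: count only level-3 headings inside this section.
        n_sub = sum(1 for pos, level, _ in headings
                    if level == 3 and start < pos < end)
        for m in TEMPLATE_RE.finditer(wikitext, start, end):
            params = [p.strip() for p in m.group(1).split("|")]
            positional = [p for p in params if "=" not in p]
            if len(positional) < 2:
                continue  # assumption: need language code and correct spelling
            rows.append({
                "language": positional[0],
                "misspelling": title,  # assumption: page title is the misspelling
                "correct": positional[1],
                "section": name,
                "subsections": n_sub,
            })
    return rows
```

Nested templates inside the parameters are not handled by the regular expression; a real wikitext parser would be more robust for those cases.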
The desired output is a TSV file containing all entries found.
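Writing such a file might look like the sketch below, which uses Python's standard `csv` module with a tab delimiter. The column names and their order are assumptions chosen for illustration, as is the helper name `write_tsv`; the task itself does not fix a header.

```python
import csv

# Hypothetical column names, one per field listed in the task.
COLUMNS = ["language", "misspelling", "correct", "section", "subsections"]


def write_tsv(rows, out):
    """Write the rows (dicts keyed by COLUMNS) to a file-like object as TSV."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

For example, `with open("misspellings.tsv", "w", encoding="utf-8", newline="") as f: write_tsv(rows, f)` would produce one header line followed by one line per entry.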