Don't fail to iterate dumps

When running scripts/detect_html_tables.py in the section-topics repo, we hit an error while iterating a dump: mwparserfromhtml fails on dump output where an article's title may not be present (observed with an arwiki article).

mwparserfromhtml should handle such incomplete data more gracefully, e.g. by skipping the affected article instead of aborting the whole iteration.
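A minimal sketch of the "skip the article" behaviour, offered as an illustration rather than as the library's actual fix. `iter_skipping_malformed` and `StubArticle` are hypothetical names; `StubArticle` only imitates the failing attribute access and is not mwparserfromhtml's `Article`. A fix inside the library would more likely check for the missing title explicitly than catch `AttributeError` this broadly.

```python
import logging
from typing import Callable, Iterable, Iterator, Tuple, Type, TypeVar

logger = logging.getLogger(__name__)

T = TypeVar("T")
R = TypeVar("R")


def iter_skipping_malformed(
    records: Iterable[T],
    build: Callable[[T], R],
    skip_on: Tuple[Type[BaseException], ...] = (AttributeError,),
) -> Iterator[R]:
    """Yield build(record) for each record, skipping records that fail to build."""
    for index, record in enumerate(records):
        try:
            article = build(record)
        except skip_on as exc:
            # Log and move on so one incomplete record does not stop the run.
            logger.warning("Skipping record %d: %s", index, exc)
            continue
        yield article


class StubArticle:
    """Hypothetical stand-in for an article class (assumption, not the real API)."""

    def __init__(self, record: dict):
        # Raises AttributeError when the title is missing (None),
        # similar in kind to the failure reported in this issue.
        self.title = record.get("title").strip()


records = [{"title": " First "}, {}, {"title": "Third"}]
titles = [a.title for a in iter_skipping_malformed(records, StubArticle)]
```

Here the second record has no title, so it is logged and skipped while the other two are still yielded.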

Stack trace from one such run:

Traceback (most recent call last):
  File "/srv/home/mlitn/section-topics/scripts/detect_html_tables.py", line 146, in <module>
    main()
  File "/srv/home/mlitn/section-topics/scripts/detect_html_tables.py", line 139, in main
    df = spark.createDataFrame(dataset)
  File "/home/mlitn/.conda/envs/section_topics_for_dev/lib/python3.10/site-packages/pyspark/sql/session.py", line 675, in createDataFrame
    return self._create_dataframe(data, schema, samplingRatio, verifySchema)
  File "/home/mlitn/.conda/envs/section_topics_for_dev/lib/python3.10/site-packages/pyspark/sql/session.py", line 700, in _create_dataframe
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/home/mlitn/.conda/envs/section_topics_for_dev/lib/python3.10/site-packages/pyspark/sql/session.py", line 509, in _createFromLocal
    data = list(data)
  File "/srv/home/mlitn/section-topics/scripts/detect_html_tables.py", line 41, in generate_dataset
    for article in html_dump:
  File "/home/mlitn/.conda/envs/section_topics_for_dev/lib/python3.10/site-packages/mwparserfromhtml/dump/dump.py", line 71, in read_dump_local
    yield Article(article)
  File "/home/mlitn/.conda/envs/section_topics_for_dev/lib/python3.10/site-packages/mwparserfromhtml/parse/article.py", line 24, in __init__
    self.title = self.parsed_html.title.text
AttributeError: 'NoneType' object has no attribute 'text'
