add content gaps files

Merged AikoChou requested to merge aiko/one_dataframe into main

content_gaps.py

  • add get_wikidata_qitems to extract all Wikidata qitems
  • add get_wikidata_properties to extract all property-value pairs for every Wikidata qitem
  • split aggregate_item_property into multiple functions, each handling one property
  • change get_wikipedia_revision_text to filter out redirect pages by joining the mediawiki_page table
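
The redirect filter in get_wikipedia_revision_text can be pictured as an inner join against the non-redirect rows of the page table. The sketch below is a plain-Python stand-in for that join, not the code in this merge request; the sample rows, the `drop_redirects` helper, and the `page_id` / `page_is_redirect` column names are assumptions made for illustration.

```python
# Hypothetical sample rows. The column names are assumptions,
# not taken from content_gaps.py.
revisions = [
    {"page_id": 1, "wikitext": "Article text A"},
    {"page_id": 2, "wikitext": "#REDIRECT [[Article A]]"},
    {"page_id": 3, "wikitext": "Article text B"},
]
pages = [
    {"page_id": 1, "page_is_redirect": False},
    {"page_id": 2, "page_is_redirect": True},
    {"page_id": 3, "page_is_redirect": False},
]


def drop_redirects(revisions, pages):
    """Keep only revisions whose page_id matches a non-redirect page row.

    This mirrors an inner join on page_id with a filter on the page side.
    """
    non_redirect_ids = {p["page_id"] for p in pages if not p["page_is_redirect"]}
    return [r for r in revisions if r["page_id"] in non_redirect_ids]


kept = drop_redirects(revisions, pages)
print([r["page_id"] for r in kept])
```

Because the join is an inner join, revisions with no matching page row are dropped as well, which may or may not be what the real pipeline wants.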

content_gaps_metrics.py

  • show the flow for building one combined dataframe (article features not included yet)
  • examples for computing metrics
  • examples for plotting graphs
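
As a rough idea of what a metric over the combined dataframe could look like, here is a minimal sketch that computes the share of items per category value. It uses plain Python lists instead of a dataframe; the rows, the `value` column, and the `share_by_category` helper are hypothetical and do not come from content_gaps_metrics.py.

```python
from collections import Counter

# Hypothetical rows standing in for the combined dataframe:
# one row per item, with one categorical property value (None if missing).
rows = [
    {"item": "item_1", "value": "a"},
    {"item": "item_2", "value": "b"},
    {"item": "item_3", "value": "b"},
    {"item": "item_4", "value": None},
]


def share_by_category(rows, column):
    """Return each category's share among rows that have a value in `column`."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


shares = share_by_category(rows, "value")
print(shares)
```

Rows with a missing value are excluded from the denominator here; counting them as their own category is an equally reasonable choice and changes the shares.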

article_features.py

  • wrap multiple wikitext functions from Isaac's example code into one UDF
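
Wrapping several wikitext functions in one UDF means each article's text is passed to Python once and all features come back together. The sketch below shows the pattern in plain Python. The three feature functions and their regular expressions are simplified stand-ins written for illustration, not Isaac's code or the functions in article_features.py.

```python
import re


def count_references(wikitext):
    """Count opening <ref> tags, including self-closing <ref name="..." />."""
    return len(re.findall(r"<ref[\s>]", wikitext))


def count_headings(wikitext):
    """Count lines that look like '== Heading ==' at any level of two or more."""
    return len(re.findall(r"^==+[^=].*?==+\s*$", wikitext, flags=re.MULTILINE))


def count_wikilinks(wikitext):
    """Count [[...]] links without nested brackets."""
    return len(re.findall(r"\[\[[^\[\]]+\]\]", wikitext))


def wikitext_features(wikitext):
    """Single entry point that returns all features for one article."""
    if wikitext is None:
        wikitext = ""
    return {
        "references": count_references(wikitext),
        "headings": count_headings(wikitext),
        "wikilinks": count_wikilinks(wikitext),
    }


sample = (
    "== History ==\n"
    "Text with [[link]] and [[other|label]].<ref>Source</ref>\n"
    "=== Early life ===\n"
    'More text.<ref name="a" />'
)
features = wikitext_features(sample)
print(features)
```

In a Spark job, a function like `wikitext_features` could be registered with `pyspark.sql.functions.udf` and a struct return type so that each feature lands in its own column; that wiring is left out here to keep the sketch free of dependencies.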