add content gaps files
content_gaps.py
- add get_wikidata_qitems to extract all wikidata qitems
- add get_wikidata_properties to extract all property-value pairs for all wikidata qitems
- change aggregate_item_property to multiple functions. one function handles one property
- change get_wikipedia_revision_text: remove redirect pages by joining mediawiki_page table
content_gaps_metrics.py
- show the flow of getting one big dataframe (not including article features yet)
- examples for computing metrics
- examples for plotting graph
article_features.py
- wrap multiple wikitext functions from Isaac's example code into one UDF
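
A minimal sketch of the wrapping idea in article_features.py, for reviewers who want the shape without opening the file. The three extractors and the name extract_article_features are hypothetical stand-ins, not the functions from Isaac's example code; the point is only that one wrapper parses the wikitext once and returns all features together, so a single UDF call per row would be enough.

```python
import re

# Hypothetical stand-ins for the per-feature wikitext functions.
def count_wikilinks(wikitext):
    # Counts [[Target]] and [[Target|label]] style links.
    return len(re.findall(r"\[\[[^\]]+\]\]", wikitext))

def count_headings(wikitext):
    # Counts lines of the form == Heading == (level 2 and deeper).
    return len(re.findall(r"^==+[^=].*?==+\s*$", wikitext, flags=re.MULTILINE))

def count_refs(wikitext):
    # Counts opening <ref> or <ref name=...> tags, not closing tags.
    return len(re.findall(r"<ref[\s>]", wikitext))

def extract_article_features(wikitext):
    """Run every extractor and return (wikilinks, headings, refs)."""
    if wikitext is None:
        # Rows with missing revision text get zero for every feature.
        return (0, 0, 0)
    return (
        count_wikilinks(wikitext),
        count_headings(wikitext),
        count_refs(wikitext),
    )
```

In a Spark job, a wrapper like this could be registered through pyspark.sql.functions.udf with a struct return type; that registration step is left out here so the sketch stays runnable without Spark.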