Article quality score pipeline
- Article quality model: https://github.com/geohci/miscellaneous-wikimedia/blob/master/article-features/article_data.ipynb
- V2 version: https://github.com/geohci/miscellaneous-wikimedia/blob/master/article-features/quality_model_features_V2.ipynb
- Move code from notebooks into a Python package.
- If applicable, move common code to the research-ml repository and add a dependency on it from this repo. Action: no-op. There is not enough shareable code in the repo yet. Once we work on several other projects, we can identify shareable code and move it to research-ml.
- Provide a function that returns a Spark DataFrame of article quality scores to join against, for use in notebooks. Tracked in #4.
- Is it possible to generate article quality over time? Answer: Yes. Running the code at different times yields different quality scores, because features such as the number of images or references change over time.
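
To illustrate the quality-over-time answer, here is a minimal sketch of scoring the same article at two snapshots. The feature names, weights, caps, and snapshot values are all hypothetical and chosen only for illustration; they are not the values used by the article quality model in the notebooks linked above.

```python
from typing import Dict, Mapping

# Hypothetical feature weights, for illustration only (not the real model's values).
WEIGHTS: Dict[str, float] = {"num_images": 0.2, "num_refs": 0.5, "num_sections": 0.3}
# Hypothetical caps used to scale each raw count into the range [0, 1].
CAPS: Dict[str, int] = {"num_images": 5, "num_refs": 20, "num_sections": 10}


def quality_score(features: Mapping[str, int]) -> float:
    """Return a weighted sum of capped, scaled feature counts, in [0, 1]."""
    return sum(
        weight * min(features.get(name, 0), CAPS[name]) / CAPS[name]
        for name, weight in WEIGHTS.items()
    )


# Made-up feature counts for one article at two snapshot dates.
snapshots = {
    "2021-01": {"num_images": 1, "num_refs": 4, "num_sections": 3},
    "2022-01": {"num_images": 3, "num_refs": 12, "num_sections": 6},
}

# The scoring code is unchanged; only the input features differ per snapshot.
scores_over_time = {date: quality_score(f) for date, f in snapshots.items()}

for date, score in sorted(scores_over_time.items()):
    print(f"{date}: {score:.2f}")
```

The scoring function stays fixed while the inputs change, so any difference between snapshots comes only from the article's features, such as added images or references.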
Edited by Bmansurov