- 16 Mar, 2021 1 commit
Gmodena authored
* Add script to generate and export production datasets
* Move hql script to ddl
* Document publish.sh
* Add some crude metrics reporting
* Store artifacts and metrics by run identifier
* Fix variable names
* Adjust var names, record timestamps in metrics
* Enable dynamic partitioning
* Add snapshot partition to production dataset
* Fix dir name
* Update publish.sh doc
* Make virtual env before activating
* Fix: confidence_rating to source mapping
* Add export data summary
* Update validation notebook with regression cases
* Add test for confidence mapping
* Fix: call uuid4 for default dataset_id
* Fix missing comma in column list
* Export NULL values as empty strings
* Generate data for all languages
* Update data export changelog
* Update data export changelog: set month to March
* Clean up validation notebook
* Load validation data from hive
* Fix character escaping
-
- 04 Mar, 2021 3 commits
Miriam Redi authored
T275685 automate pytest
-
Miriam Redi authored
T275162 enable spark metrics collection
-
Gmodena authored
-
- 02 Mar, 2021 9 commits
- 01 Mar, 2021 2 commits
- 26 Feb, 2021 2 commits
- 24 Feb, 2021 1 commit
-
-
Gabriele Modena authored
-
- 23 Feb, 2021 6 commits
-
-
Gabriele Modena authored
-
Gmodena authored
-
Gmodena authored
-
Gmodena authored
-
Gmodena authored
-
Miriam Redi authored
T274798 include all unillustrated articles
-
- 22 Feb, 2021 3 commits
- 17 Feb, 2021 2 commits
- 16 Feb, 2021 4 commits
-
-
Gmodena authored
The semantics of the algorithm output have changed to include all unillustrated articles, including articles with no matching images.
-
Gmodena authored
-
Miriam Redi authored
Production data etl
-
Miriam Redi authored
-
- 15 Feb, 2021 2 commits
-
-
Miriam Redi authored
Automate generation of .tsv files
-
Gmodena authored
-
- 10 Feb, 2021 4 commits
- 08 Feb, 2021 1 commit
-
-
Gmodena authored
-