    T275685 generate production datasets (#7) · 05888e6a
    Gmodena authored
    * Add script to generate and export production datasets
    * Move hql script to ddl
    * Document publish.sh
    * Add some crude metrics reporting
    * Store artifacts and metrics by run identifier
    * Fix variable names
    * Adjust var names, record timestamps in metrics
    * Enable dynamic partitioning
    * Add snapshot partition to production dataset
    * Fix dir name
    * Update publish.sh doc
    * Make virtual env before activating
    * Fix: confidence_rating to source mapping
    * Add export data summary
    * Update validation notebook with regression cases
    * Add test for confidence mapping
    * Fix: call uuid4 for default dataset_id
    * Fix missing comma in column list
    * Export NULL values as empty strings
    * Generate data for all languages
    * Update data export changelog
    * Update data export changelog: set month to March
    * Clean up validation notebook
    * Load validation data from hive
    * Fix character escaping