Populate Hive tables that will feed Cassandra

Xcollazo requested to merge T311289-xcollazo into T311289-squashed

In this MR we populate the following tables with the section_header column:

suggestions
title_cache
instanceof_cache
search_index_full
search_index_delta

This MR includes many commits, as I wanted to keep the history of @cparle's contributions before I took over his branch at https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/tree/T311289.

This MR includes one commit on top of a squashed version of @cparle's T311289 branch. This squashed version is available at T311289-squashed.

As a follow-up, we need to make job dependency management easier by exporting this and other jobs' data to Hive tables. Will do that separately.

Bug: T328672
