Parameterization of stream_manager
-
stream_manager config is now loaded by jsonargparse instantiate_classes.
-
stream_manager API is now one of: -- stream_manager(**kwargs) -- stream_manager.from_config(load_config()) -- custom_parser = ArgumentParser custom_parser.add_argument("--custom_arg1", type=str) config = load_config(custom_parser) stream_manager.from_config(config)
-
FlinkStreamPipeline.Builder is gone
-
ConnectorDescriptor is a simple data class that encompasses stream descriptors, connector names (e.g. kafka), and other options needed for instantiating a connector.
-
SourceDescriptor and SinkDescriptor extend from ConnectorDescriptor, and can use Flink env and EventDataStreamFactory to instantiate sources, datatstreams, and sinks and sink to.
-
stream_manager is now parameterized with config and descriptors
-
This allow us to use jsonargparse add_class_arguments to automatically generate an argparser for CLI and config files for stream_manager.
-
Pipeline developers can add additional args to the returned argparser. E.g. mediawiki_api_endpoint, etc.
-
Small refactor that adds a new stream_manager
auto_error_destination_enabled
param, instead of relying on boolean union type oferror_destination
.
Bug: T328478