This MR introduces the following changes:
- Adds GitLab CI, Maven, Scala and gitignore configs.
- Targets JDK8, which is the version YARN runs on. Note: JDK8 is deprecated as of Flink 1.15, and support will be removed in the future.
- Adds dependencies on the newly released Flink 1.15 and flink-scala 2.12 (see below). See the release notes for more information.
- Adds a basic GitLab CI pipeline for running tests. Artifact publishing is out of scope for now, but integration with maven-publish is trivial.
Project config
The Maven config for this project is based on:
- https://github.com/wikimedia/wikimedia-discovery-discovery-parent-pom
- https://github.com/wikimedia/wikidata-query-rdf
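For orientation, the JDK 8 target and the Flink 1.15 dependencies described above could look roughly like the fragment below. This is a minimal sketch, not the pom shipped in this MR: the `flink.version` property name, the `1.15.0` patch version, the `provided` scope and the choice of `flink-streaming-java` as the core artifact are assumptions, and the parent pom declaration is left out.

```xml
<!-- Illustrative pom.xml fragment; names and versions are assumptions, see the note above. -->
<properties>
  <!-- Target JDK 8, the version YARN runs on. -->
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
  <!-- Assumed property name and patch version. -->
  <flink.version>1.15.0</flink.version>
</properties>

<dependencies>
  <!-- As of Flink 1.15 the Java streaming API artifact has no Scala suffix. -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
  </dependency>
  <!-- The Scala API keeps its Scala version suffix (2.12 here). -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-scala_2.12</artifactId>
    <version>${flink.version}</version>
  </dependency>
</dependencies>
```

If we later drop flink-scala and wrap the Java API directly, only the second dependency would need to go.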
Scala-free Flink
Flink 1.15 is a Scala-free release:
All Scala dependencies are now isolated to the flink-scala jar. To remove Scala from the user-code classpath, remove this jar from the lib directory of the Flink distribution.
In our project we could drop support for the flink-scala package and wrap the Java API directly. This mailing list thread discusses possible approaches and the future of Scala support in Flink:
https://lists.apache.org/list?user@flink.apache.org:lte=1M:Practical%20guidance%20with%20Scala%20and%20Flink
While wrappers already exist that support modern Scala versions (2.13/3.x), the boilerplate in this MR still depends on the flink-scala package. Once we start writing code, we could look at wrapping the Java API directly (ideally with Scala 3).
@tchin if you are keen on experimenting, maybe this is something we could kick off in the context of the image suggestion feedback job.
Scala wrappers to consider (from the mailing-list thread):
TODOs:
- Review Scala version and deps management in 1.15
Tagging @tchin @otto @joal @dcausse as (optional) reviewers.