Cost-effective GraphQL Queries against Kafka Topics at scale

In our projects, we often have to query the content of Kafka topics. To that end, we expose REST APIs based on Kafka Streams' interactive queries. However, this approach has shortcomings. Users must call several APIs and stitch the results together themselves. It can also become costly, because each topic's API requires one or more JVMs. In this talk, we show how GraphQL can answer a single query that spans multiple Kafka topics and returns only the data the user requested. Our approach removes the unnecessary overhead and the inflexibility of traditional API approaches on Kafka topics. We will also highlight two ways to reduce the cost of computational resources such as CPU and RAM. First, we make queries more efficient through smart sub-query routing. Second, we build an ahead-of-time compiled, self-contained executable with GraalVM's Native Image and compare it to the traditionally packaged JAR in terms of memory usage and performance for different query workloads.
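
As a rough illustration of what sub-query routing can mean when data is partitioned by key, the sketch below groups the keys a query needs by the replica that hosts their partition, so that each replica receives one batched sub-query instead of requests being forwarded between replicas. This is an assumption about the general idea, not the implementation presented in the talk: the names partition_for and route_sub_queries, the host map, and the CRC32 hash are all hypothetical stand-ins (Kafka's default partitioner uses a different hash).

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition.

    CRC32 is only a deterministic stand-in here; a real setup would have to
    use the same partitioner that produced the topic.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route_sub_queries(keys, num_partitions, partition_hosts):
    """Group keys by the host that serves their partition.

    partition_hosts is a hypothetical mapping {partition: host}. The result
    maps each host to the list of keys it can answer locally.
    """
    batches = defaultdict(list)
    for key in keys:
        host = partition_hosts[partition_for(key, num_partitions)]
        batches[host].append(key)
    return dict(batches)


if __name__ == "__main__":
    # Hypothetical deployment: three partitions spread over two replicas.
    hosts = {0: "replica-a:8080", 1: "replica-b:8080", 2: "replica-a:8080"}
    for host, batch in route_sub_queries(["user-1", "user-2", "user-3"], 3, hosts).items():
        print(host, batch)
```

With such a mapping, the gateway would send at most one request per replica for a given field, regardless of how many keys the query touches.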

Torben Meyer
Data Engineer, bakdata GmbH