KAFKA SUMMIT LONDON

April 25 - 26, 2022

Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink in Apache Flink

Date : April 26, 2022

Time : 11:00 AM - 11:45 AM

Apache Kafka is one of the most commonly used connectors with Apache Flink for exactly-once streaming use cases. The combination of both systems allows you to build mission-critical systems that require low end-to-end latency and exactly-once processing eg. banks processing transactions. In Apache Flink 1.14, we released a new KafkaSink based on Apache Flink’s unified Sink interface that natively supports streaming and batch executions. However, we needed to stretch Kafka’s transactions API to fully support exactly-once processing in Flink. In this talk, we will start with a quick recap of Apache Kafka’s transactions and Flink’s checkpointing mechanism. Then, we describe the two-phase commit protocol implemented in KafkaSink in-depth and emphasize the difficulties we have overcome when applying Kafka’s transaction API to longer-lasting transactions. We explain how we ensure performant writing to Apache Kafka and how the KafkaSink recovery works. In summary, this talk should give users a deep dive into how Apache Flink leverages Apache Kafka’s transactions and developers an overview of what they have to consider when using Apache Kafka’s transactions.

Speakers

Fabian Paul

Software Engineer, Ververica

Privacy Policy | Terms & Conditions,
Apache, Apache Kafka, Kafka, Apache Flink, Flink and associated open source project names are trademarks of the Apache Software Foundation.
The Apache Software Foundation has no affiliation with and does not endorse, or review the materials provided at this event
Copyright © Confluent, Inc. 2016 - 2024

#kafkasummit