Developing Custom Transformation in the Kafka Connect to Minimize Data Redundancy
Date: September 15, 2021
Time: 11:00 AM - 11:15 AM

Compacted topics grow over time and often reside on high-performance, low-latency, and relatively expensive storage. Reducing duplicated data plays a critical role in controlling the size of compacted topics. With less data on the topics, the Kafka cluster consumes less disk space, which in turn lowers operating costs. In this use case-driven talk, we will demonstrate how our team at UnitedHealth Group leveraged existing transformers to extract data from the message metadata in the topic, and how we developed our own custom transformers to minimize the amount of duplicated data in each message in the topic.

Siavash Sedghi
Lead Software Engineer, UnitedHealth Group