KAFKA SUMMIT APAC

July 27 - 28, 2021

Kafka Tiered Storage

Date: July 28, 2021

Time: 11:00 AM - 11:45 AM

Kafka is a vital part of the data infrastructure in many organizations. As a Kafka cluster grows and retains more data for longer periods, issues of scalability, efficiency, and operations become important to address. Kafka cluster storage is typically scaled by adding more broker nodes. But this also adds memory and CPUs that the cluster does not need, making overall storage less cost-efficient than keeping older data in external storage. Tiered storage extends Kafka's storage beyond the local storage available on the Kafka cluster by retaining older data in cheaper stores, such as HDFS, S3, Azure, or GCS, with minimal impact on the internals of Kafka.

We will talk about:

- How tiered storage addresses the above problems and brings several other advantages
- The high-level architecture of tiered storage
- Future work planned as part of tiered storage

Speakers

Satish Duggana

Lead - Data/Streaming Infrastructure, Uber

Sriharsha Chintalapani

Senior Staff Engineer, Uber

