Kafka Powered Near Real-Time Data Pipelines @ Extreme Scale
Date : September 14, 2021
Time : 01:00 PM - 02:00 PM

We propose to present the details of an innovative, customized, and scalable data integration and processing pipeline powered by Kafka. We plan to detail the design principles we implemented to achieve high performance at extreme scale. Our data processing pipeline is equipped with a custom threading model that enables dynamic auto-scaling (horizontal and vertical) in our Kafka-based architecture, providing processing elasticity when data volumes surge. The architecture also provides randomized yet functionally consistent assignment of incoming data to partitions and consumer threads, which ensures an even distribution of workload while maintaining data integrity and FIFO processing. These design principles help us support core business operations and provide insights and predictive analysis in near real time. Finally, we plan to lay out our future vision for this platform, which will touch upon data domains, interoperability, and multi-modal data.
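As a rough illustration of what "randomized yet functionally consistent" assignment can mean, the sketch below hashes a record key to pick a partition. The hash spreads keys pseudo-randomly across partitions, while every record with the same key lands in the same partition and therefore keeps its arrival order. This is not the pipeline described in the talk: the partition count, the key names, the helper functions `partition_for` and `route`, and the use of MD5 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

# Hypothetical partition count, chosen only for this example.
NUM_PARTITIONS = 4


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition: spread out by hashing, stable per key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def route(events, num_partitions: int = NUM_PARTITIONS):
    """Append each (key, payload) event to its partition in arrival order."""
    partitions = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


# Made-up events for two hypothetical account keys, interleaved on arrival.
events = [
    ("acct-1", "open"),
    ("acct-2", "open"),
    ("acct-1", "deposit"),
    ("acct-2", "close"),
    ("acct-1", "close"),
]
routed = route(events)

# Events for one key stay in FIFO order inside that key's partition.
acct1 = [p for k, p in routed[partition_for("acct-1")] if k == "acct-1"]
print(acct1)
```

If each partition is drained by a single consumer thread, per-key ordering is preserved without any coordination between threads, which is one common way to reconcile even load distribution with FIFO processing.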

Speakers
Murali Kannan
Senior Manager, AFS