Near Real-Time Streaming and Data Processing of Ceilometer Data Using Kafka

The planetary boundary layer (PBL) is the lowest part of the atmosphere, extending anywhere between 100 and 2000 m above the ground surface. The planetary boundary layer height (PBLH) plays a vital role in environmental studies of air pollutants. We are building an ecosystem to determine the PBLH in near real time across the United States. A ceilometer (CEIL) is a self-contained, ground-based, active remote-sensing instrument designed to measure cloud-base height, vertical visibility, and potential backscatter signals from aerosols. Ceilometer data is used to determine the PBLH. Using Kafka, we collect and stream data from seven ceilometer sites across the United States. Kafka provides a framework that lets us automatically store the data in Kafka topics for streaming, preprocess it, and feed it to a machine learning model that makes the relevant predictions.

Rahul Gite
Graduate Research Assistant, University of Maryland Baltimore County
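
The consume, preprocess, and predict flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the topic name, the broker address, and the message fields (`site`, `heights_m`, `backscatter`) are hypothetical, the consumer assumes the kafka-python client and JSON-encoded messages, and a simple steepest-gradient estimate stands in for the machine learning model.

```python
import json
from typing import Callable, Iterable, Iterator, Optional, Tuple

TOPIC = "ceilometer-raw"  # hypothetical topic name


def preprocess(record: dict) -> dict:
    """Drop range gates whose backscatter is missing or non-positive."""
    pairs = [
        (h, b)
        for h, b in zip(record["heights_m"], record["backscatter"])
        if b is not None and b > 0
    ]
    return {
        "site": record["site"],
        "heights_m": [h for h, _ in pairs],
        "backscatter": [b for _, b in pairs],
    }


def gradient_pblh(profile: dict) -> Optional[float]:
    """Placeholder for the ML model (an assumption, not the real model):
    return the mid-height of the steepest decrease in backscatter."""
    heights, signal = profile["heights_m"], profile["backscatter"]
    best_height, best_grad = None, 0.0
    for i in range(len(heights) - 1):
        dz = heights[i + 1] - heights[i]
        if dz <= 0:
            continue  # skip unordered or duplicate heights
        grad = (signal[i + 1] - signal[i]) / dz
        if grad < best_grad:
            best_grad = grad
            best_height = (heights[i] + heights[i + 1]) / 2
    return best_height


def process_stream(
    messages: Iterable[dict],
    predict: Callable[[dict], Optional[float]],
) -> Iterator[Tuple[str, Optional[float]]]:
    """Preprocess each decoded message and yield (site, predicted PBLH)."""
    for record in messages:
        profile = preprocess(record)
        yield profile["site"], predict(profile)


def kafka_messages(topic: str = TOPIC, servers: str = "localhost:9092") -> Iterator[dict]:
    """Yield decoded messages from a Kafka topic (requires kafka-python
    and a reachable broker; the address here is an assumption)."""
    from kafka import KafkaConsumer  # imported lazily so the rest runs without it

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    for message in consumer:
        yield message.value
```

With a broker running, `process_stream(kafka_messages(), gradient_pblh)` would yield one estimate per incoming profile. Because `process_stream` accepts any iterable of decoded records, the same code can be exercised offline with a plain list of sample dictionaries.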