Apache Kafka is a distributed streaming platform. Kafka 2.0 supports Kerberos authentication, which can be enabled using the wizard in Cloudera Manager.

Before we start, a little about Kafka (courtesy of Apache Kafka).

We think of a streaming platform as having three key capabilities:

- It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or enterprise messaging system.
- It lets you store streams of records in a fault-tolerant way.
- It lets you process streams of records as they occur.

What is Kafka good for? It gets used for two broad classes of application:

- Building real-time streaming data pipelines that reliably get data between systems or applications.
- Building real-time streaming applications that transform or react to the streams of data.

To understand how Kafka does these things, let’s dive in and explore Kafka’s capabilities from the bottom up.

First, a few concepts: Kafka is run as a cluster on o...