Posts

Kafka Kerberos Enable and Testing.

Apache Kafka is a distributed streaming platform. Kafka 2.0 supports Kerberos authentication, which can be enabled using the wizard in Cloudera Manager. Courtesy: Apache Kafka. Before we start, a little about Kafka. We think of a streaming platform as having three key capabilities: It lets you publish and subscribe to streams of records; in this respect it is similar to a message queue or enterprise messaging system. It lets you store streams of records in a fault-tolerant way. It lets you process streams of records as they occur. What is Kafka good for? It gets used for two broad classes of application: building real-time streaming data pipelines that reliably get data between systems or applications, and building real-time streaming applications that transform or react to the streams of data. To understand how Kafka does these things, let’s dive in and explore Kafka’s capabilities from the bottom up. First a few concepts: Kafka is run as a cluster on o
Recent posts

Cloudera Manager - Duplicate entry 'zookeeper' for key 'NAME'.

We recently built a cluster using the Cloudera APIs, with all services running on it and Kerberos enabled. Next, we needed to add another Kafka cluster to our existing cluster in Cloudera Manager. Since getting ZooKeeper and Kafka up and running is a quick task, we decided to do it through Cloudera Manager instead of the APIs. But we hit the Duplicate entry 'zookeeper' for key 'NAME' issue, as described in the bug below. https://issues.cloudera.org/browse/DISTRO-790 I have set up two clusters that share a Cloudera Manager. The first I set up with the API and created the services with capital letter names, e.g., ZOOKEEPER, HDFS, HIVE. Now, I add the second cluster using the Wizard. Add Cluster->Select Hosts->Distribute Parcels->Select base HDFS Cluster install. On the next page I get SQL errors telling me that the services I want to add already exist. I suspect that the check for existing service names does n

Parcel Not Distributing Cloudera CDH.

We were deploying a cluster in our lab environment, which is shared by everyone, so the lab has its own share of stale information. During installation we noticed that parcel distribution was not working. This was the second time we had this issue, and there could be a couple of reasons. First, check that /etc/hosts has all the server names added correctly. Second, one of the installations was terminated midway, leaving a stale config that set the CDH parcel status to ACTIVATING, so when we tried to install, the parcel was not distributing. A similar issue can occur if the node does not have enough space for /opt/cloudera. Solution: Deactivate the parcel and retry. curl -u username:password -X POST http://adminnode:7180/api/v14/clusters/<cluster-name>/parcels/products/CDH/versions/5.10.0-1.cdh5.10.0.p0.41/commands/deactivate Check for space and increase space for /opt/cloudera/ Most of the time should see th
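The two checks described in this excerpt (building the deactivate request and looking at free space) can be sketched in Python. This is only an illustration under assumptions: the helper names `parcel_deactivate_url` and `free_gib` and the cluster name `cluster1` are hypothetical, and the host, port, API version, and parcel version are simply the ones quoted in the curl command. The sketch builds the URL and reports free space; it does not send the request.

```python
import os
import shutil
from urllib.parse import quote


def parcel_deactivate_url(host, cluster, version, product="CDH",
                          port=7180, api="v14"):
    """Build the Cloudera Manager URL used to deactivate a parcel.

    The cluster name is percent-encoded because cluster names may
    contain spaces.
    """
    return (
        f"http://{host}:{port}/api/{api}/clusters/{quote(cluster, safe='')}"
        f"/parcels/products/{product}/versions/{version}/commands/deactivate"
    )


def free_gib(path):
    """Return the free space, in GiB, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # "cluster1" is a hypothetical cluster name; substitute your own.
    print(parcel_deactivate_url("adminnode", "cluster1",
                                "5.10.0-1.cdh5.10.0.p0.41"))

    # Fall back to "/" when /opt/cloudera does not exist on this machine.
    parcel_dir = "/opt/cloudera" if os.path.exists("/opt/cloudera") else "/"
    print(f"{parcel_dir}: {free_gib(parcel_dir):.1f} GiB free")
```

The printed URL can be passed to the same `curl -u username:password -X POST` call shown above.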