A PoC implementation that extends Kafka's authorization features to allow for group-based ACLs. Read more…
by Sönke Liebau on Mar 3, 2018
Everything you were afraid to ask about SAP Vora. Read more…
by Lars Francke on May 8, 2017
A quick reference for Kerberos encryption types. Read more…
by Lars Francke on Mar 10, 2017
We'll tell you where you can see us speaking in the coming two months. Read more…
by Lars Francke on Mar 1, 2017
This blog post explains how to get Spark 2.0 Streaming running on an HDP 2.4 cluster against an SSL-secured Kafka. Read more…
by Oliver Meyn (Guest blog) on Feb 5, 2017
By default, Kafka MirrorMaker can only mirror messages from one topic into a topic of the same name on the mirror cluster, but implementing a custom message handler changes this behavior. Read more…
by Sönke Liebau on Jan 31, 2017
A simple process to demonstrate efficient bulk loading into HBase using Spark. The method used does not rely on additional dependencies, and results in a well-partitioned HBase table with very high, or complete, data locality. This method should work with any version of Spark or HBase. Read more…
by Tim Robertson (Guest blog) on Oct 27, 2016
A small demo showing how to process Twitter stream data with Kafka Streams. Read more…
by Sönke Liebau on Jul 27, 2016
The way to pass user names to a Hadoop cluster differs subtly depending on whether or not the cluster runs in secure mode. Often, these differences are not fully understood and create issues, so in this post I will explain the basic principles for authentication based on the shared UserGroupInformation class in Hadoop. Read more…
by Lars George on May 19, 2016
Running MapReduce or Spark jobs on YARN that process data in HBase is easy… or so they said until someone added Kerberos to the mix! Learn how using HBase from Spark in cluster mode requires some extra steps to enable secure access to the provided services. Read more…
by Lars George on Mar 18, 2016