Apache Kafka Training

Apache Kafka Training Overview


Apache Kafka is used to build robust data storage systems that integrate effectively with other Big Data frameworks, including Apache Hadoop and Apache Spark. The course discusses how to use Kafka efficiently and offers practical solutions to problems commonly encountered by developers and administrators working with it. The course begins with an architectural description of Apache Kafka and discusses the relevant principles. First, it covers the Producer API, concentrating on how it allows an application to publish a stream of records to one or more Kafka topics. The Consumer API follows.
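The publish model the Producer API builds on can be pictured as a set of partitioned, append-only logs. The following is a minimal, broker-free sketch of that model (the class and topic names are illustrative assumptions; the real Producer API sends records to brokers over the network rather than to in-process lists):

```python
# Simplified, in-memory model of Kafka's publish semantics: a topic is a set
# of partitions, each partition is an append-only log, and every appended
# record receives a monotonically increasing offset within its partition.
class InMemoryTopic:
    def __init__(self, name, num_partitions=3):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key, value):
        """Append a record; records with the same key land in the same partition."""
        partition = hash(key) % len(self.partitions) if key is not None else 0
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

topic = InMemoryTopic("page-views")
p1, o1 = topic.publish("user-42", "viewed /home")
p2, o2 = topic.publish("user-42", "viewed /cart")
assert p1 == p2      # same key -> same partition
assert o2 == o1 + 1  # offsets grow monotonically within a partition
```

Keying records this way is what lets Kafka preserve per-key ordering: all events for one key share a partition, and a partition is consumed in offset order.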

The course then explores how the Consumer API allows an application to subscribe to one or more topics. Once these basic concepts are covered, advanced topics are explained, such as data serialization using Avro and data partitioning based on custom logic. Finally, it addresses Kafka best practices, along with some practical issues.
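"Partitioning based on custom logic" means supplying your own key-to-partition routing rule. A minimal sketch of such a rule, in Python for brevity (the routing policy shown is a hypothetical example; in the real Java client you would implement the `org.apache.kafka.clients.producer.Partitioner` interface):

```python
# Hypothetical custom partitioning rule: "premium" keys are pinned to
# partition 0, and all other keys are spread over the remaining partitions
# by a stable hash.
import zlib

def custom_partition(key: str, num_partitions: int) -> int:
    if key.startswith("premium-"):
        return 0
    # crc32 is stable across runs, unlike Python's built-in hash()
    return 1 + zlib.crc32(key.encode("utf-8")) % (num_partitions - 1)

assert custom_partition("premium-1001", 6) == 0
assert 1 <= custom_partition("guest-77", 6) <= 5
```

A stable hash matters here: if the routing function changed between producer restarts, records for the same key would scatter across partitions and per-key ordering would be lost.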

Apache Kafka Training Objective

  • Describe the architecture of Kafka
  • Explore Kafka producers and consumers for writing and reading messages
  • Understand publish-subscribe messaging and how it fits into the data ecosystem
  • Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems
  • Learn various strategies of monitoring Kafka
  • Learn best practices for building data pipelines and applications with Kafka
  • Know how to run Kafka as a cluster on one or more servers that can span multiple data centers

Apache Kafka Training Audience

Anybody who wants to understand how message queues work and how messaging is used for enterprise integration.

Apache Kafka Training Prerequisites

Experience in at least one programming language such as Java, Python, or Scala is recommended. A general understanding of streaming and distributed computing technologies is beneficial but not required.

Apache Kafka Training Outline

  • Kafka’s Origin
  • Installing Kafka
    • Installing Java and ZooKeeper
    • Installing a Kafka Broker
    • Broker Configuration
    • Hardware Selection
    • Different Versions of Kafka
  • The New Kafka Architecture (Without ZooKeeper)
  • How to Do Migrations?
  • Working with Multiple Producers and Consumers
  • Sending Events to Kafka – Producer API
  • Asynchronous Send
  • Reading Events from Kafka – Consumer API
  • Broker Configurations
  • Creating Multiple Brokers and Checking How Messages in Topics Will Be Routed to the Brokers
  • Kafka Producer API
  • Writing a Custom Kafka Producer and Understanding What a ProducerRecord Is
  • Working with a Custom Kafka Consumer API
  • Writing a Custom Kafka Consumer and Understanding What a ConsumerRecord Is
  • Consumer Poll Loop – Offset Management
  • Rebalancing of Consumers
  • How to Serialize Data Using Avro
  • Serializers
  • How to Implement Custom Serializers
  • Serializing Using Apache Avro
  • Using Avro Records with Kafka
  • Electing Partition Leaders – Kafka Controller Component
  • Benefits of Data Partitioning Among Brokers
  • Partitioning of Topics – Implementing a Custom Partitioner
  • Writing Custom Partitioner for Specific Partitioning
  • Data Replication in Kafka
  • Append-Only Distributed Log – Storing Events in Kafka
  • Compaction Process
  • Writing a Custom Partitioner and Checking How Messages Get Partitioned Based on the Custom Logic of the Partitioner
  • Customized Offset Management in Kafka
  • Writing Code for Getting a Specific Offset of a Message
  • Broker Health Monitoring
  • Kafka and Metrics Reporters
  • Monitor Under-Replicated Partitions
  • Monitor Events
  • Performance Tuning
  • How to Check for Metrics in Kafka
  • Practical Use Cases
  • Practical Considerations
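The serialization items in the outline all revolve around one contract: the producer hands Kafka a value object and must turn it into bytes, and the consumer reverses that. A dependency-free sketch of that round trip, using JSON as a stand-in (an assumption made here so the sketch is runnable without extra libraries; the course itself serializes with Apache Avro, and the real Java client implements the `Serializer`/`Deserializer` interfaces):

```python
# Stand-in for a custom Kafka serializer/deserializer pair: value object in,
# bytes out, and the exact reverse on the consumer side.
import json

def serialize(value: dict) -> bytes:
    return json.dumps(value, sort_keys=True).encode("utf-8")

def deserialize(data: bytes) -> dict:
    return json.loads(data.decode("utf-8"))

record = {"user_id": 42, "event": "checkout"}
assert deserialize(serialize(record)) == record  # lossless round trip
```

Avro adds to this picture a schema that travels (or is registered) alongside the bytes, which is what makes producer and consumer agree on the record layout as it evolves.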

