
Apache Kafka: Complete Guide

Architecture, core concepts, real-world use cases & code examples · 14 min read

What Is Apache Kafka?

The distributed event streaming platform behind the world's most data-intensive systems

Apache Kafka is a distributed event streaming platform — think of it as an ultra-fast, fault-tolerant, horizontally scalable message bus that sits at the centre of your data infrastructure. Originally built by engineers at LinkedIn to handle their activity feeds and operational metrics, Kafka was open-sourced in 2011 and became a top-level Apache Software Foundation project in 2012.

At its simplest: producers write events to Kafka, consumers read those events. But unlike a traditional database or message queue, Kafka stores every event durably on disk for a configurable retention period — so any consumer can replay history, and new consumers can start from the beginning.
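The log-centric model behind this can be sketched in a few lines of JavaScript. This is a toy in-memory version to illustrate the idea, not how Kafka is actually implemented:

```javascript
// Minimal sketch of Kafka's core abstraction: an append-only log.
// Every event gets a sequential offset; each consumer tracks its own
// position, so a new consumer can replay from offset 0 at any time.
class Log {
  constructor() {
    this.events = [];
  }
  append(event) {
    this.events.push(event);
    return this.events.length - 1; // the event's offset
  }
  // Read everything from a given offset onward ("replaying history")
  readFrom(offset) {
    return this.events.slice(offset);
  }
}

const log = new Log();
log.append({ user: 42, action: 'login' });
log.append({ user: 42, action: 'page_view' });

// A consumer that joins late can still see all history:
console.log(log.readFrom(0).length); // 2
```

Because the log is never mutated in place, any number of readers can consume it independently and at different speeds; that property is what makes replay and multiple consumer groups cheap.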

Kafka by the numbers

  • 1M+ msg/sec per broker
  • 80% of Fortune 500 companies use Kafka in production
  • 7T msg/day processed at LinkedIn
  • <10ms end-to-end latency

Quick fact

Kafka can retain messages for days, weeks, or forever. This single feature separates it from every traditional message queue — your analytics team can replay last month's events any time.

1. The Problem Kafka Solves

Why existing systems weren't enough

Before Kafka, companies like LinkedIn were drowning in data pipeline complexity. Imagine you have 10 source systems (web servers, mobile apps, databases) and 10 destination systems (Hadoop, search, recommendations, analytics). That's potentially 10 × 10 = 100 point-to-point connections to build and maintain.

Before Kafka vs After Kafka

❌ Before Kafka
  • 100 point-to-point connections
  • Each team owns their own pipeline
  • No replay if a consumer goes down
  • Tight coupling between systems
  • Data duplicated everywhere
  • No standard for data flow

✅ With Kafka
  • One central event bus
  • Producers and consumers are decoupled
  • Replay from any point in time
  • Loose coupling — add consumers freely
  • Single source of truth
  • Standard streaming API

2. Core Concepts

Topics, partitions, brokers, producers, consumers — explained clearly

The 6 core Kafka concepts

  • Broker: A Kafka server. Stores partitions and handles reads/writes. A cluster typically has 3–12 brokers for redundancy.
  • Topic: A named stream of records, like a database table. Producers write to topics; consumers read from them.
  • Partition: A physical subdivision of a topic. Each partition is an ordered, append-only log. More partitions = more parallelism.
  • Producer: An application that publishes (writes) messages to one or more topics. Controls the target partition via message keys.
  • Consumer Group: A set of consumers jointly consuming a topic. Each partition is assigned to exactly one consumer in the group.
  • Offset: A unique sequential ID for each message within a partition. Consumers track their position via offsets.
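How a consumer group divides partitions can be sketched with a simple round-robin assignment. This is a simplification; real Kafka assignment strategies (range, sticky, cooperative) are configurable and more sophisticated:

```javascript
// Sketch: spread partitions across the consumers in one group so that
// every partition is owned by exactly one consumer.
function assignPartitions(partitions, consumers) {
  const assignment = Object.fromEntries(consumers.map((c) => [c, []]));
  partitions.forEach((p, i) => {
    assignment[consumers[i % consumers.length]].push(p);
  });
  return assignment;
}

// 6 partitions, 3 consumers in the group: 2 partitions each.
console.log(assignPartitions([0, 1, 2, 3, 4, 5], ['c1', 'c2', 'c3']));
// { c1: [ 0, 3 ], c2: [ 1, 4 ], c3: [ 2, 5 ] }
```

Note the corollary: with 6 partitions, a 7th consumer in the same group would sit idle, which is why partition count caps consumer parallelism.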

3. How Messages Flow Through Kafka

From producer to consumer, step by step

Kafka message lifecycle

  1. Producer: writes the event
  2. Partition: assigned by key hash
  3. Broker: persists to disk
  4. Replication: copies to follower replicas
  5. Consumer: reads at its own pace
  6. Offset commit: tracks position

When a producer sends a message, Kafka determines which partition to write to. If a key is specified, all messages with the same key go to the same partition (enabling per-key ordering). The broker appends the message to the partition's log file and replicates it to follower brokers. Consumers poll for new messages and commit their offsets to track progress.
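The key-to-partition step can be sketched as follows. Kafka's default partitioner uses a murmur2 hash; any deterministic hash illustrates the property that equal keys always map to the same partition:

```javascript
// Sketch of key-based partitioning: a deterministic hash of the key,
// modulo the partition count, picks the target partition.
function partitionFor(key, numPartitions) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple deterministic hash
  }
  return hash % numPartitions;
}

// Every event keyed by the same user lands on the same partition,
// so that user's events stay in order:
const p1 = partitionFor('user-42', 6);
const p2 = partitionFor('user-42', 6);
console.log(p1 === p2); // true
```

This also explains a Kafka gotcha: increasing the partition count changes the modulo, so existing keys may start mapping to different partitions afterwards.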

Kafka cluster architecture (3 brokers, 2 topics, 2 consumer groups)

  • Producer apps write to the user-events and payments topics
  • The Kafka cluster spans 3 brokers
  • The Analytics consumer group reads all events
  • The Search consumer group reads all events

4. Kafka vs Alternatives

When to use Kafka, RabbitMQ, or Redis Pub/Sub

Kafka vs RabbitMQ vs Redis Pub/Sub

Apache Kafka
  • Persistent storage (days/weeks)
  • 1M+ msg/sec throughput
  • Message replay support
  • Consumer groups for parallel consumption
  • Best for event streaming
  • Complex setup, high ops cost

RabbitMQ
  • In-memory (optional disk persistence)
  • ~50K–100K msg/sec
  • No replay (messages deleted after ACK)
  • Competing consumers
  • Best for task queues
  • Simple to set up and operate

Redis Pub/Sub
  • No persistence (messages exist only in transit)
  • Very high throughput, sub-millisecond latency
  • No replay: a message is lost if no subscriber is connected
  • Fire-and-forget fan-out to all subscribers
  • Best for ephemeral notifications (e.g. cache invalidation)
  • Trivial to set up

Rule of thumb

Use Kafka when you need retention, replay, or multiple independent consumers reading the same stream. Use RabbitMQ or SQS for simpler task queue patterns where messages are consumed once and discarded.

5. Kafka History & Timeline

2011

LinkedIn open-sources Kafka

Built by Jay Kreps, Neha Narkhede, and Jun Rao to handle LinkedIn's activity feed and metrics pipeline.

2012

Apache top-level project

Kafka graduates from the Apache Incubator to become a top-level Apache project, accelerating community adoption.

2014

Confluent founded

Original Kafka creators leave LinkedIn to build the commercial Kafka ecosystem (Schema Registry, Kafka Connect, KSQL).

2017

Kafka Streams & Connect mature

Stream processing and connector ecosystem explode. Kafka becomes the backbone of real-time data architectures.

2019

KRaft mode development begins

Effort to remove ZooKeeper dependency — making Kafka simpler to operate.

2022+

Cloud-native Kafka dominates

Confluent Cloud, AWS MSK, Azure Event Hubs (Kafka API) make managed Kafka mainstream.

6. Getting Started: Producer & Consumer Code

Using the kafkajs library for Node.js:

Producer — publish events to a topic
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-app',
  brokers: ['localhost:9092'],
});

const producer = kafka.producer();

async function publishEvent(userId, action) {
  await producer.connect();

  await producer.send({
    topic: 'user-events',
    messages: [
      {
        key: String(userId),       // ensures same-user events go to same partition
        value: JSON.stringify({
          userId,
          action,
          timestamp: Date.now(),
        }),
      },
    ],
  });

  console.log('Event published');
  await producer.disconnect();
}

publishEvent(42, 'page_view').catch(console.error);
Consumer — read events from a topic
const { Kafka } = require('kafkajs');

const kafka = new Kafka({ clientId: 'my-app', brokers: ['localhost:9092'] });
const consumer = kafka.consumer({ groupId: 'analytics-group' });

async function startConsuming() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'user-events', fromBeginning: false });

  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      const event = JSON.parse(message.value.toString());
      console.log({
        partition,
        offset: message.offset,
        event,
      });
      // Process event: update analytics, trigger notifications, etc.
    },
  });
}

startConsuming().catch(console.error);

Run Kafka locally with Docker

docker run -d --name kafka -p 9092:9092 apache/kafka:latest — spins up a single-broker Kafka in seconds for local development.
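Once the container is running, you can smoke-test it with the console tools shipped in the image. The topic name is illustrative, and the paths assume the apache/kafka image's default /opt/kafka layout:

```shell
# Create a topic with 3 partitions inside the container:
docker exec kafka /opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create --topic user-events --partitions 3 --replication-factor 1

# Tail the topic from the beginning to watch events arrive:
docker exec kafka /opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic user-events --from-beginning
```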

7. Real-World Use Cases

Real-time analytics

Stream user clickstream data into analytics pipelines. Netflix uses Kafka to process 500B+ events/day for their recommendation engine.

Event sourcing

Store every state change as an immutable event. Rebuild any system state by replaying the event log. Used in banking and fintech.

Microservice communication

Decouple services via async events instead of synchronous REST calls. Services publish events; others react. No direct dependencies.

Log aggregation

Centralise logs from hundreds of services into Kafka, then ship to Elasticsearch, S3, or Splunk. Uber aggregates logs from 4,000+ microservices.

CDC (Change Data Capture)

Capture every database row change using Kafka Connect + Debezium. Sync databases, build read models, invalidate caches in real time.

IoT data ingestion

Handle millions of sensor readings per second. Kafka's partitioning allows horizontal scaling to match device count growth.

8. Key Kafka Metrics to Monitor

Production Kafka — what to watch

  1. Consumer lag (Critical): the number of messages a consumer is behind the latest offset. High lag means consumers can't keep up with producers. Target: < 1,000 messages.
  2. Under-replicated partitions (Critical): partitions where not all replicas are in sync. Should always be 0 in a healthy cluster; non-zero means a broker is falling behind.
  3. Request rate & throughput (Important): messages/sec and bytes/sec per broker. Use these to plan capacity and detect traffic spikes before they cause problems.
  4. Disk usage per broker (Important): Kafka writes to disk continuously. Monitor per-broker disk utilisation and set retention policies (time- or size-based) to prevent full disks.
  5. Producer send latency (Monitor): P99 latency for producer sends. Spikes indicate broker pressure or network issues. Configure acks=all for durability vs acks=1 for speed.
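Consumer lag is simple arithmetic: for each partition, lag = (latest offset) - (last committed offset). A sketch with made-up sample offsets; in production these numbers come from tools such as kafka-consumer-groups.sh or JMX metrics:

```javascript
// Compute per-partition consumer lag from two offset snapshots.
// A partition with no committed offset is treated as starting from 0.
function consumerLag(endOffsets, committedOffsets) {
  return Object.fromEntries(
    Object.entries(endOffsets).map(([partition, end]) => [
      partition,
      end - (committedOffsets[partition] ?? 0),
    ])
  );
}

const lag = consumerLag(
  { 0: 5400, 1: 5210 }, // latest offset per partition
  { 0: 5400, 1: 4100 }  // last committed offset per partition
);
console.log(lag); // { '0': 0, '1': 1110 } (partition 1 is falling behind)
```

Alerting on the maximum lag across partitions, rather than the sum, catches the common case where a single hot partition falls behind while the rest keep up.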

9. Common Kafka Mistakes

Too few partitions

Setting too few partitions at topic creation limits your max consumer parallelism. You can increase partitions but not decrease them. Plan for 2–4x your current consumer count.

Not setting retention policies

Default retention is 7 days. For high-volume topics, this can fill disks quickly. Set retention.ms and retention.bytes per topic based on actual storage budgets.
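As a sketch, per-topic retention overrides can be applied with the kafka-configs tool; the topic name and limits below are illustrative:

```shell
# Cap user-events at 1 day of retention or ~10 GiB per partition,
# whichever limit is hit first (retention.bytes applies per partition):
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name user-events \
  --add-config retention.ms=86400000,retention.bytes=10737418240
```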

Using Kafka as a database

Kafka is not designed for random reads by key. If you need to look up a specific record, use a proper database. Kafka is optimised for sequential append and sequential read — that's where its performance comes from.

Best practice: use keys for ordering

Always set a message key for entities that need per-entity ordering (e.g., userId, orderId). This guarantees all events for the same entity land on the same partition and are processed in order.
