What Is Apache Kafka?
The distributed event streaming platform behind the world's most data-intensive systems
Apache Kafka is a distributed event streaming platform — think of it as an ultra-fast, fault-tolerant, infinitely scalable message bus that sits at the centre of your data infrastructure. Originally built by engineers at LinkedIn to handle their activity feeds and operational metrics, Kafka was open-sourced in 2011 and donated to the Apache Software Foundation in 2012.
At its simplest: producers write events to Kafka, consumers read those events. But unlike a traditional database or message queue, Kafka stores every event durably on disk for a configurable retention period — so any consumer can replay history, and new consumers can start from the beginning.
Kafka by the numbers
- 1M+ messages/sec per broker
- 80% of Fortune 500 companies use Kafka in production
- 7T messages/day at LinkedIn
- <10ms end-to-end latency
Quick fact
Kafka can retain messages for days, weeks, or forever. This single feature separates it from every traditional message queue — your analytics team can replay last month's events any time.
The Problem Kafka Solves
Why existing systems weren't enough
Before Kafka, companies like LinkedIn were drowning in data pipeline complexity. Imagine you have 10 source systems (web servers, mobile apps, databases) and 10 destination systems (Hadoop, search, recommendations, analytics). That's potentially 100 point-to-point connections to build and maintain. Put Kafka in the middle and each system connects exactly once: 10 producers plus 10 consumers, 20 connections in total.
Before Kafka vs After Kafka
Core Concepts
Topics, partitions, brokers, producers, consumers — explained clearly
The 6 core Kafka concepts
Broker
A Kafka server. Stores partitions, handles reads/writes. A cluster typically has 3–12 brokers for redundancy.
Topic
A named stream of records — like a database table. Producers write to topics, consumers read from them.
Partition
Physical subdivision of a topic. Each partition is an ordered, append-only log. More partitions = more parallelism.
Producer
An application that publishes (writes) messages to one or more topics. Controls which partition via keys.
Consumer Group
A set of consumers jointly consuming a topic. Each partition goes to exactly one consumer in the group.
Offset
A unique sequential ID for each message within a partition. Consumers track their position via offsets.
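The partition and offset concepts above can be sketched with a toy in-memory model. This is illustration only, not the kafkajs API: a partition is an append-only array, and an offset is just an index into it.

```javascript
// Minimal sketch of a Kafka partition: an append-only log where each
// record receives a sequential offset, and consumers track their own position.
class Partition {
  constructor() {
    this.log = []; // append-only array of records
  }
  append(value) {
    this.log.push(value);
    return this.log.length - 1; // the offset assigned to this record
  }
  read(offset) {
    return this.log[offset]; // random access by offset is what enables replay
  }
  latestOffset() {
    return this.log.length; // the offset the next record will receive
  }
}

const p = new Partition();
p.append('signup');    // offset 0
p.append('page_view'); // offset 1
p.append('checkout');  // offset 2

// Two independent consumers can read the same log from different positions;
// nothing is destroyed by reading.
console.log(p.read(0));        // 'signup'
console.log(p.latestOffset()); // 3
```

Because reading never removes anything, a consumer that commits offset 1 and a brand-new consumer starting at offset 0 both see the same history.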
How Messages Flow Through Kafka
From producer to consumer, step by step
Kafka message lifecycle
1. Producer: writes the event
2. Partition: assigned by key hash
3. Broker: persists the message to disk
4. Replication: copies to follower replicas
5. Consumer: reads at its own pace
6. Offset commit: tracks the consumer's position
When a producer sends a message, Kafka determines which partition to write to. If a key is specified, all messages with the same key go to the same partition (enabling per-key ordering). The broker appends the message to the partition's log file and replicates it to follower brokers. Consumers poll for new messages and commit their offsets to track progress.
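That key-to-partition step can be sketched as hash-then-modulo. Kafka's default partitioner actually uses murmur2; the toy hash below is only a stand-in to show the mechanism:

```javascript
// Simplified partitioner: the same key always maps to the same partition,
// which is what gives Kafka per-key ordering. Kafka's real default
// partitioner uses murmur2; this toy hash just demonstrates the idea.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (Math.imul(h, 31) + ch.charCodeAt(0)) >>> 0; // unsigned 32-bit
  }
  return h;
}

function choosePartition(key, numPartitions) {
  return toyHash(key) % numPartitions;
}

const numPartitions = 6;
const a = choosePartition('user-42', numPartitions);
const b = choosePartition('user-42', numPartitions);
console.log(a === b); // true: a stable partition per key preserves ordering
```

Note the flip side: changing the partition count changes the modulo, so adding partitions later reshuffles which keys land where.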
Kafka cluster architecture: 3 brokers, 2 topics, 2 consumer groups
Kafka vs Alternatives
When to use Kafka, RabbitMQ, or Redis Pub/Sub
Kafka vs RabbitMQ vs Redis Pub/Sub
Rule of thumb
Reach for Kafka when you need high throughput, durable storage, and event replay; RabbitMQ when you need flexible routing, per-message acknowledgement, and classic work queues; Redis Pub/Sub when you need minimal latency for fire-and-forget messages and can tolerate message loss.
Kafka History & Timeline
2011: LinkedIn open-sources Kafka
Built by Jay Kreps, Neha Narkhede, and Jun Rao to handle LinkedIn's activity feed and metrics pipeline.
2012: Apache incubation
Kafka becomes a top-level Apache project, accelerating community adoption.
2014: Confluent founded
Original Kafka creators leave LinkedIn to build the commercial Kafka ecosystem (Schema Registry, Kafka Connect, KSQL).
2016: Kafka Streams & Connect mature
Stream processing and connector ecosystem explode. Kafka becomes the backbone of real-time data architectures.
2019: KRaft mode development begins
Effort to remove ZooKeeper dependency — making Kafka simpler to operate.
2020s: Cloud-native Kafka dominates
Confluent Cloud, AWS MSK, Azure Event Hubs (Kafka API) make managed Kafka mainstream.
Getting Started: Producer & Consumer Code
Using the kafkajs library for Node.js:
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-app',
  brokers: ['localhost:9092'],
});

const producer = kafka.producer();

async function publishEvent(userId, action) {
  await producer.connect();
  await producer.send({
    topic: 'user-events',
    messages: [
      {
        key: String(userId), // ensures same-user events go to same partition
        value: JSON.stringify({
          userId,
          action,
          timestamp: Date.now(),
        }),
      },
    ],
  });
  console.log('Event published');
  await producer.disconnect();
}

publishEvent(42, 'page_view');

And the matching consumer:

const { Kafka } = require('kafkajs');
const kafka = new Kafka({ clientId: 'my-app', brokers: ['localhost:9092'] });
const consumer = kafka.consumer({ groupId: 'analytics-group' });

async function startConsuming() {
  await consumer.connect();
  await consumer.subscribe({ topic: 'user-events', fromBeginning: false });
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      const event = JSON.parse(message.value.toString());
      console.log({
        partition,
        offset: message.offset,
        event,
      });
      // Process event: update analytics, trigger notifications, etc.
    },
  });
}
startConsuming().catch(console.error);

Run Kafka locally with Docker
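A minimal single-node setup for local development. The image tag and script path follow the official apache/kafka Docker image; double-check them against the current quickstart before relying on this setup fragment.

```shell
# Start a single-node Kafka broker in KRaft mode (no ZooKeeper required).
docker run -d --name kafka -p 9092:9092 apache/kafka:3.7.0

# Create the topic used in the examples above (script path inside the
# official apache/kafka image; adjust if your image differs).
docker exec kafka /opt/kafka/bin/kafka-topics.sh \
  --bootstrap-server localhost:9092 \
  --create --topic user-events --partitions 3
```

With the broker listening on localhost:9092, the producer and consumer snippets above can run unchanged.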
Real-World Use Cases
Real-time analytics
Stream user clickstream data into analytics pipelines. Netflix uses Kafka to process 500B+ events/day for their recommendation engine.
Event sourcing
Store every state change as an immutable event. Rebuild any system state by replaying the event log. Used in banking and fintech.
Microservice communication
Decouple services via async events instead of synchronous REST calls. Services publish events; others react. No direct dependencies.
Log aggregation
Centralise logs from hundreds of services into Kafka, then ship to Elasticsearch, S3, or Splunk. Uber aggregates logs from 4,000+ microservices.
CDC (Change Data Capture)
Capture every database row change using Kafka Connect + Debezium. Sync databases, build read models, invalidate caches in real time.
IoT data ingestion
Handle millions of sensor readings per second. Kafka's partitioning allows horizontal scaling to match device count growth.
Key Kafka Metrics to Monitor
Production Kafka — what to watch
Consumer lag
Critical: The number of messages a consumer is behind the latest offset. High lag = consumers can't keep up with producers. Target: < 1000 messages.
Under-replicated partitions
Critical: Partitions where not all replicas are in sync. Should always be 0 in a healthy cluster. Non-zero means a broker is falling behind.
Request rate & throughput
Important: Messages/sec and bytes/sec per broker. Use this to plan capacity and detect traffic spikes before they cause problems.
Disk usage per broker
Important: Kafka writes to disk continuously. Monitor per-broker disk utilisation and set retention policies (time or size based) to prevent full disks.
Producer send latency
Monitor: P99 latency for producer sends. Spikes indicate broker pressure or network issues. Configure acks=all for durability vs acks=1 for speed.
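Consumer lag itself is simple offset arithmetic; a sketch with made-up numbers (in production you would fetch these offsets via the admin API or kafka-consumer-groups.sh):

```javascript
// Lag per partition = log-end offset minus the group's committed offset.
// The offsets below are illustrative, not from a real cluster.
function consumerLag(partitions) {
  return partitions.map(({ partition, logEndOffset, committedOffset }) => ({
    partition,
    lag: logEndOffset - committedOffset,
  }));
}

const lags = consumerLag([
  { partition: 0, logEndOffset: 150_000, committedOffset: 149_800 },
  { partition: 1, logEndOffset: 152_340, committedOffset: 110_000 }, // falling behind
]);

const worst = Math.max(...lags.map((p) => p.lag));
console.log(lags);
console.log(worst > 1000 ? 'ALERT: consumer lag too high' : 'ok');
```

Alert on the worst partition, not the average: one stuck partition can hide behind healthy siblings.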
Common Kafka Mistakes
Too few partitions
Partitions are the unit of parallelism: a consumer group can never use more consumers than there are partitions, so an under-partitioned topic caps throughput no matter how many consumers you add.
Not setting retention policies
Set retention.ms and retention.bytes per topic based on actual storage budgets. The broker default (7 days) may be far too long, or too short, for your data.
Using Kafka as a database
Kafka is an append-only log with no indexes or ad-hoc queries. Use it to move and replay events; keep a real database as the query layer.
Best practice: use keys for ordering
Choose a message key that identifies the entity (e.g. userId, orderId). This guarantees all events for the same entity land on the same partition and are processed in order.
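Retention is configured per topic. One way to set it with the stock CLI tools; the topic name and limits below are illustrative, and this config fragment assumes a broker on localhost:9092:

```shell
# Keep user-events for 7 days or ~10 GiB per partition, whichever
# limit is reached first. Values are examples only.
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name user-events \
  --add-config retention.ms=604800000,retention.bytes=10737418240
```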