What is Apache Kafka and When to Use It [2026]

Q: What is Apache Kafka?

Apache Kafka is a distributed event streaming platform: a high-throughput, durable, fault-tolerant publish-subscribe system built around a replicated log. Producers publish events (messages) to topics. Consumers subscribe to topics and process events. Unlike traditional message queues, Kafka retains messages for a configurable period (7 days by default) whether or not they have been read, and multiple consumer groups can read the same stream independently. Kafka was originally built at LinkedIn and open-sourced in 2011; the Apache Kafka project states that more than 80% of Fortune 100 companies use it.

Q: What is the difference between a topic and a partition?

A topic is a named log of events (loosely comparable to a database table for events). A partition is a subdivision of a topic: each topic is split into N partitions for parallelism. Events within a partition are ordered and immutable; there is no ordering guarantee across partitions. Producers write to a topic, and the partitioner picks the partition. Consumers read from partitions. More partitions allow more consumers to work in parallel, which generally raises throughput. With the default partitioner, events with the same key go to the same partition as long as the partition count does not change, which gives an ordering guarantee per key.

Q: What is a consumer group?

A consumer group is a set of consumers that collaborate to consume a topic. Each partition is consumed by exactly ONE consumer in a group at a time, which is what enables parallel processing. If you have a topic with 6 partitions and 3 consumers in a group, each consumer reads 2 partitions. Consumers beyond the number of partitions sit idle. Each consumer group keeps its own independent offset (cursor), so different groups consume the same events independently. This is how multiple downstream systems can all process the same event stream.

Q: When should I use Kafka vs RabbitMQ?

Choose Kafka for high throughput (millions of messages/sec on a suitably sized cluster), event streaming and logs, message replay and retention, multiple consumers reading the same stream, event sourcing, and audit logs. Choose RabbitMQ for a traditional task queue (work distribution among workers), complex routing (exchanges, bindings), and messages that should be deleted after consumption; it offers lower throughput but more routing flexibility. Rule of thumb: if you need a task queue where messages are consumed and gone, use RabbitMQ. If you need a durable event stream that many systems read from, use Kafka.

Q: How does Kafka ensure durability and fault tolerance?

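Each partition is replicated across multiple brokers; the number of copies is the replication factor. A partition has one leader, which handles writes and (by default) reads, and N-1 followers that replicate its data. If the leader fails, one of the in-sync followers is elected as the new leader automatically. Kafka writes every message to an on-disk log, but by default it does not force each write to disk before acknowledging it, so durability rests mainly on replication. Producers control this with acks: acks=all means the leader waits for all in-sync replicas (not necessarily every replica) to confirm before acknowledging the write, and min.insync.replicas sets how many replicas must be in sync for the write to be accepted. With a replication factor of 3, acks=all, and min.insync.replicas=2, an acknowledged write survives the loss of one broker.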
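
The acknowledgement rules can be sketched as a small model. This is a simplified illustration, not Kafka's implementation: the function `required_acks` and the exception class `NotEnoughReplicas` are hypothetical names, while `acks` and `min.insync.replicas` are real Kafka settings. It assumes that acks=all waits for every replica currently in the in-sync set and rejects the write when that set is smaller than the configured minimum.

```python
class NotEnoughReplicas(Exception):
    """Raised in this sketch when too few replicas are in sync for acks=all."""


def required_acks(acks: str, in_sync_replicas: int, min_insync: int = 1) -> int:
    """Return how many replicas must hold a record before the producer is acknowledged.

    in_sync_replicas counts the leader plus the followers that are caught up.
    """
    if acks == "0":
        return 0  # fire and forget: the producer does not wait for the broker
    if acks == "1":
        return 1  # only the leader has to write the record
    if acks == "all":
        if in_sync_replicas < min_insync:
            raise NotEnoughReplicas(
                f"{in_sync_replicas} in sync, need at least {min_insync}"
            )
        return in_sync_replicas  # every in-sync replica must confirm
    raise ValueError(f"unknown acks value: {acks!r}")


# Replication factor 3, all replicas healthy.
print(required_acks("all", in_sync_replicas=3, min_insync=2))  # 3
# One follower has fallen behind: the write still succeeds with 2 copies.
print(required_acks("all", in_sync_replicas=2, min_insync=2))  # 2
```

In this model, a partition whose in-sync set shrinks to the leader alone refuses acks=all writes when `min_insync` is 2, trading availability for durability.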

Q: When should I NOT use Kafka?

Do not use Kafka for: low-volume messaging where a simpler queue (Redis Pub/Sub, SQS) suffices, scenarios requiring complex message routing logic (RabbitMQ exchanges are better), when you need messages to disappear after processing and do not want retention overhead, when your team lacks Kafka expertise (operational complexity is high), or for small projects where the infrastructure overhead exceeds the benefit. Kafka shines at scale — for low-volume apps, the complexity is rarely justified.

What Problem Does Kafka Solve?

In a traditional microservices architecture, when an order is placed, the Order Service must synchronously notify: Inventory Service, Payment Service, Notification Service, Analytics Service. Each call could fail. If Notification Service is down, the order fails. Services are tightly coupled.

With Kafka, the Order Service publishes one event: "OrderPlaced". All downstream services subscribe to this event independently. If Notification Service is down, it catches up on missed events when it restarts. The Order Service does not care who is listening — it just publishes and moves on.
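
The catch-up behaviour described above can be illustrated with a toy in-memory log. This is a sketch under stated assumptions, not a Kafka client: the `Topic` class, its `publish` and `poll` methods, and the group names are all hypothetical, and a single Python list stands in for a partition.

```python
from collections import defaultdict


class Topic:
    """An append-only list of events plus one read position per consumer group."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events: list[str] = []
        self.offsets: dict[str, int] = defaultdict(int)  # group -> next offset

    def publish(self, event: str) -> None:
        self.events.append(event)  # the producer does not know who reads this

    def poll(self, group: str) -> list[str]:
        """Return every event the group has not seen yet and advance its offset."""
        start = self.offsets[group]
        batch = self.events[start:]
        self.offsets[group] = len(self.events)
        return batch


orders = Topic("orders")
orders.publish("OrderPlaced:1001")
inventory_first = orders.poll("inventory")  # inventory is up and reads at once

# Notification Service is down while two more orders arrive.
orders.publish("OrderPlaced:1002")
orders.publish("OrderPlaced:1003")

# After restarting, it reads from its own offset and misses nothing.
notification_backlog = orders.poll("notification")
print(notification_backlog)
# ['OrderPlaced:1001', 'OrderPlaced:1002', 'OrderPlaced:1003']
```

Publishing never blocks on a consumer here, which is the decoupling the Order Service gains.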

Kafka Architecture

Kafka architecture overview

Producers                Kafka Cluster                    Consumers
┌──────────┐             ┌─────────────────────┐          ┌────────────────────┐
│ Order    │──publish──► │ Topic: "orders"     │◄──read── │ Inventory Service  │
│ Service  │             │ ┌─────────────────┐ │          │ (Consumer Group A) │
└──────────┘             │ │ Partition 0     │ │          └────────────────────┘
┌──────────┐             │ │ [msg1][msg2]... │ │          ┌────────────────────┐
│ Payment  │──publish──► │ ├─────────────────┤ │◄──read── │ Analytics Service  │
│ Service  │             │ │ Partition 1     │ │          │ (Consumer Group B) │
└──────────┘             │ │ [msg3][msg4]... │ │          └────────────────────┘
                         │ └─────────────────┘ │
                         │ Retention: 7 days   │
                         └─────────────────────┘

Core Concepts

Topics and Partitions

A topic is a named log — like a table in a database. You publish events to topics and subscribe to them. A topic is split into partitions — ordered, immutable sequences of events. Partitions enable parallelism: multiple consumers can read different partitions simultaneously.

Topic with 3 partitions

Topic: "user-events"
  Partition 0: [login:user1] [purchase:user3] [logout:user1] ...
  Partition 1: [login:user2] [purchase:user4] [login:user5] ...
  Partition 2: [signup:user6] [profile_update:user2] ...

Offset: each message has an offset (sequential ID) within its partition
Consumer tracks its own offset, so it knows where it left off

Producers

Producers publish messages to Kafka topics. They can specify a partition key: with the default partitioner, messages with the same key go to the same partition for as long as the topic's partition count stays the same. This guarantees ordering for related events (all events for the same user_id go to the same partition, in order).
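
Key-based routing amounts to hashing the key and taking the remainder modulo the partition count. The sketch below illustrates that idea only; `choose_partition` is a hypothetical helper, and CRC32 is used as a stand-in hash because it ships with Python (Kafka's Java client uses murmur2, so real partition numbers will differ).

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition: hash the key bytes, then take the remainder.

    CRC32 is an assumption made for this sketch; it is not Kafka's hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [("user1", "login"), ("user2", "login"), ("user1", "purchase"), ("user1", "logout")]
partitions: dict[int, list[str]] = {p: [] for p in range(3)}
for user_id, action in events:
    partitions[choose_partition(user_id, 3)].append(f"{action}:{user_id}")

# All of user1's events sit in one partition, in the order they were sent.
user1_partition = partitions[choose_partition("user1", 3)]
print([e for e in user1_partition if e.endswith(":user1")])
# ['login:user1', 'purchase:user1', 'logout:user1']
```

Because the result depends on `num_partitions`, adding partitions to a topic can move a key to a different partition, which is why per-key ordering only holds while the partition count is unchanged.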

Consumers and Consumer Groups

A consumer group is a set of consumers that collectively read a topic. Each partition is assigned to exactly one consumer in the group. Adding more consumers to the group increases parallelism (each consumer reads fewer partitions) until there is one consumer per partition; any further consumers receive no partitions. Multiple groups can independently read the same topic.

Consumer group partition assignment

Topic: "orders", 6 partitions

Consumer Group "inventory-service", 3 consumers:
  Consumer 1 reads: Partition 0, Partition 1
  Consumer 2 reads: Partition 2, Partition 3
  Consumer 3 reads: Partition 4, Partition 5

Consumer Group "analytics-service", 2 consumers:
  Consumer A reads: Partition 0, Partition 1, Partition 2
  Consumer B reads: Partition 3, Partition 4, Partition 5

(Completely independent: same messages, different offsets)
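
The assignment shown above can be reproduced with a few lines that split the partitions into contiguous ranges. This is a minimal sketch that resembles a range-style assignment; `assign_ranges` and the consumer names are hypothetical, and real Kafka clients offer several assignment strategies that may produce a different layout.

```python
def assign_ranges(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Split partitions 0..N-1 into contiguous ranges, one range per consumer.

    When the split is uneven, the first consumers get one extra partition.
    Consumers beyond the partition count receive an empty list and sit idle.
    """
    base, extra = divmod(num_partitions, len(consumers))
    assignment: dict[str, list[int]] = {}
    start = 0
    for index, consumer in enumerate(sorted(consumers)):
        size = base + (1 if index < extra else 0)
        assignment[consumer] = list(range(start, start + size))
        start += size
    return assignment


print(assign_ranges(6, ["consumer-1", "consumer-2", "consumer-3"]))
# {'consumer-1': [0, 1], 'consumer-2': [2, 3], 'consumer-3': [4, 5]}
print(assign_ranges(6, ["consumer-a", "consumer-b"]))
# {'consumer-a': [0, 1, 2], 'consumer-b': [3, 4, 5]}
```

Each group computes its assignment separately, so the two groups above cover the same six partitions without affecting each other.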

Kafka vs RabbitMQ vs Redis Pub/Sub

Property	Kafka	RabbitMQ	Redis Pub/Sub
Pattern	Log-based streaming	Message queue (AMQP)	Fire-and-forget pub/sub
Message retention	Yes (days/weeks)	Until consumed	No — lost if no subscriber
Message replay	Yes (rewind offset)	No	No
Multiple consumers	Yes (consumer groups)	Competing consumers (one gets it)	All subscribers get it
Throughput	Millions/sec	Thousands/sec	Very high (in-memory)
Ordering	Per-partition guaranteed	Per-queue guaranteed	No guarantee
Complexity	High	Medium	Low
Best for	Event streaming, audit logs, analytics	Task queues, work distribution	Real-time notifications, caching

Common Kafka Use Cases

Microservices integration: Services communicate via events — decoupled, resilient, independently scalable
Event sourcing: Store every state change as an event — rebuild state by replaying events
Real-time analytics: Stream events to Apache Flink, Spark Streaming, or ksqlDB for real-time aggregation
Log aggregation: Centralise application logs from hundreds of services into Kafka, then ship to Elasticsearch/S3
Change Data Capture (CDC): Kafka Connect + Debezium streams every database row change as an event
Activity tracking: LinkedIn uses Kafka to track user activity (page views, clicks) at 7 trillion messages/day
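
As an illustration of the event sourcing item above, state can be rebuilt by folding over the stored events in order. The sketch assumes a hypothetical bank-account domain; the `Event` class, the event kinds, and the `replay` function are invented for this example, and a plain list stands in for the events read back from a topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn"
    amount: int  # in cents, to avoid float rounding


def replay(events: list[Event]) -> int:
    """Rebuild the current balance by folding over the full event history."""
    balance = 0
    for event in events:
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance


history = [Event("deposited", 10_000), Event("withdrawn", 2_500), Event("deposited", 500)]
print(replay(history))      # 8000
print(replay(history[:2]))  # 7500, the state as of the second event
```

Replaying a prefix of the history yields the state at an earlier point in time, which is what makes the event log useful for audits and debugging.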

Managed Kafka Services

Running Kafka yourself is operationally complex. Managed and Kafka-compatible services reduce that burden significantly: Confluent Cloud (Kafka as a service), Amazon MSK (Amazon Managed Streaming for Apache Kafka), Azure Event Hubs (exposes a Kafka-compatible endpoint), and Redpanda (a Kafka API-compatible platform, positioned as simpler to operate and also available as a cloud service). For most teams, a managed service is the right starting point.

Kafka Is Not a Database

Kafka stores events durably but is not queryable like a database. You cannot run ad hoc SELECT queries directly against a Kafka topic, and there are no secondary indexes to look records up by field. For querying event history, ship events to a data warehouse (Snowflake, BigQuery), or use ksqlDB, which layers SQL-like stream processing on top of Kafka. Treat Kafka as a transport and retention layer, not as a long-term data store for complex queries.
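
A minimal sketch of that pattern: load consumed events into a queryable store, then run SQL there. SQLite is used here purely as a stand-in for a warehouse, the `orders` table and its fields are hypothetical, and a list of JSON strings stands in for records read from a topic.

```python
import json
import sqlite3

# Stand-in for records consumed from a topic; each value is a JSON payload.
records = [
    '{"order_id": 1, "customer": "alice", "total": 30}',
    '{"order_id": 2, "customer": "bob", "total": 15}',
    '{"order_id": 3, "customer": "alice", "total": 20}',
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)")

for raw in records:
    event = json.loads(raw)
    # INSERT OR REPLACE keeps the load idempotent if a record is delivered twice.
    db.execute(
        "INSERT OR REPLACE INTO orders VALUES (:order_id, :customer, :total)",
        event,
    )
db.commit()

totals = db.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 50), ('bob', 15)]
```

The aggregation runs against the table, not the topic; the topic only supplies the ordered stream of records that feeds it.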
