Architecture Overview
At the highest level, a Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can replicate data amongst themselves.
In a Pulsar cluster:
- One or more brokers handles and load balances incoming messages from producers, dispatches messages to consumers, communicates with the Pulsar configuration store to handle various coordination tasks, stores messages in BookKeeper instances (aka bookies), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
- A BookKeeper cluster consisting of one or more bookies handles persistent storage of messages.
- A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
The diagram below provides an illustration of a Pulsar cluster:
At the broader instance level, an instance-wide ZooKeeper cluster called the configuration store handles coordination tasks involving multiple clusters, for example geo-replication.
Brokers
The Pulsar message broker is a stateless component that's primarily responsible for running two other components:
- An HTTP server that exposes a REST API for both administrative tasks and topic lookup for producers and consumers
- A dispatcher, which is an asynchronous TCP server over a custom binary protocol used for all data transfers
Messages are typically dispatched out of a managed ledger cache for the sake of performance, unless the backlog exceeds the cache size. If the backlog grows too large for the cache, the broker will start reading entries from BookKeeper.
Finally, to support geo-replication on global topics, the broker manages replicators that tail the entries published in the local region and republish them to the remote region using the Pulsar Java client library.
For a guide to managing Pulsar brokers, see the brokers guide.
Clusters
A Pulsar instance consists of one or more Pulsar clusters. Clusters, in turn, consist of:
- One or more Pulsar brokers
- A ZooKeeper quorum used for cluster-level configuration and coordination
- An ensemble of bookies used for persistent storage of messages
Clusters can replicate amongst themselves using geo-replication.
For a guide to managing Pulsar clusters, see the clusters guide.