Version: 2.10.0

Pulsar Terminology

Here is a glossary of terms related to Apache Pulsar:

Concepts

Pulsar

Pulsar is a distributed messaging system originally created by Yahoo but now under the stewardship of the Apache Software Foundation.

Message

Messages are the basic unit of Pulsar. They're what producers publish to topics and what consumers then consume from topics.

Topic

A named channel used to pass messages published by producers to consumers who process those messages.

Partitioned Topic

A topic that is served by multiple Pulsar brokers, which enables higher throughput.
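
As a rough illustration of how a partitioned topic spreads load, the sketch below maps a message key to one of the topic's partitions. The CRC32 hash is a stand-in chosen for this example and is not claimed to be the hash Pulsar uses; `choose_partition` and `partition_names` are hypothetical helper names.

```python
import zlib
from typing import List


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    # CRC32 is an illustrative stand-in, not Pulsar's actual hash function.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_names(topic: str, num_partitions: int) -> List[str]:
    """List per-partition topic names using the '-partition-N' suffix."""
    return [f"{topic}-partition-{i}" for i in range(num_partitions)]


if __name__ == "__main__":
    names = partition_names("orders", 3)
    index = choose_partition("customer-42", len(names))
    print(names[index])
```

Because the mapping depends only on the key and the partition count, messages with the same key always land on the same partition, while different keys can be served by different brokers.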

Namespace

A grouping mechanism for related topics.

Namespace Bundle

A virtual group of topics that belong to the same namespace. A namespace bundle is defined as a range between two 32-bit hashes, such as 0x00000000 and 0xffffffff.
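
To make the hash-range idea concrete, here is a minimal sketch that splits the 32-bit hash space into equal bundles and finds the bundle a topic falls into. The helper names are hypothetical, and CRC32 is used only as a stand-in hash; the hash function Pulsar applies to topic names is not modeled here.

```python
import bisect
import zlib
from typing import List, Tuple

HASH_SPACE = 0x100000000  # number of distinct 32-bit hash values


def bundle_boundaries(num_bundles: int) -> List[int]:
    """Return num_bundles + 1 boundaries covering 0x00000000..0xffffffff."""
    lower = [HASH_SPACE * i // num_bundles for i in range(num_bundles)]
    return lower + [0xFFFFFFFF]


def find_bundle(topic: str, boundaries: List[int]) -> Tuple[int, int]:
    """Return the (low, high) range whose span contains the topic's hash."""
    # Stand-in hash for illustration only.
    h = zlib.crc32(topic.encode("utf-8"))
    idx = bisect.bisect_right(boundaries, h) - 1
    idx = min(idx, len(boundaries) - 2)  # the top boundary belongs to the last bundle
    return boundaries[idx], boundaries[idx + 1]


def format_bundle(bundle: Tuple[int, int]) -> str:
    """Render a bundle as two zero-padded hexadecimal boundaries."""
    low, high = bundle
    return f"0x{low:08x}_0x{high:08x}"


if __name__ == "__main__":
    bounds = bundle_boundaries(4)
    print(format_bundle(find_bundle("persistent://public/default/orders", bounds)))
```

With a single bundle the only range is 0x00000000 to 0xffffffff, which matches the example range given in the definition.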

Tenant

An administrative unit for allocating capacity and enforcing an authentication/authorization scheme.
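
Tenant, namespace, and topic come together in a fully qualified topic name of the form `persistent://tenant/namespace/topic`. The sketch below splits such a name into its parts. `TopicName` and `parse_topic_name` are hypothetical names for this example, and the parser assumes the three-part form only; short names and older name layouts are not handled.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str


def parse_topic_name(name: str) -> TopicName:
    """Split a fully qualified topic name into its components."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"not a fully qualified topic name: {name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in: {name!r}")
    return TopicName(domain, *parts)


if __name__ == "__main__":
    print(parse_topic_name("persistent://my-tenant/my-namespace/my-topic"))
```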

Subscription

A lease on a topic established by a group of consumers. Pulsar has four subscription modes (exclusive, shared, failover and key_shared).
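
The difference between the shared and key_shared modes can be sketched with a toy dispatcher. This is an in-memory model under simplifying assumptions, not broker code: shared delivery is modeled as round-robin, key_shared delivery uses a stand-in CRC32 hash of the key, and exclusive and failover (where a single consumer receives the messages at a time) are not modeled. The function names are hypothetical.

```python
import zlib
from itertools import cycle
from typing import Dict, List, Tuple


def dispatch_shared(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Hand messages to consumers in round-robin order."""
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    for consumer, message in zip(cycle(consumers), messages):
        assignment[consumer].append(message)
    return assignment


def dispatch_key_shared(
    keyed_messages: List[Tuple[str, str]], consumers: List[str]
) -> Dict[str, List[str]]:
    """Send every message with the same key to the same consumer."""
    assignment: Dict[str, List[str]] = {c: [] for c in consumers}
    for key, payload in keyed_messages:
        # Stand-in hash for illustration only.
        consumer = consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]
        assignment[consumer].append(payload)
    return assignment


if __name__ == "__main__":
    print(dispatch_shared(["m1", "m2", "m3"], ["c1", "c2"]))
    print(dispatch_key_shared([("k1", "a"), ("k2", "b"), ("k1", "c")], ["c1", "c2"]))
```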

Pub-Sub

A messaging pattern in which producer processes publish messages on topics that are then consumed (processed) by consumer processes.

Producer

A process that publishes messages to a Pulsar topic.

Consumer

A process that establishes a subscription to a Pulsar topic and processes messages published to that topic by producers.

Reader

Pulsar readers are message processors much like Pulsar consumers but with two crucial differences:

  • you can specify where on a topic readers begin processing messages (consumers always begin with the latest available unacked message);
  • readers don't retain data or acknowledge messages.
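
The two differences above can be sketched with a toy in-memory reader. `ToyReader` is a hypothetical class that models a topic as a Python list; it only illustrates that a reader picks its own start position and keeps no acknowledgment state.

```python
from typing import List


class ToyReader:
    """Reads a list-backed 'topic' from a caller-chosen position."""

    def __init__(self, log: List[str], start: int = 0) -> None:
        self._log = log
        self._position = start  # chosen by the caller, not by a subscription

    def has_message_available(self) -> bool:
        return self._position < len(self._log)

    def read_next(self) -> str:
        message = self._log[self._position]
        self._position += 1  # advancing is the only state; nothing is acked
        return message


if __name__ == "__main__":
    topic_log = ["m0", "m1", "m2", "m3"]
    reader = ToyReader(topic_log, start=2)
    while reader.has_message_available():
        print(reader.read_next())
```

Reading does not change the underlying log, so a second reader created with a different start position sees the same messages.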

Cursor

The subscription position for a consumer.

Acknowledgment (ack)

A message sent to a Pulsar broker by a consumer to indicate that a message has been successfully processed. An acknowledgment (ack) is Pulsar's way of knowing that the message can be deleted from the system; if a message is not acknowledged, it is retained until it is processed.

Negative Acknowledgment (nack)

When an application fails to process a particular message, it can send a "negative ack" to Pulsar to signal that the message should be replayed at a later time. (By default, failed messages are replayed after a 1-minute delay.) Be aware that negative acknowledgment on ordered subscription types, such as Exclusive, Failover and Key_Shared, can cause failed messages to arrive at consumers out of the original order.

Unacknowledged

A message that has been delivered to a consumer for processing but not yet confirmed as processed by the consumer.
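
The life cycle of a message through the unacknowledged, acknowledged, and negatively acknowledged states can be sketched with a toy in-memory model. `ToySubscription` and `drain` are hypothetical names, redelivery happens immediately rather than after a delay, and nothing here reflects how a broker stores this state.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple


class ToySubscription:
    """Tracks which messages are pending, unacknowledged, or acknowledged."""

    def __init__(self, messages: List[str]) -> None:
        self.pending: Deque[Tuple[int, str]] = deque(enumerate(messages))
        self.unacked: Dict[int, str] = {}  # delivered, awaiting ack or nack
        self.acked: Set[int] = set()

    def receive(self) -> Optional[Tuple[int, str]]:
        if not self.pending:
            return None
        msg_id, payload = self.pending.popleft()
        self.unacked[msg_id] = payload
        return msg_id, payload

    def acknowledge(self, msg_id: int) -> None:
        del self.unacked[msg_id]
        self.acked.add(msg_id)

    def negative_acknowledge(self, msg_id: int) -> None:
        # Requeue for redelivery; the real redelivery delay is not modeled.
        self.pending.append((msg_id, self.unacked.pop(msg_id)))


def drain(sub: ToySubscription, fail_first_attempt: Set[str]) -> List[str]:
    """Process every message, nacking the given payloads on first delivery."""
    processed: List[str] = []
    already_failed: Set[int] = set()
    while True:
        delivery = sub.receive()
        if delivery is None:
            return processed
        msg_id, payload = delivery
        if payload in fail_first_attempt and msg_id not in already_failed:
            already_failed.add(msg_id)
            sub.negative_acknowledge(msg_id)
        else:
            sub.acknowledge(msg_id)
            processed.append(payload)


if __name__ == "__main__":
    print(drain(ToySubscription(["a", "b", "c"]), {"a"}))  # ['b', 'c', 'a']
```

In the usage example, message "a" is nacked once and is therefore processed after "b" and "c", which illustrates how a negative acknowledgment can change processing order.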

Retention Policy

Size and time limits that you can set on a namespace to configure retention of messages that have already been acknowledged.

Multi-Tenancy

The ability to isolate namespaces, specify quotas, and configure authentication and authorization on a per-tenant basis.

Architecture

Standalone

A lightweight Pulsar broker in which all components run in a single Java Virtual Machine (JVM) process. Standalone clusters can be run on a single machine and are useful for development purposes.

Cluster

A set of Pulsar brokers and BookKeeper servers (aka bookies). Clusters can reside in different geographical regions and replicate messages to one another in a process called geo-replication.

Instance

A group of Pulsar clusters that act together as a single unit.

Geo-Replication

Replication of messages across Pulsar clusters, potentially in different datacenters or geographical regions.

Configuration Store

Pulsar's configuration store (previously known as the global ZooKeeper) is a ZooKeeper quorum that is used for configuration-specific tasks. A multi-cluster Pulsar installation requires just one configuration store across all clusters.

Topic Lookup

A service provided by Pulsar brokers that enables connecting clients to automatically determine which Pulsar broker is responsible for a topic (and thus where message traffic for the topic needs to be routed).

Service Discovery

A mechanism provided by Pulsar that enables connecting clients to use just a single URL to interact with all the brokers in a cluster.

Broker

A stateless component of Pulsar clusters that runs two other components: an HTTP server that exposes a REST interface for administration and topic lookup, and a dispatcher that handles all message transfers. Pulsar clusters typically consist of multiple brokers.

Dispatcher

An asynchronous TCP server used for all data transfers into and out of a Pulsar broker. The Pulsar dispatcher uses a custom binary protocol for all communications.

Storage

BookKeeper

Apache BookKeeper is a scalable, low-latency persistent log storage service that Pulsar uses to store data.

Bookie

Bookie is the name of an individual BookKeeper server. It is effectively the storage server of Pulsar.

Ledger

An append-only data structure in BookKeeper that is used to persistently store messages in Pulsar topics.

Functions

Pulsar Functions are lightweight functions that can consume messages from Pulsar topics, apply custom processing logic, and, if desired, publish results to topics.
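
The consume, process, publish flow can be sketched locally as follows. `process` is written as a plain Python function, and `run_function` is a hypothetical driver that stands in for the function runtime by feeding it messages from a list; in this sketch, returning `None` means nothing is published.

```python
from typing import Callable, List, Optional


def process(input: str) -> Optional[str]:
    """Custom logic: skip empty messages, append '!' to the rest."""
    if not input:
        return None  # nothing to publish for this message
    return f"{input}!"


def run_function(fn: Callable[[str], Optional[str]], input_messages: List[str]) -> List[str]:
    """Hypothetical local driver: apply fn to each input, collect outputs."""
    output_topic: List[str] = []
    for message in input_messages:
        result = fn(message)
        if result is not None:
            output_topic.append(result)
    return output_topic


if __name__ == "__main__":
    print(run_function(process, ["hello", "", "world"]))  # ['hello!', 'world!']
```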