Version: Next

Scalable topics

A scalable topic is a topic that adjusts its own capacity at runtime. Instead of being created with a fixed number of partitions, it is internally divided into key-range segments that the broker splits when load grows and merges when load drops -- with no downtime, no client changes, and no loss of per-key ordering. A scalable topic is addressed with the topic:// scheme:

topic://tenant/namespace/name

note

Scalable topics are the recommended choice for new applications. Classic partitioned and non-partitioned topics remain fully supported and are the right choice for existing applications and for clients that do not yet support the V5 API. See When to use which below.

Why scalable topics

Classic partitioned topics scale by fixing a partition count when the topic is created. That model has three structural limits:

You must guess the right size upfront. The partition count is chosen before you know the real traffic, and it bounds the maximum consumer parallelism for the life of the topic.
You cannot scale down. A partitioned topic's partition count can be increased but never decreased, so a topic provisioned for a traffic spike stays oversized forever.
Resizing breaks key ordering. Because messages are routed by hash(key) % partitionCount, changing the partition count remaps keys to different partitions, breaking the per-key ordering guarantee.

Scalable topics remove all three limits: capacity tracks load automatically in both directions, and per-key ordering is preserved across every resize.

How it works

Internally, a scalable topic spreads the hash key space across a set of segments. Each segment owns a contiguous range of the key space and is backed by its own internal topic. Together the segments form a directed acyclic graph (DAG) that records how ranges have been split and merged over time.

Split. When a segment gets hot, the controller splits its key range in two and hands each half to a new child segment. This raises parallelism for exactly the part of the key space that is under load.
Merge. When adjacent segments go cold, the controller merges their ranges back into one, reclaiming capacity.
Range-based key routing. A message key is mapped to whichever segment currently owns its hash range. Because routing is by range rather than by hash(key) % N, a split or merge only ever moves a key from a parent range to a child range that still contains it -- so the ordering of any individual key is never broken.

This split/merge activity is driven by a per-topic controller in the broker and is automatic by default; you can also trigger splits and merges manually. The segment topology is managed entirely on the server and pushed to clients as it changes, so applications never see segments or have to react to resizing. To an application, a scalable topic is just a single stream.

Scalable vs. partitioned topics

	Partitioned topic (classic)	Scalable topic
Unit of parallelism	Fixed partition count, set at creation	Segments; count changes at runtime
Scale up	Increase partition count (manual)	Automatic split of hot segments
Scale down	Not possible	Automatic merge of cold segments
Key routing	`hash(key) % partitionCount`	Range-based hash assignment
Key ordering when resized	Broken (keys remap across partitions)	Preserved across split and merge
Topology visible to the app	Partition count and indexes are exposed	Opaque; managed by the broker
Client	Any Pulsar client	A client with V5 API support

Consumer model

Scalable topics use the V5 client API, which replaces the four classic subscription types (Exclusive, Failover, Shared, Key_Shared) with three purpose-built consumers:

Stream consumer -- ordered consumption with cumulative acknowledgment, for log-style processing where order matters. Replaces Exclusive and Failover.
Queue consumer -- parallel consumption with individual acknowledgment, negative acknowledgment, and dead-letter handling, for work-queue style fan-out. Replaces Shared and Key_Shared.
Checkpoint consumer -- no subscription and no acknowledgment; the application tracks its own position with a serializable checkpoint. Designed for stream-processing engines such as Flink and Spark that manage their own offsets.

Each consumer type is covered in detail in the V5 client documentation.

When to use which

New applications -- prefer scalable topics. You get automatic right-sizing and you never have to pick a partition count.
Existing applications -- keep using classic partitioned and non-partitioned topics. They are fully supported, and there is no requirement to migrate. When you are ready, an existing topic can be migrated in place to a scalable topic without copying data.
Clients without V5 support -- use classic topics. Scalable topics require a client SDK that supports the V5 API (see below).

Requirements

A V5-capable client SDK. Scalable topics are served through the V5 client API; clients using the classic API are rejected when they address a topic://. The Java V5 client supports scalable topics today, with the other language SDKs following.
A metadata store with streaming watch support. Oxia is recommended, because the controller pushes topology changes to clients through streaming watches that Oxia supports natively. If you are still on ZooKeeper, see Migrate metadata store.

Next steps

Configure metadata: Configure metadata store and Migrate metadata store from ZooKeeper to Oxia.
Learn the classic topic model for comparison: Messaging concepts -- Topics.

Why scalable topics​

How it works​

Scalable vs. partitioned topics​

Consumer model​

When to use which​

Requirements​

Next steps​