Skip to main content
Version: 5.0.x

Deploy a cluster on Docker

To deploy a Pulsar cluster on Docker using Docker commands, you need to complete the following steps.

Step 1: Pull a Pulsar image

To run Pulsar on Docker, you need to create a container for each Pulsar component: the metadata store, a bookie, and a broker. This tutorial uses Oxia as the metadata store (the recommended option for new clusters), which runs from its own image; the bookie and broker run from the Pulsar image. Oxia standalone serves a single default namespace, which both Pulsar and BookKeeper use here; production clusters use separate Oxia namespaces (see Configure metadata store).

You can pull a Pulsar image from Docker Hub with the following command. If you do not want to use some connectors, you can use apachepulsar/pulsar:latest there.

docker pull apachepulsar/pulsar-all:latest

Step 2: Create a network

To deploy a Pulsar cluster on Docker, you need to create a network and connect the containers of Oxia, bookie, and broker to this network. Use the following command to create the network pulsar:

docker network create pulsar

Step 3: Create and start containers

Create an Oxia container

Create an Oxia container and start the Oxia metadata store.

docker run -d -p 6648:6648 --net=pulsar \
-v $(pwd)/data/oxia:/data \
--name oxia --hostname oxia \
oxia/oxia:latest \
oxia standalone

Initialize the cluster metadata

After creating the Oxia container successfully, you can use the following command to initialize the cluster metadata.

docker run --net=pulsar \
--name initialize-pulsar-cluster-metadata \
apachepulsar/pulsar-all:latest bash -c "bin/pulsar initialize-cluster-metadata \
--cluster cluster-a \
--metadata-store oxia://oxia:6648/default \
--configuration-store oxia://oxia:6648/default \
--web-service-url http://broker:8080 \
--broker-service-url pulsar://broker:6650"

Create a bookie container

Create a bookie container and start the bookie service.

docker run -d -e clusterName=cluster-a --net=pulsar \
-e metadataServiceUri=metadata-store:oxia://oxia:6648/default \
-v $(pwd)/data/bookkeeper:/pulsar/data/bookkeeper \
--name bookie --hostname bookie \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/bookkeeper.conf && exec bin/pulsar bookie"

Create a broker container

Create a broker container and start the broker service.

docker run -d -p 6650:6650 -p 8080:8080 --net=pulsar \
-e metadataStoreUrl=oxia://oxia:6648/default \
-e clusterName=cluster-a \
--name broker --hostname broker \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/broker.conf && exec bin/pulsar broker"

Step 4: Configuration overview

Pulsar Docker images support the following configuration categories:

  • JVM Configuration: Controls JVM memory allocation and garbage collection for Broker and BookKeeper processes. In Docker, JVM parameters are set via environment variables such as PULSAR_MEM and BOOKIE_MEM.
  • Broker Configuration (broker.conf): Core runtime parameters for the Pulsar Broker, including metadata store connection, cluster name, ports, and message replication settings.
  • BookKeeper Configuration (bookkeeper.conf): Storage engine parameters for BookKeeper Bookies, including journal and ledger directories, compaction, and disk usage thresholds.
  • Log4j Configuration (log4j2.yaml): Logging framework settings including log levels, output format, and file rolling strategies.
  • Dynamic Configuration: Some Broker configuration properties can be updated at runtime without restarting the container, using the pulsar-admin CLI tool or the Admin REST API.

For a complete list of all available configuration properties, see the Pulsar Configuration Reference.

How Docker configuration works

Pulsar Docker images include a Python script called apply-config-from-env.py that runs before the main process starts. This script reads all environment variables and maps them directly to configuration file properties:

  1. If an environment variable name matches a key in the built-in configuration file shipped with the container (e.g., broker.conf or bookkeeper.conf), the script updates that key's value.
  2. Environment variables prefixed with PULSAR_PREFIX_ are also supported — the prefix is stripped and the remaining name is used as the configuration key. This is useful when the configuration key conflicts with existing system environment variables. Using PULSAR_PREFIX_ is necessary for configuration keys that aren't available in the shipped configuration files, but are supported by the component (for example, keys available in Pulsar's ServiceConfiguration).

For example, setting -e managedLedgerDefaultEnsembleSize=2 will update the managedLedgerDefaultEnsembleSize property in the target configuration file.

Configuration methods

Method 1: Using -e environment variables

Pass configuration properties directly as environment variables in the docker run command:

docker run -d \
-e metadataStoreUrl=oxia://oxia:6648/default \
-e clusterName=cluster-a \
-e managedLedgerDefaultEnsembleSize=2 \
-e managedLedgerDefaultWriteQuorum=2 \
-e managedLedgerDefaultAckQuorum=2 \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/broker.conf && exec bin/pulsar broker"

Method 2: Using --env-file for batch loading

For a large number of configuration properties, use an environment file to keep your docker run command clean:

docker run -d --env-file ./broker-config.env \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/broker.conf && exec bin/pulsar broker"

Example broker-config.env file:

metadataStoreUrl=oxia://oxia:6648/default
clusterName=cluster-a
managedLedgerDefaultEnsembleSize=2
managedLedgerDefaultWriteQuorum=2
managedLedgerDefaultAckQuorum=2
PULSAR_MEM=-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g

Method 3: Using Docker Volume to mount custom configuration files

You can mount a custom configuration file from the host into the container, bypassing the environment variable mechanism entirely:

docker run -d \
-e PULSAR_MEM="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g" \
-v $(pwd)/my-broker.conf:/pulsar/conf/broker.conf \
apachepulsar/pulsar-all:latest \
bin/pulsar broker

Common configuration examples

Below are examples of commonly used configuration properties for BookKeeper and Broker containers. You can add more -e flags to the docker run command shown in Step 3 to customize the behavior.

BookKeeper

docker run -d -e clusterName=cluster-a --net=pulsar \
-e metadataServiceUri=metadata-store:oxia://oxia:6648/default \
# Storage directories: journal for write-ahead logs, ledgers for actual message data
-e journalDirectory=/pulsar/data/bookkeeper/journal \
-e ledgerDirectories=/pulsar/data/bookkeeper/ledgers \
# Disk usage thresholds: bookie will reject writes when usage exceeds these limits
-e diskUsageThreshold=0.95 \
-e diskUsageWarnThreshold=0.90 \
-e diskUsageLwmThreshold=0.87 \
# GC and Compaction: reclaim disk space by removing unused ledger data
-e gcWaitTime=900000 \
-e minorCompactionThreshold=0.2 \
-e minorCompactionInterval=3600 \
-e majorCompactionThreshold=0.5 \
-e majorCompactionInterval=86400 \
# JVM memory
-e BOOKIE_MEM="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=4g" \
# Extra JVM options: appended to JVM flags in the startup script, can override default JVM parameters
-e BOOKIE_EXTRA_OPTS="-XX:+ExitOnOutOfMemoryError" \
# Volume mounts: if possible, use separate physical disks for journal and ledger to improve read/write performance
-v $(pwd)/data/bookkeeper/journal:/pulsar/data/bookkeeper/journal \
-v $(pwd)/data/bookkeeper/ledgers:/pulsar/data/bookkeeper/ledgers \
--name bookie --hostname bookie \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/bookkeeper.conf && exec bin/pulsar bookie"

Broker

docker run -d -p 6650:6650 -p 8080:8080 --net=pulsar \
-e metadataStoreUrl=oxia://oxia:6648/default \
-e clusterName=cluster-a \
# Ensemble settings: control how messages are replicated across bookies (must not exceed the number of deployed bookies)
-e managedLedgerDefaultEnsembleSize=2 \
-e managedLedgerDefaultWriteQuorum=2 \
-e managedLedgerDefaultAckQuorum=1 \
# Ports: binary protocol port and HTTP admin port
-e brokerServicePort=6650 \
-e webServicePort=8080 \
# JVM memory
-e PULSAR_MEM="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=8g" \
# Extra JVM options: appended to JVM flags in the startup script, can override default JVM parameters
-e PULSAR_EXTRA_OPTS="-Dio.netty.allocator.maxOrder=13 -Dio.netty.allocator.numDirectArenas=8 -Dio.netty.allocator.maxCachedBufferCapacity=1048576" \
--name broker --hostname broker \
apachepulsar/pulsar-all:latest \
bash -c "bin/apply-config-from-env.py conf/broker.conf && exec bin/pulsar broker"
tip

You can also refer to the default configuration in the Pulsar Helm Chart values.yaml as a tuning reference.