For all but the most demanding use cases, a single-cluster Pulsar installation is sufficient. If you are experimenting with Pulsar or using Pulsar in a startup or on a single team, a single cluster is the best choice. If you do need to run a multi-cluster Pulsar instance, see the guide here.
If you want to use all builtin Pulsar IO connectors in your Pulsar deployment, you need to download the `apache-pulsar-io-connectors` package and install it under the `connectors` directory in the pulsar directory on every broker node, or on every function-worker node if you run a separate cluster of function workers for Pulsar Functions.
If you want to use the Tiered Storage feature in your Pulsar deployment, you need to download the `apache-pulsar-offloaders` package and install it under the `offloaders` directory in the pulsar directory on every broker node. For details on how to configure this feature, refer to the Tiered storage cookbook.
- Deploy a ZooKeeper cluster (optional)
- Initialize cluster metadata
- Deploy a BookKeeper cluster
- Deploy one or more Pulsar brokers
If you already have an existing ZooKeeper cluster and want to reuse it, you do not need to prepare machines for running ZooKeeper.
To run Pulsar on bare metal, you should have the following:
- A single DNS name covering all of the Pulsar broker hosts
If you do not have enough machines, or want to try out Pulsar in cluster mode (and expand the cluster later), you can even deploy Pulsar on a single node, with ZooKeeper, a bookie, and a broker all running on the same machine.
If you do not have a DNS server, you can use a multi-host service URL instead.
Each machine in your cluster needs to have Java 8 or a later version installed.
The following is a diagram showing the basic setup:
In this diagram, connecting clients need to be able to communicate with the Pulsar cluster using a single URL, in this case
pulsar-cluster.acme.com, which abstracts over all of the message-handling brokers. Pulsar message brokers run alongside BookKeeper bookies; brokers and bookies, in turn, both depend on ZooKeeper.
When you deploy a Pulsar cluster, keep in mind the following recommendations when you do capacity planning.
For machines running ZooKeeper, you should use lighter-weight machines or VMs. Pulsar uses ZooKeeper only for periodic coordination-related and configuration-related tasks, not for basic operations. If you run Pulsar on Amazon Web Services (AWS), for example, a t2.small instance would likely suffice.
Bookies and Brokers
For machines running a bookie and a Pulsar broker, you should use more powerful machines. For an AWS deployment, for example, an i3.4xlarge instance may be appropriate. On those machines you can use the following:
- Fast CPUs and a 10Gbps NIC (for Pulsar brokers)
- Small and fast solid-state drives (SSDs) or hard disk drives (HDDs) with a RAID controller and a battery-backed write cache (for BookKeeper bookies)
Install the Pulsar binary package
To get started deploying a Pulsar cluster on bare metal, you need to download a binary tarball release in one of the following ways:
- By clicking the link below directly, which automatically triggers a download:
- From the Pulsar downloads page
- From the Pulsar releases page on GitHub
- Using the wget command:
$ wget https://archive.apache.org/dist/pulsar/pulsar-2.5.0/apache-pulsar-2.5.0-bin.tar.gz
Once you download the tarball, untar it and `cd` into the resulting directory:

```shell
$ tar xvzf apache-pulsar-2.5.0-bin.tar.gz
$ cd apache-pulsar-2.5.0
```
|Directory|Contains|
|---|---|
|`bin`|Command-line tools of Pulsar, such as `pulsar` and `pulsar-admin`|
|`conf`|Configuration files for Pulsar, including broker configuration, ZooKeeper configuration, and more|
|`data`|The data storage directory that ZooKeeper and BookKeeper use|
|`lib`|The JAR files that Pulsar uses|
|`logs`|Logs that the installation creates|
Since release 2.1.0-incubating, Pulsar ships a separate binary distribution containing all the builtin connectors. If you want to enable those builtin connectors, follow the instructions below; otherwise you can skip this section for now.
To get started using builtin connectors, you need to download the connectors tarball release on every broker node in one of the following ways:
- By clicking the link below to download the release from an Apache mirror:
- From the Pulsar releases page
Once you download the nar file, copy it to the `connectors` directory in the pulsar directory. For example, if you download the connector file `pulsar-io-aerospike-2.5.0.nar`:

```shell
$ mkdir connectors
$ mv pulsar-io-aerospike-2.5.0.nar connectors
$ ls connectors
pulsar-io-aerospike-2.5.0.nar
...
```
Since release 2.2.0, Pulsar ships a separate binary distribution containing the tiered storage offloaders. If you want to enable the tiered storage feature, follow the instructions below; otherwise you can skip this section for now.
To get started using tiered storage offloaders, you need to download the offloaders tarball release on every broker node in one of the following ways:
- By clicking the link below to download the release from an Apache mirror:
- From the Pulsar releases page
Once you download the tarball, untar the offloaders package in the pulsar directory and copy the offloaders into the `offloaders` directory:

```shell
$ tar xvfz apache-pulsar-offloaders-2.5.0-bin.tar.gz
# you can find a directory named `apache-pulsar-offloaders-2.5.0` in the pulsar directory
# then copy the offloaders
$ mv apache-pulsar-offloaders-2.5.0/offloaders offloaders
$ ls offloaders
tiered-storage-jcloud-2.5.0.nar
```
For details on how to configure the tiered storage feature, refer to the Tiered storage cookbook.
Deploy a ZooKeeper cluster
If you already have an existing ZooKeeper cluster and want to use it, you can skip this section.
ZooKeeper manages a variety of essential coordination- and configuration-related tasks for Pulsar. To deploy a Pulsar cluster, you need to deploy ZooKeeper first, before all other components. A 3-node ZooKeeper cluster is recommended. Pulsar does not make heavy use of ZooKeeper, so lightweight machines or VMs should suffice for running it.
To begin, add all ZooKeeper servers to the quorum configuration in the `conf/zookeeper.conf` file:

```properties
server.1=zk1.us-west.example.com:2888:3888
server.2=zk2.us-west.example.com:2888:3888
server.3=zk3.us-west.example.com:2888:3888
```
On each host, you need to specify the ID of the node in the `myid` file, which is in the `data/zookeeper` folder of each server by default (you can change the file location via the `dataDir` parameter).

See the Multi-server setup guide in the ZooKeeper documentation for detailed information on `myid` and more.
On the ZooKeeper server at `zk1.us-west.example.com`, for example, you can set the `myid` value like this:

```shell
$ mkdir -p data/zookeeper
$ echo 1 > data/zookeeper/myid
```

On `zk2.us-west.example.com` the command is `echo 2 > data/zookeeper/myid`, and so on.
Once you add each server to the `conf/zookeeper.conf` configuration and each node has the appropriate `myid` entry, you can start ZooKeeper on all hosts (in the background, using nohup) with the `pulsar-daemon` CLI tool:
$ bin/pulsar-daemon start zookeeper
If you plan to deploy ZooKeeper and a bookie on the same node, you need to start ZooKeeper with a different stats port:
$ PULSAR_EXTRA_OPTS="-Dstats_server_port=8001" bin/pulsar-daemon start zookeeper
Once you deploy ZooKeeper for your cluster, you need to write some metadata to ZooKeeper for each cluster in your instance. You only need to write this metadata once.
```shell
$ bin/pulsar initialize-cluster-metadata \
  --cluster pulsar-cluster-1 \
  --zookeeper zk1.us-west.example.com:2181 \
  --configuration-store zk1.us-west.example.com:2181 \
  --web-service-url http://pulsar.us-west.example.com:8080 \
  --web-service-url-tls https://pulsar.us-west.example.com:8443 \
  --broker-service-url pulsar://pulsar.us-west.example.com:6650 \
  --broker-service-url-tls pulsar+ssl://pulsar.us-west.example.com:6651
```
As you can see from the example above, you need to specify the following:
|Flag|Description|
|---|---|
|`--cluster`|A name for the cluster|
|`--zookeeper`|A "local" ZooKeeper connection string for the cluster. This connection string only needs to include one machine in the ZooKeeper cluster.|
|`--configuration-store`|The configuration store connection string for the entire instance. As with the `--zookeeper` connection string, it only needs to include one machine.|
|`--web-service-url`|The web service URL for the cluster, plus a port. This URL should be a standard DNS name. The default port is 8080 (using a different port is not recommended).|
|`--web-service-url-tls`|If you use TLS, you also need to specify a TLS web service URL for the cluster. The default port is 8443 (using a different port is not recommended).|
|`--broker-service-url`|A broker service URL for interacting with the brokers in the cluster. This URL should use the `pulsar` scheme rather than `http`, as in the example above. The default port is 6650 (using a different port is not recommended).|
|`--broker-service-url-tls`|If you use TLS, you also need to specify a TLS broker service URL for the brokers in the cluster. The default port is 6651 (using a different port is not recommended).|
If you don't have a DNS server, you can use a multi-host service URL with the following settings:

```properties
--web-service-url http://host1:8080,host2:8080,host3:8080 \
--web-service-url-tls https://host1:8443,host2:8443,host3:8443 \
--broker-service-url pulsar://host1:6650,host2:6650,host3:6650 \
--broker-service-url-tls pulsar+ssl://host1:6651,host2:6651,host3:6651
```
Deploy a BookKeeper cluster
BookKeeper handles all persistent data storage in Pulsar. You need to deploy a cluster of BookKeeper bookies to use Pulsar. You can choose to run a 3-bookie BookKeeper cluster.
You can configure BookKeeper bookies using the
`conf/bookkeeper.conf` configuration file. The most important step in configuring bookies for our purposes here is ensuring that the `zkServers` parameter is set to the connection string for the ZooKeeper cluster.
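As a sketch, assuming the three ZooKeeper hosts used earlier in this guide, the relevant line in `conf/bookkeeper.conf` would look like this:

```properties
# Point the bookie at the ZooKeeper quorum (hostnames are this guide's examples)
zkServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.example.com:2181
```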
Once you appropriately modify the
zkServers parameter, you can provide any other configuration modifications you need. You can find a full listing of the available BookKeeper configuration parameters here, although consulting the BookKeeper documentation for a more in-depth guide might be a better choice.
Since Pulsar 2.1.0, Pulsar supports stateful functions in Pulsar Functions. If you want to enable that feature, you need to enable the table service on BookKeeper in `conf/bookkeeper.conf`.
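A minimal sketch of that setting — the component class below is BookKeeper's stream storage lifecycle component, which provides the table service in Pulsar 2.x releases:

```properties
# conf/bookkeeper.conf: enable the table service needed by stateful functions
extraServerComponents=org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent
```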
Once you apply the desired configuration in
conf/bookkeeper.conf, you can start up a bookie on each of your BookKeeper hosts. You can start up each bookie either in the background, using nohup, or in the foreground.
To start the bookie in the background, use the
pulsar-daemon CLI tool:
$ bin/pulsar-daemon start bookie
To start the bookie in the foreground:
$ bin/bookkeeper bookie
You can verify that a bookie works properly by running the `bookiesanity` command of the BookKeeper shell on it:
$ bin/bookkeeper shell bookiesanity
This command creates an ephemeral BookKeeper ledger on the local bookie, writes a few entries, reads them back, and finally deletes the ledger.
After you start all the bookies, you can use the `simpletest` command of the BookKeeper shell on any bookie node to verify that all the bookies in the cluster are up and running.
$ bin/bookkeeper shell simpletest --ensemble <num-bookies> --writeQuorum <num-bookies> --ackQuorum <num-bookies> --numEntries <num-entries>
This command creates a
num-bookies sized ledger on the cluster, writes a few entries, and finally deletes the ledger.
Deploy Pulsar brokers
Pulsar brokers are the last thing you need to deploy in your Pulsar cluster. Brokers handle Pulsar messages and provide the administrative interface of Pulsar. A good choice is to run 3 brokers, one for each machine that already runs a BookKeeper bookie.
The most important element of broker configuration is ensuring that each broker is aware of the ZooKeeper cluster that you have deployed. Make sure that the `zookeeperServers` and `configurationStoreServers` parameters are set correctly. In this case, since you only have one cluster and no separate configuration store, the `configurationStoreServers` parameter can point to the same ZooKeeper connection string as `zookeeperServers`.
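For example, assuming the three ZooKeeper hosts used earlier, the relevant lines in `conf/broker.conf` might look like this:

```properties
# Local ZooKeeper quorum for this cluster
zookeeperServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.example.com:2181
# With a single cluster and no separate configuration store, reuse the same quorum
configurationStoreServers=zk1.us-west.example.com:2181,zk2.us-west.example.com:2181,zk3.us-west.example.com:2181
```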
You also need to specify the cluster name (matching the name that you provide when you initialize the metadata of the cluster):
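For example, with the cluster name used in the metadata-initialization step earlier in this guide:

```properties
# conf/broker.conf: must match the --cluster value passed to initialize-cluster-metadata
clusterName=pulsar-cluster-1
```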
In addition, you need to match the broker and web service ports provided when you initialize the metadata of the cluster (especially when you use a different port from default):
```properties
brokerServicePort=6650
brokerServicePortTls=6651
webServicePort=8080
webServicePortTls=8443
```
If you deploy Pulsar on a single node, remember to adjust the ledger replication settings in `conf/broker.conf` to match the single bookie:

```properties
# Number of bookies to use when creating a ledger
managedLedgerDefaultEnsembleSize=1
# Number of copies to store for each message
managedLedgerDefaultWriteQuorum=1
# Number of guaranteed copies (acks to wait before write is complete)
managedLedgerDefaultAckQuorum=1
```
Enable Pulsar Functions (optional)
If you want to enable Pulsar Functions, follow the instructions below:
- Edit `conf/broker.conf` to enable the functions worker, by setting `functionsWorkerEnabled` to `true`.
- Edit `conf/functions_worker.yml` and set `pulsarFunctionsCluster` to the cluster name that you provide when you initialize the metadata of the cluster.
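As a sketch, the `conf/broker.conf` edit looks like this; in `conf/functions_worker.yml`, the corresponding line would be `pulsarFunctionsCluster: pulsar-cluster-1` (assuming the cluster name used earlier in this guide):

```properties
# conf/broker.conf: run a functions worker inside each broker
functionsWorkerEnabled=true
```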
If you want to learn about more options for deploying the functions worker, check out Deploy and manage functions worker.
You can then provide any other configuration changes that you want in the
conf/broker.conf file. Once you decide on a configuration, you can start up the brokers for your Pulsar cluster. Like ZooKeeper and BookKeeper, you can start brokers either in the foreground or in the background, using nohup.
You can start a broker in the foreground using the
pulsar broker command:
$ bin/pulsar broker
You can start a broker in the background using the
pulsar-daemon CLI tool:
$ bin/pulsar-daemon start broker
Once you successfully start up all the brokers that you intend to use, your Pulsar cluster should be ready to go!
Connect to the running cluster
Once your Pulsar cluster is up and running, you should be able to connect with it using Pulsar clients. One such client is the
pulsar-client tool, which is included with the Pulsar binary package. The
pulsar-client tool can publish messages to and consume messages from Pulsar topics and thus provide a simple way to make sure that your cluster runs properly.
To use the
pulsar-client tool, first modify the client configuration file in
`conf/client.conf` in your binary package. You need to change the values of `webServiceUrl` and `brokerServiceUrl`, replacing `localhost` (the default) with the DNS name that you assign to your broker/bookie hosts. The following is an example:
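For example, with the broker DNS name used earlier in this guide (replace it with your own), `conf/client.conf` would contain something like:

```properties
# conf/client.conf: point the client tools at your cluster's DNS name
webServiceUrl=http://pulsar.us-west.example.com:8080
brokerServiceUrl=pulsar://pulsar.us-west.example.com:6650
```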
If you don't have a DNS server, you can specify a multi-host service URL like this:

```properties
webServiceUrl=http://host1:8080,host2:8080,host3:8080
brokerServiceUrl=pulsar://host1:6650,host2:6650,host3:6650
```
Once you do that, you can publish a message to a Pulsar topic:
```shell
$ bin/pulsar-client produce \
  persistent://public/default/test \
  -n 1 \
  -m "Hello Pulsar"
```
You may need to use a different cluster name in the topic if you specified a cluster name different from `pulsar-cluster-1`.
This command publishes a single message to the Pulsar topic. In addition, you can subscribe to the Pulsar topic in a different terminal before publishing messages, as below:
```shell
$ bin/pulsar-client consume \
  persistent://public/default/test \
  -n 100 \
  -s "consumer-test" \
  -t "Exclusive"
```
Once you successfully publish the message above to the topic, you should see it in the standard output:
```
----- got message -----
Hello Pulsar
```
If you have enabled Pulsar Functions, you can also try out Pulsar Functions now.

Create an ExclamationFunction:
```shell
bin/pulsar-admin functions create \
  --jar examples/api-examples.jar \
  --classname org.apache.pulsar.functions.api.examples.ExclamationFunction \
  --inputs persistent://public/default/exclamation-input \
  --output persistent://public/default/exclamation-output \
  --tenant public \
  --namespace default \
  --name exclamation
```
Check if the function runs as expected by triggering the function.
bin/pulsar-admin functions trigger --name exclamation --trigger-value "hello world"
You should see `hello world!` as the output.