2.8.1

Note

This section assumes that you have enabled the NFS service and set its shared path. The examples use /Users/test as the shared path of the NFS service.

To offload data to NFS, follow these steps.

Step 1: Install the filesystem offloader

For details, see installation.

Step 2: Mount your NFS to your local filesystem

This example mounts the NFS shared path /Users/test, exported by 192.168.0.103, at the local directory /Users/pulsar_nfs.

mount -t nfs 192.168.0.103:/Users/test /Users/pulsar_nfs
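
Before continuing, it can help to confirm that the share is actually mounted. The following is a minimal sketch, not part of the offloader itself: `is_mount_point` is a hypothetical helper that scans the output of `mount` for the mount point, assuming the usual `<source> on <directory> ...` line layout that Linux and macOS print.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is listed as a mount point
# in the mount table read from standard input (the format printed by `mount`).
is_mount_point() {
    # -F treats the pattern as a fixed string, -q suppresses output.
    grep -q -F " on $1 "
}

# Example usage (commented out so that sourcing this file has no side effects):
# if mount | is_mount_point /Users/pulsar_nfs; then
#     echo "NFS share is mounted"
# else
#     echo "NFS share is NOT mounted"
# fi
```

The helper reads the mount table from standard input rather than calling `mount` itself, so it can also be checked against saved output.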

Step 3: Configure the filesystem offloader driver

As indicated in the configuration section, you need to configure some properties for the filesystem offloader driver before using it. This tutorial assumes that you have configured the filesystem offloader driver as below and run Pulsar in standalone mode.

  1. Set the following configurations in the conf/standalone.conf file.
managedLedgerOffloadDriver=filesystem
fileSystemProfilePath=../conf/filesystem_offload_core_site.xml
  2. Modify the filesystem_offload_core_site.xml file as follows.
<property>
<name>fs.defaultFS</name>
<value>file:///</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>file:///Users/pulsar_nfs</value>
</property>

<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>

<property>
<name>io.seqfile.compress.blocksize</name>
<value>1000000</value>
</property>

<property>
<name>io.seqfile.compression.type</name>
<value>BLOCK</value>
</property>

<property>
<name>io.map.index.interval</name>
<value>128</value>
</property>
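
A quick way to double-check the settings from the first item is to read them back from the configuration file. The sketch below assumes a plain `key=value` file such as conf/standalone.conf; `conf_value` is a hypothetical helper name, not a Pulsar command.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to key $2 in the
# key=value file $1. Commented-out lines (starting with '#') never match
# because the key must start at the beginning of the line.
conf_value() {
    # If the key appears more than once, print only the last assignment.
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Example usage:
# conf_value conf/standalone.conf managedLedgerOffloadDriver
# conf_value conf/standalone.conf fileSystemProfilePath
```

With the configuration shown in this step, the two calls should print `filesystem` and `../conf/filesystem_offload_core_site.xml`.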

Step 4: Offload data from BookKeeper to filesystem

Execute the following commands in the directory where you extracted the Pulsar tarball, for example, ~/path/to/apache-pulsar-2.5.1.

  1. Start Pulsar standalone.

    ./bin/pulsar standalone -a 127.0.0.1
    
  2. To ensure that the generated data is not deleted immediately, set a retention policy, which can be either a size limit or a time limit. The larger the value you set for the retention policy, the longer the data is retained.

    ./bin/pulsarctl namespaces set-retention public/default --size 100M --time 2d
    

    Tip

    For more information about the pulsarctl namespaces set-retention command, including flags, descriptions, default values, and shorthands, see the pulsarctl documentation.

  3. Produce data using pulsar-client.

    ./bin/pulsar-client produce -m "Hello FileSystem Offloader" -n 1000 public/default/fs-test
    
  4. The offloading operation starts after a ledger rollover is triggered. To ensure that data is offloaded successfully, wait until several ledger rollovers have been triggered. In this case, you might need to wait a second. You can check the ledger status using pulsarctl.

    ./bin/pulsarctl topics internal-stats public/default/fs-test
    

    Output

    The data of ledger 696 is not offloaded.

    {
    "version": 1,
    "creationDate": "2020-06-16T21:46:25.807+08:00",
    "modificationDate": "2020-06-16T21:46:25.821+08:00",
    "ledgers": [
    {
        "ledgerId": 696,
        "isOffloaded": false
    }
    ],
    "cursors": {}
    }
    
  5. Wait a second and send more messages to the topic.

    ./bin/pulsar-client produce -m "Hello FileSystem Offloader" -n 1000 public/default/fs-test
    
  6. Check the ledger status using pulsarctl.

    ./bin/pulsarctl topics internal-stats public/default/fs-test
    

    Output

    Ledger 696 has rolled over.

    {
    "version": 2,
    "creationDate": "2020-06-16T21:46:25.807+08:00",
    "modificationDate": "2020-06-16T21:48:52.288+08:00",
    "ledgers": [
    {
        "ledgerId": 696,
        "entries": 1001,
        "size": 81695,
        "isOffloaded": false
    },
    {
        "ledgerId": 697,
        "isOffloaded": false
    }
    ],
    "cursors": {}
    }
    
  7. Trigger the offloading operation manually using pulsarctl.

    ./bin/pulsarctl topic offload -s 0 public/default/fs-test
    

    Output

    Data in ledgers before ledger 697 is offloaded.

    # offload info, the ledgers before 697 will be offloaded
    Offload triggered for persistent://public/default/fs-test for messages before 697:0:-1
    
  8. Check the ledger status using pulsarctl.

    ./bin/pulsarctl topic internal-info public/default/fs-test
    

    Output

    The data of ledger 696 is offloaded.

    {
    "version": 4,
    "creationDate": "2020-06-16T21:46:25.807+08:00",
    "modificationDate": "2020-06-16T21:52:13.25+08:00",
    "ledgers": [
    {
        "ledgerId": 696,
        "entries": 1001,
        "size": 81695,
        "isOffloaded": true
    },
    {
        "ledgerId": 697,
        "isOffloaded": false
    }
    ],
    "cursors": {}
    }
    

    The Capacity Used also changes from 4 KB to 116.46 KB.
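
The status checks in steps 4, 6, and 8 can be summarized with a small filter. This is a minimal sketch that assumes the pretty-printed layout shown in the outputs above, with one key per line; `ledger_offload_status` is a hypothetical helper name, and a real JSON parser is a more robust choice for other layouts.

```shell
#!/bin/sh
# Hypothetical helper: read the JSON printed by the status command on
# standard input and print one "<ledgerId> <isOffloaded>" line per ledger.
# Assumes one key per line, as in the sample outputs.
ledger_offload_status() {
    awk '
        # Remember the most recent ledger ID, keeping only its digits.
        /"ledgerId"/    { id = $2; gsub(/[^0-9]/, "", id) }
        # Strip punctuation from the flag and print it next to that ID.
        /"isOffloaded"/ { state = $2; gsub(/[^a-z]/, "", state); print id, state }
    '
}

# Example usage:
# ./bin/pulsarctl topic internal-info public/default/fs-test | ledger_offload_status
```

For the final output in step 8, this prints `696 true` followed by `697 false`.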
