· 6 min read

This August, we concluded Pulsar Summit SF, our first-ever in-person event in North America. It featured more than 12 breakout sessions and 5 keynotes, and drew over 200 attendees from Apple, Blizzard, IBM, Optum, Iterable, Twitter, Uber, and many more. The conference showed growing adoption of Pulsar and rising interest in messaging and streaming. Now, we are excited to invite you to Pulsar Summit Asia 2022 to explore the latest messaging and streaming technologies!

Held on November 19th and 20th, this two-day virtual event will feature 36 sessions by developers, engineers, architects, and technologists from ByteDance, Huawei, Tencent, Nippon Telegraph and Telephone Corporation (NTT) Software Innovation Center, Yum China, Netease, vivo, WeChat, Nutanix, StreamNative, and many more. It will include sessions on Pulsar use cases, its ecosystem, operations, and technology deep dives.

· 3 min read

The Apache Pulsar community releases version 2.7.5! 23 contributors provided improvements and bug fixes that delivered 89 commits. Thanks for all your contributions.

The highlight of the 2.7.5 release is that it fixes some critical bugs on broker, proxy, and storage, including message/data loss, broker deadlock, and connection leak. Note that 2.7.5 is the last release of 2.7.x.

This blog walks through the most noteworthy changes. For the complete list, including all feature enhancements and bug fixes, check out the Pulsar 2.7.5 Release Notes.

Fixed the deadlock triggered by a metadata cache miss while checking replications. PR-16889

Issue

After the changes in #12340, a couple of places still make blocking calls. These calls occupy all the ordered scheduler threads, preventing the callbacks from completing until the 30-second timeout expires.

Resolution

Change the blocking calls to async mode on the metadata callback thread.
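The general pattern behind this kind of fix can be sketched as follows. This is a minimal illustration and not the actual Pulsar patch: the class, the `fetchReplicationCount` lookup, and the check itself are hypothetical stand-ins for a metadata store call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class AsyncMetadataSketch {
    // Hypothetical stand-in for a metadata lookup that completes on another thread.
    static CompletableFuture<Integer> fetchReplicationCount(String topic) {
        return CompletableFuture.supplyAsync(() -> topic.length());
    }

    // Blocking style: the calling thread is parked until the lookup completes
    // or the timeout expires. If every scheduler thread does this, no thread
    // is left to run the callbacks that would complete the lookups.
    static boolean checkBlocking(String topic) throws Exception {
        return fetchReplicationCount(topic).get(30, TimeUnit.SECONDS) > 0;
    }

    // Async style: the caller returns immediately and the check runs as a
    // continuation once the lookup completes.
    static CompletableFuture<Boolean> checkAsync(String topic) {
        return fetchReplicationCount(topic).thenApply(count -> count > 0);
    }

    public static void main(String[] args) {
        // join() is used here only to print the result in a demo.
        System.out.println(checkAsync("my-topic").join());
    }
}
```

The async variant never parks the calling thread, so a thread pool of any size can keep making progress on completions.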

Fixed the deadlock when using the key_shared mode. PR-11965

Issue

When consumers use the key_shared mode, race conditions in the broker may cause a deadlock and leave many connections in the CLOSE_WAIT state.

Resolution

Release the lock before invoking the callback in the asyncDelete function of ManagedCursorImpl.
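The idea of releasing a lock before running caller-supplied code can be sketched as below. This is an illustration under assumptions, not the ManagedCursorImpl source: the class, its counter, and the choice of ReentrantLock are all hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

class UnlockBeforeCallbackSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int deletedCount = 0;

    // Hypothetical delete operation: mutate state under the lock, then
    // release the lock before handing control to the callback.
    void asyncDelete(Consumer<Integer> callback) {
        int snapshot;
        lock.lock();
        try {
            deletedCount++;
            snapshot = deletedCount;
        } finally {
            lock.unlock();
        }
        // The callback runs with no lock held, so whatever it waits on,
        // it cannot keep other threads from acquiring this lock.
        callback.accept(snapshot);
    }

    boolean isLockHeldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        UnlockBeforeCallbackSketch cursor = new UnlockBeforeCallbackSketch();
        cursor.asyncDelete(count ->
                System.out.println("deleted=" + count
                        + " lockHeld=" + cursor.isLockHeldByCurrentThread()));
    }
}
```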

Fixed the message loss issue due to ledger rollover. PR-14703

Issue

If users set managedLedgerMaxLedgerRolloverTimeMinutes to a value greater than 0 and the rollover happens while the ManagedLedger state is CreatingLedger, the messages written during that time are lost.

Resolution

Roll over only when the ledger state is LedgerOpened.
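A state guard of this kind might look like the following sketch. The state names CreatingLedger and LedgerOpened come from the description above. The class, the method signature, and the extra ClosingLedger value are assumptions made for illustration.

```java
class RolloverGuardSketch {
    // Illustrative subset of ledger states; only the first two are named in the text.
    enum State { CreatingLedger, LedgerOpened, ClosingLedger }

    // Hypothetical check: time-based rollover is allowed only when it is
    // enabled, the ledger is fully opened, and the ledger is old enough.
    static boolean shouldRollover(State state, long ledgerAgeMinutes, long maxRolloverMinutes) {
        if (maxRolloverMinutes <= 0) {
            return false; // time-based rollover is disabled
        }
        if (state != State.LedgerOpened) {
            return false; // a ledger is still being created or closed
        }
        return ledgerAgeMinutes >= maxRolloverMinutes;
    }

    public static void main(String[] args) {
        System.out.println(shouldRollover(State.CreatingLedger, 20, 10));
        System.out.println(shouldRollover(State.LedgerOpened, 20, 10));
    }
}
```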

Fixed the port exhaustion and connection issues in Pulsar Proxy. PR-14078

Issue

Pulsar Proxy can get into a state where it stops proxying Broker connections while Admin API proxying keeps working.

Resolution

Optimize the proxy connection to fail fast when the target broker isn't active. Fix the race conditions in Pulsar Proxy that occur when establishing a connection and lead to invalid states and hanging connections. Add connection timeout handling to proxy connections. Add read timeout handling to incoming connections and proxied connections.

Fixed the compaction data loss due to missed compaction properties during cursor reset. PR-16404

Issue

The compaction reader seeks to the earliest position to read data from the topic. However, the compaction properties are lost during cursor reset, so the compaction subscription is initialized without a compaction horizon and the compaction reader skips the last compacted data. This happens only when the compaction subscription is initialized, and it can be triggered by load balancing or by manually unloading the topic.

Resolution

Keep the properties when resetting a cursor that is used for data compaction. Also copy the properties to the new mark-delete entry when the managed ledger internally advances the cursor. This applies to all topics, not only compacted ones: the internal task should not lose the properties when trimming the cursor.

What’s Next?

If you are interested in learning more about Pulsar 2.7.5, you can download and try it out now!

For more information about the Apache Pulsar project and current progress, visit the Pulsar website, follow the project on Twitter @apache_pulsar, and join Pulsar Slack!

· 4 min read

Pulsar Summit is the conference dedicated to Apache Pulsar, and the messaging and event streaming community. The conference gathers an international audience of developers, data architects, data scientists, Apache Pulsar committers and contributors, as well as the messaging and streaming community. Together, they share experiences, exchange ideas and knowledge, and receive hands-on training sessions led by Pulsar experts.

In January this year, Pulsar Summit Asia 2021 (delayed due to the COVID-19 pandemic) featured more than 25 interactive sessions by technologists, developers, software engineers, and software architects from StreamNative, BIGO, China Mobile, Nutanix, Tencent, and more. The conference drew over 1000 attendees around the world, including attendees from top technology, internet, and media companies, such as Tencent, TikTok, Alibaba, and Microsoft.

Pulsar Summit Asia 2022 will be hosted virtually on November 19th and 20th, 2022. It is expected to cover the pivotal topics and technologies at the core of Apache Pulsar.

· 4 min read

The Apache Pulsar community releases version 2.9.3! 50 contributors provided improvements and bug fixes that delivered 200+ commits. Thanks for all your contributions.

The highlight of the 2.9.3 release is the introduction of 30+ transaction fixes and improvements. Early adopters of Pulsar transactions have run them in production over the long term and reported valuable findings from real applications, which gives the Pulsar community the opportunity to make a difference.

This blog walks through the most noteworthy changes. For the complete list including all feature enhancements and bug fixes, check out the Pulsar 2.9.3 Release Notes.

Enabled cursor data compression to reduce persistent cursor data size. PR-14542

Issue

The cursor data is managed by the ZooKeeper/Etcd metadata store. When the data size increases, it may take too much time to pull the data, and brokers may end up writing large chunks of data to the ZooKeeper/Etcd metadata store.

Resolution

Provide the ability to enable compression mechanisms to reduce cursor data size and the pulling time.
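To illustrate why compression helps here, the sketch below compresses a repetitive, cursor-like payload and restores it. Everything in it is an assumption made for the demo: the payload format is invented, and GZIP is used only because it ships with the JDK, not because it is the codec Pulsar uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CursorCompressionSketch {
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes(); // requires Java 9 or later
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Invented payload: many similar acknowledged ranges, which compress well.
    static byte[] sampleCursorData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("ledger=42,start=").append(i * 2).append(",end=").append(i * 2 + 1).append(';');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = sampleCursorData();
        byte[] packed = compress(raw);
        System.out.println("raw=" + raw.length + " bytes, compressed=" + packed.length + " bytes");
    }
}
```

A smaller payload means less data written to and pulled from the metadata store, at the cost of some CPU time for compression and decompression.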

Reduced the memory occupied by metadataPositions and avoided OOM. PR-15137

Issue

The map metadataPositions in MLPendingAckStore is used to clear useless data in PendingAck. Its key is a position persisted in PendingAck, and its value is the max position acked by an operation. The store checks whether the max position is smaller than the subscription cursor’s markDeletePosition; if it is, the log cursor marks the position as deleted. This causes two main issues:

  • In normal cases, this map stores all transaction ack operations. This is a waste of memory and CPU.
  • If a transaction that has not been committed for a long time acks a message in a later position, the map will not be cleaned up, which finally leads to OOM (out-of-memory).

Resolution

Regularly store a small amount of data according to certain rules. For more detailed implementation, refer to PIP-153.

Checked lowWaterMark before appending transaction entries to Transaction Buffer. PR-15424

Issue

When a client sends messages using a previously committed transaction, these messages are visible to consumers unexpectedly.

Resolution

Add a map to store the lowWaterMark of Transaction Coordinator in Transaction Buffer, and check lowWaterMark before appending transaction entries to Transaction Buffer. As a result, clients that send messages using an invalid transaction receive NotAllowedException.
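The check can be sketched roughly as follows. This is a simplified model, not the Transaction Buffer implementation: the class and method names are hypothetical, the sketch assumes that every transaction whose sequence ID is at or below the coordinator's lowWaterMark has already ended, and it uses IllegalStateException as a stand-in for NotAllowedException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LowWaterMarkSketch {
    // Transaction coordinator ID -> highest sequence ID known to have ended (assumed semantics).
    private final Map<Long, Long> lowWaterMarks = new HashMap<>();
    private final List<String> buffer = new ArrayList<>();

    // The mark only moves forward, so keep the larger of the old and new values.
    void updateLowWaterMark(long coordinatorId, long lowWaterMark) {
        lowWaterMarks.merge(coordinatorId, lowWaterMark, Long::max);
    }

    boolean canAppend(long coordinatorId, long txnSequenceId) {
        long mark = lowWaterMarks.getOrDefault(coordinatorId, -1L);
        return txnSequenceId > mark;
    }

    // Stand-in for appending an entry; rejects transactions that have already ended.
    void append(long coordinatorId, long txnSequenceId, String payload) {
        if (!canAppend(coordinatorId, txnSequenceId)) {
            throw new IllegalStateException("transaction has already ended: " + txnSequenceId);
        }
        buffer.add(payload);
    }

    int size() {
        return buffer.size();
    }

    public static void main(String[] args) {
        LowWaterMarkSketch tb = new LowWaterMarkSketch();
        tb.updateLowWaterMark(0L, 10L);
        tb.append(0L, 11L, "accepted");
        System.out.println(tb.canAppend(0L, 5L));
    }
}
```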

Fixed the consumption performance regression. PR-15162

Issue

This performance regression was introduced in 2.10.0, 2.9.1, and 2.8.3. You may find a significant performance drop with message listeners while using the Java client. The root cause is that each message triggers a thread switch from the external thread pool to the internal thread pool and then back to the external thread pool.

Resolution

Avoid the thread switching for each message to improve consumption throughput.

Fixed a deadlock issue of topic creation. PR-15570

Issue

This deadlock issue occurred during topic creation by trying to re-acquire the same StampedLock from the same thread when removing it. This causes the topic to stop serving for a long time and ultimately results in a failure in the deduplication or geo-replication check. The workaround is restarting the broker.
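The root of this class of bug is that java.util.concurrent.locks.StampedLock is not reentrant. The short demo below is not taken from the Pulsar code base. It uses the non-blocking tryWriteLock() so that the second acquisition attempt fails visibly instead of hanging, as a blocking writeLock() call would.

```java
import java.util.concurrent.locks.StampedLock;

class StampedLockReentrySketch {
    // Returns true if the thread that already holds the write lock
    // manages to acquire it a second time.
    static boolean canReacquireWhileHeld() {
        StampedLock lock = new StampedLock();
        long stamp = lock.writeLock();
        try {
            // tryWriteLock() returns 0 when the lock is not available.
            long second = lock.tryWriteLock();
            return second != 0;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public static void main(String[] args) {
        System.out.println(canReacquireWhileHeld()); // prints false
    }
}
```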

Optimized the memory usage of brokers.

Issue

Pulsar has some internal data structures, such as ConcurrentLongLongPairHashMap and ConcurrentLongPairHashMap, which use less memory than collections of boxed types. However, in earlier versions, these data structures could not shrink even after data was removed, which wasted memory in certain situations.

Resolution

Support the shrinking of the internal data structures, such as ConcurrentSortedLongPairSet, ConcurrentOpenHashMap, and so on.

What’s Next?

If you are interested in learning more about Pulsar 2.9.3, you can download and try it out now!

Pulsar Summit San Francisco 2022 will take place on August 18th, 2022. Register now and help us make it an even bigger success by spreading the word on social media!

For more information about the Apache Pulsar project and current progress, visit the Pulsar website, follow the project on Twitter @apache_pulsar, and join Pulsar Slack!

· 5 min read

We are excited to invite you to ApacheCon Asia 2022 to explore the newest tools and tips, and connect with subject-matter experts in various Apache Pulsar-related sessions. The Apache Software Foundation will be holding ApacheCon Asia 2022 online between July 29th and July 31st, 2022. Register now for free to join us for this inspiring three-day event of cutting-edge technologies.

· 4 min read

The Apache Pulsar community releases version 2.10.1! 50 contributors provided improvements and bug fixes that delivered 200+ commits. Thanks for all your contributions.

The highlight of the 2.10.1 release is the introduction of 30+ transaction fixes and improvements. Early adopters of Pulsar transactions have run them in production over the long term and reported valuable findings from real applications, which gives the Pulsar community the opportunity to make a difference.

This blog walks through the most noteworthy changes. For the complete list including all feature enhancements and bug fixes, check out the Pulsar 2.10.1 Release Notes.

Fixed ineffective load manager due to broker’s zero resource usage. PR-15314

Issue

Introduced in 2.10.0, the leader broker’s resource usage (CPU, memory, direct memory…) was always 0 when performing load balance. The root cause is that deserializing the JSON data to ResourceUsage POJO didn’t use the constructor ResourceUsage(double usage, double limit), so the percentage was always 0.
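How a skipped constructor can leave a derived value at 0 is sketched below. The class is a hypothetical simplification and not Pulsar's ResourceUsage: it assumes the percentage is a field computed only in the two-argument constructor, and it imitates a deserializer that creates the object with a no-argument constructor and then fills the fields directly.

```java
class ResourceUsageSketch {
    double usage;
    double limit;
    double percent; // derived value, computed only in the two-argument constructor

    // No-argument constructor, as a field-based deserializer would use.
    ResourceUsageSketch() {
    }

    ResourceUsageSketch(double usage, double limit) {
        this.usage = usage;
        this.limit = limit;
        this.percent = limit > 0 ? usage / limit * 100 : 0;
    }

    // Imitates field-by-field population: usage and limit are set,
    // but percent keeps its default value of 0.
    static ResourceUsageSketch viaFieldPopulation(double usage, double limit) {
        ResourceUsageSketch r = new ResourceUsageSketch();
        r.usage = usage;
        r.limit = limit;
        return r;
    }

    public static void main(String[] args) {
        System.out.println(viaFieldPopulation(50, 100).percent);     // prints 0.0
        System.out.println(new ResourceUsageSketch(50, 100).percent); // prints 50.0
    }
}
```

A load manager that compares brokers by such a percentage would see every broker built the first way as idle, whatever its real load.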

Allow users with produce/consume privileges to get topic schema. PR-15956

Issue

In earlier versions, only users with admin privileges were able to get topic schema, which made schema inconvenient to use.

Resolution

Allow users who have metadata access privileges to get topic schema. Subscribers can be from different teams, and producers and subscribers should be able to get the topic schema themselves instead of asking the tenant admin to do so before publishing and consuming messages.

Fixed the consumption performance regression. PR-15162

Issue

This performance regression was introduced in 2.10.0, 2.9.1, and 2.8.3. You may find a significant performance drop with message listeners while using the Java client. The root cause is that each message triggers a thread switch from the external thread pool to the internal thread pool and then back to the external thread pool.

Resolution

2.10.1 is the first version to fix this issue. It avoids the per-message thread switching to improve consumption throughput.

Fixed a deadlock issue of topic creation. PR-15570

Issue

This deadlock issue occurred during topic creation by trying to re-acquire the same StampedLock from the same thread when removing it. This causes the topic to stop serving for a long time and ultimately results in a failure in the deduplication or geo-replication check. The workaround is restarting the broker.

Fixed key-shared delivery of messages with interleaved delays. PR-15409

Issue

This is a regression issue introduced in 2.10.0. When delayed messages with interleaved delays occurred on a shared/key-shared subscription, many of the messages were not delivered but stayed in the backlog. The reason was that when peeking into getMessagesToReplayNow(), we could not discard the returned set due to untracked message IDs in the delayed message controller.

Optimized the memory usage of brokers.

Issue

Pulsar has some internal data structures, such as ConcurrentLongLongPairHashMap and ConcurrentLongPairHashMap, which use less memory than collections of boxed types. However, in earlier versions, these data structures could not shrink even after data was removed, which wasted memory in certain situations.

Resolution

Support the shrinking of the internal data structures, such as ConcurrentSortedLongPairSet, ConcurrentOpenHashMap, and so on.

What’s Next?

If you are interested in learning more about Pulsar 2.10.1, you can download and try it out now!

Pulsar Summit San Francisco 2022 will take place on August 18th, 2022. Register now and help us make it an even bigger success by spreading the word on social media!

For more information about the Apache Pulsar project and current progress, visit the Pulsar website, follow the project on Twitter @apache_pulsar, and join Pulsar Slack!

· 6 min read

The Apache Pulsar community releases version 2.10. 99 contributors provided improvements and bug fixes that delivered over 800 commits.

· 2 min read

Apache Pulsar is one of the fastest growing, most engaged open source projects, recognized by the Apache Software Foundation as a Top 5 Project based on engagement in 2021. The vitality of any open source project relies on continued community growth and engagement, and this month, the Apache Pulsar community hit another major milestone: welcoming its 500th contributor!

· 4 min read

The Apache Pulsar community releases version 2.9.2! 60 contributors provided improvements and bug fixes that delivered 317 commits.