펄사는 게시-구독 패턴(줄여서 pub-sub)을 기반으로 한다. 이 패턴에서 게시자는 주제에 대해 메시지를 게시한다. 소비자는 그러한 주제를 구독하고, 들어오는 메시지를 처리하며, 처리가 완료되면 알림을 보낸다
소비자가 구독을 하게 되면, 해당 소비자가 연결이 끊기게 되더라도 모든 메세지는 펄사가 보존한다. 보존된 메시지는 소비자가 그 메세지가 성공적으로 처리되었다고 알림을 보낼때에만 삭제처리된다.
메시지는 펄사의 기본 단위이다. 메시지란 게시자가 주제에 대해 게시하는 대상이자, 소비자가 주제로부터 받아와 소비하는 대상이다(그리고 메시지가 처리되면 알림을 준다). 메시지는 우편 시스템의 편지와 유사하다.
|값 / 데이터 페이로드||The data carried by the message. All Pulsar messages carry raw bytes, although message data can also conform to data schemas.|
|Key||메시지는 선택적으로 키로 태깅될 수 있는데, 이는 주제 압축(topic compaction)과 같은 것에 유용할 수 있다.|
|속성||선택적인 사용자 정의 속성의 키/값 표|
|게시자 이름||메시지를 생산한 게시자의 이름(게시자의 이름은 자동으로 기본값이 부여되지만 명시적으로 직접 지정할 수도 있다.)|
|시퀀스 ID||Each Pulsar message belongs to an ordered sequence on its topic. A message's sequence ID is its ordering in that sequence.|
|게시 시간||메시지가 게시된 시점의 타임스탬프(게시자에 의해 자동으로 적용된다).|
|이벤트 시간||어떤 이벤트(예를 들어 메시지가 처리되었을 때와 같은)가 발생했을 때 이를 나타낼 수 있도록 메시지에 선택적으로 붙이는 타임스탬프. The event time of a message is 0 if none is explicitly set.|
|유형화된 메시지 빌더(TypedMessageBuilder)||유형화된 메시지 빌더는 메시지를 구조화, 혹은 생성하기 위해 사용된다. 메시지 키와 값과 같은 메시지 속성을 TypedMessageBuilder로 설정할 수 있다. TypedMessageBuilder를 설정할 때 모범 사례는 키를 문자열로 설정하는 것이다. 키를 AVRO 객체와 같은 다른 유형으로 설정하게 되면 키는 바이트 형태로 전송되고, 소비자 입장에서 AVRO 객체를 다시 불러오기 어렵다.|
더 깊이 있는 펄사 메시지 콘텐츠를 보고 싶다면, 펄사 이진 프로토콜 을 살펴보라.
게시자는 주제에 붙는 프로세스이자, 처리를 위해 펄사에 메시지를 게시하는 브로커이다.
게시자는 브로커에 동기적(sync) 이나 비동기적(async) 으로 메시지를 보낼 수 있다.
|동기적 송신||The producer will wait for acknowledgement from the broker after sending each message. If acknowledgment isn't received then the producer will consider the send operation a failure.|
|Async send||게시자는 메시지를 블로킹 큐에 집어넣은 후 바로 리턴한다. 그 후 클라이언트 라이브러리는 백그라운드에서 브로커에 메시지를 송신할 것이다. 큐가 가득 차면 (최대 크기 설정 가능), 게시자에게 전달된 입력변수(arguments)에 따라 API를 호출할 때 게시자가 차단되거나 바로 실패할 수도 있다.|
Messages published by producers can be compressed during transportation in order to save bandwidth. Pulsar currently supports the following types of compression:
When batching is enabled, the producer accumulates and sends a batch of messages in a single request. The batch size is defined by the maximum number of messages and the maximum publish latency. Therefore, the backlog size represents the total number of batches instead of the total number of messages.
In Pulsar, batches are tracked and stored as single units rather than as individual messages. Under the hood, the consumer unbundles a batch into individual messages. However, scheduled messages (configured through the
deliverAt or the
deliverAfter parameter) are always sent as individual messages even batching is enabled.
In general, a batch is acknowledged when all its messages are acknowledged by the consumer. This means unexpected failures, negative acknowledgements, or acknowledgement timeouts can result in redelivery of all messages in a batch, even if some of the messages have already been acknowledged.
To avoid redelivering acknowledged messages in a batch to the consumer, Pulsar introduces batch index acknowledgement since Pulsar 2.6.0. When batch index acknowledgement is enabled, the consumer filters out the batch index that has been acknowledged and sends the batch index acknowledgement request to the broker. The broker maintains the batch index acknowledgement status and tracks the acknowledgement status of each batch index to avoid dispatching acknowledged messages to the consumer. When all indexes of the batch message are acknowledged, the batch message is deleted.
By default, batch index acknowledgement is disabled (
acknowledgmentAtBatchIndexLevelEnabled=false). You can enable batch index acknowledgement by setting the
acknowledgmentAtBatchIndexLevelEnabled parameter to
true at the broker side. Enabling batch index acknowledgement may bring more memory overheads. So, perform this operation with caution.
A consumer is a process that attaches to a topic via a subscription and then receives messages.
Messages can be received from brokers either synchronously (sync) or asynchronously (async).
|동기 수신||동기 수신은 메시지가 사용 가능할 때까지 차단된 상태로 있을 것이다.|
|Async receive||비동기 수신은 신규 메시지가 사용 가능해지면 즉시 미래의 값, 예를 들면 자바 언어로 된 |
Client libraries provide listener implementation for consumers. For example, the Java client provides a MesssageListener
interface. In this interface, the
received method is called whenever a new message is received.
소비자가 성공적으로 메시지를 소비하였을 경우에, 소비자는 브로커에게 알림 요청을 보낸다. 이 메시지는 영구적으로저장</ㅁ>되며 모든 구독이 관련 알림을 받았을 때에만 삭제된다. 소비자가 알림을 보내온 메시지를 저장하고 싶을 경우에, 메시지 보관 정책 을 설정할 필요가 있다.
For a batch message, if batch index acknowledgement is enabled, the broker maintains the batch index acknowledgement status and tracks the acknowledgement status of each batch index to avoid dispatching acknowledged messages to the consumer. When all indexes of the batch message are acknowledged, the batch message is deleted. For details about the batch index acknowledgement, see batching.
Messages can be acknowledged either one by one or cumulatively. With cumulative acknowledgement, the consumer only needs to acknowledge the last message it received. All messages in the stream up to (and including) the provided message will not be re-delivered to that consumer.
Messages can be acknowledged in the following two ways:
- Messages are acknowledged individually. With individual acknowledgement, the consumer needs to acknowledge each message and sends an acknowledgement request to the broker.
- Messages are acknowledged cumulatively. With cumulative acknowledgement, the consumer only needs to acknowledge the last message it received. All messages in the stream up to (and including) the provided message are not re-delivered to that consumer.
Note Cumulative acknowledgement cannot be used in the shared subscription mode, because the shared subscription mode involves multiple consumers having access to the same subscription. In the shared subscription mode, messages can be acknowledged individually.
When a consumer does not consume a message successfully at a time, and wants to consume the message again, the consumer can send a negative acknowledgement to the broker, and then the broker will redeliver the message.
Messages can be negatively acknowledged one by one or cumulatively, which depends on the consumption subscription mode.
In the exclusive and failover subscription modes, consumers only negatively acknowledge the last message they have received.
In the shared and Key_Shared subscription modes, you can negatively acknowledge messages individually.
Be aware that negative acknowledgment on ordered subscription types, such as Exclusive, Failover and Key_Shared, can cause failed messages to arrive consumers out of the original order.
Note If batching is enabled, other messages in the same batch may be redelivered to the consumer as well as the negatively acknowledged messages.
When a message is not consumed successfully, and you want to trigger the broker to redeliver the message automatically, you can adopt the unacknowledged message automatic re-delivery mechanism. Client will track the unacknowledged messages within the entire
acktimeout time range, and send a
redeliver unacknowledged messages request to the broker automatically when the acknowledgement timeout is specified.
Note If batching is enabled, other messages in the same batch may be redelivered to the consumer as well as the unacknowledged messages.
Prefer negative acknowledgements over acknowledgement timeout. 부정적인 알림은 더 정확하게 각 독립 메시지의 재전송을 통제하며, 메시지의 처리 시간이 알림 타임아웃 시간을 초과하였을 때에 유효하지 않은 재전송을 사전에 피할 수 있다.
데드 레터 토픽
Dead letter topic enables you to consume new messages when some messages cannot be consumed successfully by a consumer. In this mechanism, messages that are failed to be consumed are stored in a separate topic, which is called dead letter topic. You can decide how to handle messages in the dead letter topic.
The following example shows how to enable dead letter topic in a Java client using the default dead letter topic:
Consumer<byte> consumer = pulsarClient.newConsumer(Schema.BYTES) .topic(topic) .subscriptionName("my-subscription") .subscriptionType(SubscriptionType.Shared) .deadLetterPolicy(DeadLetterPolicy.builder() .maxRedeliverCount(maxRedeliveryCount) .build()) .subscribe();
The default dead letter topic uses this format:
If you want to specify the name of the dead letter topic, use this Java client example:
Consumer<byte> consumer = pulsarClient.newConsumer(Schema.BYTES) .topic(topic) .subscriptionName("my-subscription") .subscriptionType(SubscriptionType.Shared) .deadLetterPolicy(DeadLetterPolicy.builder() .maxRedeliverCount(maxRedeliveryCount) .deadLetterTopic("your-topic-name") .build()) .subscribe();
Dead letter topic depends on message re-delivery. Messages are redelivered either due to acknowledgement timeout or negative acknowledgement. If you are going to use negative acknowledgement on a message, make sure it is negatively acknowledged before the acknowledgement timeout.
Currently, dead letter topic is enabled only in the shared subscription mode.
Retry letter topic
For many online business systems, a message needs to be re-consumed because any exception occurs in the business logic processing. Generally, users hope that they can flexibly configure the delay time for re-consuming the failed messages. In this case, you can configure the producer to send messages to both the business topic and the retry letter topic and you can enable automatic retry on the consumer. When automatic retry is enabled on the consumer, a message is stored in the retry letter topic if the messages fail to be consumed and therefore the consumer can automatically consume failed messages from the retry letter topic after a specified delay time.
By default, automatic retry is disabled. You can set
true to enable automatic retry on the consumer.
This example shows how to consumer messages from a retry letter topic.
Consumer<byte> consumer = pulsarClient.newConsumer(Schema.BYTES) .topic(topic) .subscriptionName("my-subscription") .subscriptionType(SubscriptionType.Shared) .enableRetry(true) .receiverQueueSize(100) .deadLetterPolicy(DeadLetterPolicy.builder() .maxRedeliverCount(maxRedeliveryCount) .retryLetterTopic("persistent://my-property/my-ns/my-subscription-custom-Retry") .build()) .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest) .subscribe();
|토픽 이름 구성요소||Description|
|이는 토픽의 유형을 식별한다. 펄사는 두가지 유형의 토픽을 지원한다: 지속적 토픽 과 비지속적 토픽 (지속적 토픽이 기본값이기 때문에, 토픽의 유형을 명시적으로 지정하지 않을 경우 토픽의 유형은 지속적 토픽이 된다.) 지속적 토픽의 경우, 모든 메시지가 안전하게 디스크에보존된다 이는 곧, 브로커가 스탠드얼론이 아닌 한 다중 디스크에 저장된다는 의미이며, 반면에 비지속적 토픽 의 경우에는 스토리지 디스크에 보존되지 않는다는 뜻이다.|
|The topic's tenant within the instance. Tenants are essential to multi-tenancy in Pulsar and can be spread across clusters.|
|토픽의 관리 단위로서, 관련 토픽을 위한 그룹화 메커니즘으로 작동한다. 대부분의 토픽 설정은 네임스페이스 수준에서 이루어진다. 각 테넌트는 여러 개의 네임스페이스를 가질 수 있다.|
|The final part of the name. Topic names are freeform and have no special meaning in a Pulsar instance.|
No need to explicitly create new topics You don't need to explicitly create topics in Pulsar. If a client attempts to write or receive messages to/from a topic that does not yet exist, Pulsar will automatically create that topic under the namespace provided in the topic name. If no tenant or namespace is specified when a client creates a topic, the topic is created in the default tenant and namespace. You can also create a topic in a specified tenant and namespace, such as
my-topictopic is created in the
my-namespacenamespace of the
A namespace is a logical nomenclature within a tenant. A tenant can create multiple namespaces via the admin API. For instance, a tenant with different applications can create a separate namespace for each application. A namespace allows the application to create and manage a hierarchy of topics. The topic
my-tenant/app1 is a namespace for the application
my-tenant. You can create any number of topics under the namespace.
A subscription is a named configuration rule that determines how messages are delivered to consumers. 펄사에는 네 가지 사용 가능한 구독 모드가 있다. 배타적 모드, 공유 모드, 장애극복 모드 그리고 키 공유 모드. These modes are illustrated in the figure below.
Pub-Sub, Queuing, or Both There is a lot of flexibility in how to combine subscriptions: * If you want to achieve traditional "fan-out pub-sub messaging" among consumers, you can make each consumer have a unique subscription name (exclusive) * If you want to achieve "message queuing" among consumers, you can make multiple consumers have the same subscription name (shared, failover, key_shared) * If you want to do both simultaneously, you can have some consumers with exclusive subscriptions while others do not
In exclusive mode, only a single consumer is allowed to attach to the subscription. If more than one consumer attempts to subscribe to a topic using the same subscription, the consumer receives an error.
In the diagram below, only Consumer A-0 is allowed to consume messages.
Exclusive mode is the default subscription mode.
In failover mode, multiple consumers can attach to the same subscription. A master consumer is picked for non-partitioned topic or each partition of partitioned topic and receives messages. When the master consumer disconnects, all (non-acknowledged and subsequent) messages are delivered to the next consumer in line.
For partitioned topics, broker will sort consumers by priority level and lexicographical order of consumer name. Then broker will try to evenly assigns topics to consumers with the highest priority level.
For non-partitioned topic, broker will pick consumer in the order they subscribe to the non partitioned topic.
In the diagram below, Consumer-B-0 is the master consumer while Consumer-B-1 would be the next consumer in line to receive messages if Consumer-B-0 is disconnected.
In shared or round robin mode, multiple consumers can attach to the same subscription. Messages are delivered in a round robin distribution across consumers, and any given message is delivered to only one consumer. When a consumer disconnects, all the messages that were sent to it and not acknowledged will be rescheduled for sending to the remaining consumers.
In the diagram below, Consumer-C-1 and Consumer-C-2 are able to subscribe to the topic, but Consumer-C-3 and others could as well.
Limitations of shared mode When using shared mode, be aware that: * Message ordering is not guaranteed. * You cannot use cumulative acknowledgment with shared mode.
키 공유 모드
In Key_Shared mode, multiple consumers can attach to the same subscription. Messages are delivered in a distribution across consumers and message with same key or same ordering key are delivered to only one consumer. No matter how many times the message is re-delivered, it is delivered to the same consumer. When a consumer connected or disconnected will cause served consumer change for some key of message.
Limitations of Key_Shared mode When you use Key_Shared mode, be aware that: * You need to specify a key or orderingKey for messages. * You cannot use cumulative acknowledgment with Key_Shared mode. * Your producers should disable batching or use a key-based batch builder.
키 공유 모드의 구독을 비활성화 하기 위해서는
broker.config 파일을 수정하면 된다.
다중 토픽 구독
When a consumer subscribes to a Pulsar topic, by default it subscribes to one specific topic, such as
persistent://public/default/my-topic. As of Pulsar version 1.23.0-incubating, however, Pulsar consumers can simultaneously subscribe to multiple topics. You can define a list of topics in two ways:
- On the basis of a regular expression (regex), for example
- 토픽의 리스트를 명시적으로 정의하여 다중 토픽 구독을 사용할 수 있다.
When subscribing to multiple topics by regex, all topics must be in the same namespace
When subscribing to multiple topics, the Pulsar client will automatically make a call to the Pulsar API to discover the topics that match the regex pattern/list and then subscribe to all of them. If any of the topics don't currently exist, the consumer will auto-subscribe to them once the topics are created.
No ordering guarantees across multiple topics When a producer sends messages to a single topic, all messages are guaranteed to be read from that topic in the same order. However, these guarantees do not hold across multiple topics. So when a producer sends message to multiple topics, the order in which messages are read from those topics is not guaranteed to be the same.
Here are some multi-topic subscription examples for Java:
import java.util.regex.Pattern; import org.apache.pulsar.client.api.Consumer; import org.apache.pulsar.client.api.PulsarClient; PulsarClient pulsarClient = // Instantiate Pulsar client object // Subscribe to all topics in a namespace Pattern allTopicsInNamespace = Pattern.compile("persistent://public/default/.*"); Consumer<byte> allTopicsConsumer = pulsarClient.newConsumer() .topicsPattern(allTopicsInNamespace) .subscriptionName("subscription-1") .subscribe(); // Subscribe to a subsets of topics in a namespace, based on regex Pattern someTopicsInNamespace = Pattern.compile("persistent://public/default/foo.*"); Consumer<byte> someTopicsConsumer = pulsarClient.newConsumer() .topicsPattern(someTopicsInNamespace) .subscriptionName("subscription-1") .subscribe();
For code examples, see:
Normal topics can be served only by a single broker, which limits the topic's maximum throughput. Partitioned topics are a special type of topic that can be handled by multiple brokers, which allows for much higher throughput.
Behind the scenes, a partitioned topic is actually implemented as N internal topics, where N is the number of partitions. When publishing messages to a partitioned topic, each message is routed to one of several brokers. The distribution of partitions across brokers is handled automatically by Pulsar.
The diagram below illustrates this:
Here, the topic Topic1 has five partitions (P0 through P4) split across three brokers. Because there are more partitions than brokers, two brokers handle two partitions a piece, while the third handles only one (again, Pulsar handles this distribution of partitions automatically).
Decisions about routing and subscription modes can be made separately in most cases. In general, throughput concerns should guide partitioning/routing decisions while subscription decisions should be guided by application semantics.
There is no difference between partitioned topics and normal topics in terms of how subscription modes work, as partitioning only determines what happens between when a message is published by a producer and processed and acknowledged by a consumer.
Partitioned topics need to be explicitly created via the admin API. The number of partitions can be specified when creating the topic.
When publishing to partitioned topics, you must specify a routing mode. The routing mode determines which partition---that is, which internal topic---each message should be published to.
There are three MessageRoutingMode available:
|키 값이 제공되지 않으면, 제공자가 라운드 로빈 방식으로 최대 처리율(throughput) 을 달성하기 위해 모든 파티션에 걸쳐 메시지를 게시할 것이다. 각 메시지 별로 라운드 로빈 방식이 실행되지는 않고, 배칭이 효과적으로 실행될 수 있도록 보장하기 위해 동일한 배칭 지연 시간의 한도(경계) 내에 설정된다는 것에 주의하라. 키 값이 메시지에 설정되고 나면, 분할된 게시자가 키를 해시하여 특정 파티션에 메시지를 할당할 것이다. 이것이 기본 모드다.|
|키 값이 제공되지 않으면, 게시자는 단일 파티션을 랜덤하게 선택하여 모든 메시지를 해당 파티션에 게시할 것이다. 키 값이 메시지에 설정되고 나면, 분할된 게시자가 키를 해시하여 특정 파티션에 메시지를 할당할 것이다.|
|특정 메시지가 할당되는 파티션을 결정하기 위해 커스텀 메시지 라우터 구현(custom message router implementation) 이 호출되도록 한다. 사용자는 자바 클라이언트와 MessageRouter 인터페이스를 구현하여 커스텀 라우팅 모드를 생성할 수 있다.|
The ordering of messages is related to MessageRoutingMode and Message Key. Usually, user would want an ordering of Per-key-partition guarantee.
If there is a key attached to message, the messages will be routed to corresponding partitions based on the hashing scheme specified by HashingScheme
, when using either
|Ordering guarantee||Description||라우팅 모드와 키|
|단일 키 별 파티션||같은 키값을 가진 모든 메시지는 순서대로 정렬되어 동일한 파티션 내에 위치한다.|
|게시자 별||같은 게시자로부터 온 모든 메시지는 순서대로 정렬될 것이다.||단일 파티션(|
HashingScheme is an enum that represent sets of standard hashing functions available when choosing the partition to use for a particular message.
There are 2 types of standard hashing functions available:
Murmur3_32Hash. The default hashing function for producer is
JavaStringHash. Please pay attention that
JavaStringHash is not useful when producers can be from different multiple language clients, under this use case, it is recommended to use
By default, Pulsar persistently stores all unacknowledged messages on multiple BookKeeper bookies (storage nodes). Data for messages on persistent topics can thus survive broker restarts and subscriber failover.
Pulsar also, however, supports non-persistent topics, which are topics on which messages are never persisted to disk and live only in memory. When using non-persistent delivery, killing a Pulsar broker or disconnecting a subscriber to a topic means that all in-transit messages are lost on that (non-persistent) topic, meaning that clients may see message loss.
Non-persistent topics have names of this form (note the
non-persistent in the name):
For more info on using non-persistent topics, see the Non-persistent messaging cookbook.
In non-persistent topics, brokers immediately deliver messages to all connected subscribers without persisting them in BookKeeper. If a subscriber is disconnected, the broker will not be able to deliver those in-transit messages, and subscribers will never be able to receive those messages again. Eliminating the persistent storage step makes messaging on non-persistent topics slightly faster than on persistent topics in some cases, but with the caveat that some of the core benefits of Pulsar are lost.
With non-persistent topics, message data lives only in memory. If a message broker fails or message data can otherwise not be retrieved from memory, your message data may be lost. Use non-persistent topics only if you're certain that your use case requires it and can sustain it.
By default, non-persistent topics are enabled on Pulsar brokers. You can disable them in the broker's configuration. You can manage non-persistent topics using the
pulsar-admin topics command. For more information, see
Non-persistent messaging is usually faster than persistent messaging because brokers don't persist messages and immediately send acks back to the producer as soon as that message is delivered to connected brokers. Producers thus see comparatively low publish latency with non-persistent topic.
Producers and consumers can connect to non-persistent topics in the same way as persistent topics, with the crucial difference that the topic name must start with
non-persistent. All three subscription modes---exclusive, shared, and failover---are supported for non-persistent topics.
Here's an example Java consumer for a non-persistent topic:
PulsarClient client = PulsarClient.builder() .serviceUrl("pulsar://localhost:6650") .build(); String npTopic = "non-persistent://public/default/my-topic"; String subscriptionName = "my-subscription-name"; Consumer<byte> consumer = client.newConsumer() .topic(npTopic) .subscriptionName(subscriptionName) .subscribe();
Here's an example Java producer for the same non-persistent topic:
Producer<byte> producer = client.newProducer() .topic(npTopic) .create();
Message retention and expiry
By default, Pulsar message brokers:
- immediately delete all messages that have been acknowledged by a consumer, and
- 메시지 백로그에 있는 알림이 전송되지 않은 모든 메시지를 영구적으로 저장한다.
Pulsar has two features, however, that enable you to override this default behavior:
- Message retention enables you to store messages that have been acknowledged by a consumer
- Message expiry enables you to set a time to live (TTL) for messages that have not yet been acknowledged
The diagram below illustrates both concepts:
With message retention, shown at the top, a retention policy applied to all topics in a namespace dictates that some messages are durably stored in Pulsar even though they've already been acknowledged. Acknowledged messages that are not covered by the retention policy are deleted. Without a retention policy, all of the acknowledged messages would be deleted.
With message expiry, shown at the bottom, some messages are deleted, even though they haven't been acknowledged, because they've expired according to the TTL applied to the namespace (for example because a TTL of 5 minutes has been applied and the messages haven't been acknowledged but are 10 minutes old).
Message duplication occurs when a message is persisted by Pulsar more than once. Message deduplication is an optional Pulsar feature that prevents unnecessary message duplication by processing each message only once, even if the message is received more than once.
The following diagram illustrates what happens when message deduplication is disabled vs. enabled:
Message deduplication is disabled in the scenario shown at the top. Here, a producer publishes message 1 on a topic; the message reaches a Pulsar broker and is persisted to BookKeeper. The producer then sends message 1 again (in this case due to some retry logic), and the message is received by the broker and stored in BookKeeper again, which means that duplication has occurred.
In the second scenario at the bottom, the producer publishes message 1, which is received by the broker and persisted, as in the first scenario. When the producer attempts to publish the message again, however, the broker knows that it has already seen message 1 and thus does not persist the message.
Message deduplication is handled at the namespace level. For more instructions, see the message deduplication cookbook.
The other available approach to message deduplication is to ensure that each message is only produced once. This approach is typically called producer idempotency. The drawback of this approach is that it defers the work of message deduplication to the application. In Pulsar, this is handled at the broker level, which means that you don't need to modify your Pulsar client code. Instead, you only need to make administrative changes (see the Managing message deduplication cookbook for a guide).
Deduplication and effectively-once semantics
Message deduplication makes Pulsar an ideal messaging system to be used in conjunction with stream processing engines (SPEs) and other systems seeking to provide effectively-once processing semantics. Messaging systems that don't offer automatic message deduplication require the SPE or other system to guarantee deduplication, which means that strict message ordering comes at the cost of burdening the application with the responsibility of deduplication. With Pulsar, strict ordering guarantees come at no application-level cost.
지연된 메시지 전송
Delayed message delivery enables you to consume a message later rather than immediately. In this mechanism, a message is stored in BookKeeper,
DelayedDeliveryTracker maintains the time index(time -> messageId) in memory after published to a broker, and it is delivered to a consumer once the specific delayed time is passed.
Delayed message delivery only works well in Shared subscription mode. In Exclusive and Failover subscription mode, the delayed message is dispatched immediately.
The diagram below illustrates the concept of delayed message delivery:
A broker saves a message without any check. When a consumer consumes a message, if the message is set to delay, then the message is added to
DelayedDeliveryTracker. A subscription checks and gets timeout messages from
Delayed message delivery is enabled by default. You can change it in the broker configuration file as below:
The following is an example of delayed message delivery for a producer in Java:
// message to be delivered at the configured delay interval producer.newMessage().deliverAfter(3L, TimeUnit.Minute).value("Hello Pulsar!").send();