Topic load shedding causes Producers (including built in one in connection object) to fail until you manually close them after error #378

@chamons

Description

Sample Project

Here are two unit tests that show the issue at hand:

https://gist.github.com/chamons/f78672c4bb659aeb1e8499a6925e5f3b

Replace broker-1 with your local pulsar cluster or use this docker compose.

Details

After a topic's load moves to another broker due to any of the following:

  • Pulsar broker restarts
  • Load shedding due to high CPU
  • Calling admin/v2/persistent/public/default/{topic}/unload or the pulsar-admin command

Sending notifications via a MultiTopicProducer to that topic will now fail with:

Producer(Connection(Io(Custom { kind: TimedOut, error: " connection c0010f8b-9ae8-4b31-abba-91d83e283695 timedout sending message to the Pulsar server" }))

However, if you close the producer and then try again, it works just fine.

This is rather non-obvious, and it makes the simplified Pulsar::send API unusable in these cases, as there is no way to manually close its built-in producer. This means that once you write to a topic using it, if that topic is load shed, Pulsar::send on that connection cannot write to it again.

We work around this in our code base by only using MultiTopicProducer and closing it after a subset of errors.
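
The shape of that workaround can be sketched as follows. This is a minimal, synchronous illustration and not the pulsar-rs API: `TopicSender`, `SendError`, `send_with_reset`, and `FakeSender` are hypothetical names introduced here, and `close_producer` stands in for whatever close or drop mechanism the real client offers. The real client is async and its error type is richer; only the "close the cached producer on a timeout, then retry once" logic is the point.

```rust
use std::collections::HashSet;

/// Hypothetical error type standing in for the client's error enum.
#[derive(Debug, Clone, PartialEq)]
enum SendError {
    /// Mirrors the `TimedOut` error seen after the topic is unloaded.
    TimedOut,
    /// Any other failure; these are not retried.
    Other(String),
}

/// Hypothetical abstraction over a producer that caches one producer per topic.
trait TopicSender {
    fn send(&mut self, topic: &str, payload: &[u8]) -> Result<(), SendError>;
    /// Drops the cached producer for `topic` so the next send recreates it.
    fn close_producer(&mut self, topic: &str);
}

/// Sends once; on a timeout, closes the cached producer and retries once.
fn send_with_reset<S: TopicSender>(
    sender: &mut S,
    topic: &str,
    payload: &[u8],
) -> Result<(), SendError> {
    match sender.send(topic, payload) {
        Err(SendError::TimedOut) => {
            sender.close_producer(topic);
            sender.send(topic, payload)
        }
        other => other,
    }
}

/// In-memory fake that imitates the behaviour described in this issue:
/// after an unload, sends to that topic time out until the producer is closed.
#[derive(Default)]
struct FakeSender {
    stale: HashSet<String>,
    delivered: Vec<(String, Vec<u8>)>,
}

impl FakeSender {
    /// Simulates the topic moving to another broker.
    fn unload(&mut self, topic: &str) {
        self.stale.insert(topic.to_string());
    }
}

impl TopicSender for FakeSender {
    fn send(&mut self, topic: &str, payload: &[u8]) -> Result<(), SendError> {
        if self.stale.contains(topic) {
            return Err(SendError::TimedOut);
        }
        self.delivered.push((topic.to_string(), payload.to_vec()));
        Ok(())
    }

    fn close_producer(&mut self, topic: &str) {
        self.stale.remove(topic);
    }
}
```

With the fake, a plain `send` keeps failing after `unload`, while `send_with_reset` succeeds on the retry. The same wrapper cannot be applied to Pulsar::send, since its built-in producer cannot be closed by the caller.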
