Bufstream is a fully self-hosted drop-in replacement for Apache Kafka® that writes data to S3-compatible object storage. It's 100% compatible with the Kafka protocol, including support for exactly-once semantics (EOS) and transactions. Bufstream is 8x cheaper to operate than Apache Kafka, and a single cluster can elastically scale to hundreds of GB/s of throughput. It's the universal Kafka replacement for the modern age.
Additionally, for teams sending Protobuf messages across their Kafka topics, Bufstream is a perfect partner: it can enforce data quality and governance requirements on the broker with Protovalidate, and it can persist records directly as Apache Iceberg™ tables, reducing time-to-insight in popular data lakehouse products such as Snowflake or ClickHouse.
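To make the Protovalidate idea concrete, here's a minimal sketch of what broker-enforced constraints look like in a schema. The message and field names are hypothetical; the quickstart's real schema lives in this repository's proto files and will differ.

```protobuf
syntax = "proto3";

package demo.v1;

import "buf/validate/validate.proto";

// Hypothetical shopping cart item, illustrating Protovalidate annotations.
message CartItem {
  // The broker rejects records with a non-positive quantity.
  int32 quantity = 1 [(buf.validate.field).int32.gt = 0];
  // Prices are in cents and must be non-negative.
  int64 unit_price_cents = 2 [(buf.validate.field).int64.gte = 0];
}
```

With rules like these attached to fields, Bufstream can validate every record as it's produced instead of trusting each client to do so.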
This repository contains code used in Bufstream's quickstart, which steps through and explains Bufstream's capabilities.
If you're here for a quick test drive or to demonstrate Bufstream:
- Clone this repository and navigate to its directory.
- Keep reading to learn how to use the included `make` targets.
## Semantic validation with Go installed
- Use `make bufstream-run` to download and run Bufstream's single binary.
- In a second terminal, run `make produce-run` to produce sample e-commerce shopping cart messages.
- In a third terminal, run `make consume-run` to start consuming messages. About 1% of the messages are semantically invalid and cause the consumer to log errors.
- Stop the producer and run `make use-reject-mode`. Restart the producer: it now logs errors when it tries to produce invalid messages, and the consumer soon stops logging errors about invalid carts.
- Stop the producer and run `make use-dlq-mode`. Restart the producer: there are no more errors. All invalid messages are sent to a DLQ topic instead.
- In a fourth terminal, run `make consume-dlq-run`. It reads the `orders.dlq` topic and shows that the original message can be reconstructed and examined.
- Stop all processes before continuing to Iceberg.
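The "semantically invalid" carts above fail rules about the data's meaning, not its shape. As a rough illustration of the kind of check the broker can enforce, here's a self-contained Go sketch; the `Cart` struct and its rule are assumptions for illustration, not this repository's actual schema or validation logic.

```go
package main

import "fmt"

// Cart is a hypothetical stand-in for the demo's shopping cart message.
type Cart struct {
	Quantity   int
	UnitCents  int64
	TotalCents int64
}

// validate mirrors the kind of semantic rule Protovalidate can enforce
// on the broker: positive quantity, and a total consistent with it.
func validate(c Cart) error {
	if c.Quantity <= 0 {
		return fmt.Errorf("quantity must be positive, got %d", c.Quantity)
	}
	if want := int64(c.Quantity) * c.UnitCents; c.TotalCents != want {
		return fmt.Errorf("total %d does not match quantity*unit %d", c.TotalCents, want)
	}
	return nil
}

func main() {
	good := Cart{Quantity: 2, UnitCents: 500, TotalCents: 1000}
	bad := Cart{Quantity: 2, UnitCents: 500, TotalCents: 999}
	fmt.Println(validate(good) == nil) // valid cart passes
	fmt.Println(validate(bad) == nil)  // mismatched total fails
}
```

In reject mode the broker refuses such records at produce time; in DLQ mode it routes them to `orders.dlq` instead.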
## Semantic validation without Go installed
If you don't have Go installed, you can still run this demonstration via a Docker Compose project.
- Use `make docker-compose-run` to start the Compose project. The producer immediately begins producing sample e-commerce shopping cart messages. About 1% of the messages are semantically invalid and cause the consumer to log errors.
- Open a second terminal and run `make docker-compose-use-reject-mode`. Back in the first terminal, invalid messages are now rejected: the producer logs errors and the consumer stops receiving any invalid messages.
- In the second terminal, run `make docker-compose-use-dlq-mode`. The producer stops receiving errors, and the DLQ consumer begins logging invalid messages sent to the `orders.dlq` topic.
- Stop the Compose project and run `make docker-compose-clean` before continuing to Iceberg.
## Iceberg

The Iceberg demo uses the Docker Compose project defined in `./iceberg/docker-compose.yaml` to provide services such as an Iceberg catalog and Spark.
- Run `make iceberg-run` to start the Iceberg project. The Spark image is a large download, and there are multiple services to start. When you see `create-orders-topic-1 exited with code 0`, continue.
- If you have Go installed, run `make iceberg-produce-run` in a new terminal to create sample data. If you don't have Go installed, use `make iceberg-produce-run-docker`. Once you've produced about 1,000 records, stop the process.
- Run `make iceberg-table` to manually run the Bufstream task that updates Iceberg catalogs.
- Open http://localhost:8888/notebooks/notebooks/bufstream-quickstart.ipynb, click within the `SELECT` query's cell, and use shift-return or the ▶︎ icon to build a revenue report based on the `orders` topic.
- This example is durable: the Compose project can be stopped and started without losing data. To remove all data and images, stop the Compose project and run `make iceberg-clean`.
To learn more about Bufstream, check out the launch blog post, dig into the benchmark and cost analysis, or join us in the Buf Slack!