
290 Reasons to Upgrade to Apache Kafka 0.9.0.0

When we released Apache Kafka 0.9.0.0, we talked about all of the big new features we added: the new consumer, Kafka Connect, security features, and much more. What we didn’t talk about was something even more important, something that we had spent even more of our time on — correctness, bug fixes, and operability. These are always more important than new features.

According to Apache JIRA, 290 bugs were fixed for the 0.9.0.0 release, and some of them are quite important. Even more exciting is the fact that while working on 0.9.0.0, we added a brand new distributed testing framework and over 100 new test scenarios that use this framework. We are now testing replication, node failures, controller failures, MirrorMaker, rolling upgrades, security scenarios, Kafka Connect failures, and much more. This not only allowed us to catch many issues for this release, but also gives us confidence that we will maintain the high quality that Kafka is known for in future releases.

Here are some of the more noteworthy bugs we caught and fixed in Apache Kafka 0.9.0.0:

    1. Replication is the backbone of Kafka. When replication goes wrong, bad things happen. In Kafka 0.9.0.0 we fixed a variety of replication issues. For example, we found and fixed an obscure race condition: if a machine ever gets slow enough that context switching between threads is slower than a remote call, a broker can conclude that it has fallen out of sync and, as a result, delete all its data (KAFKA-2477). We also fixed the default min.insync.replicas configuration not working as expected (KAFKA-2114) and replication lag being impossible to configure (KAFKA-1546).
    2. MirrorMaker is Kafka’s cross-cluster replication tool. In the 0.8 release line, MirrorMaker buffered messages between the consumers reading from the source cluster and the producers writing to the destination. Consumed offsets were stored by a separate thread (marking messages as “done”). When the MirrorMaker process crashed, messages in the buffer were in some cases considered “done” even though they were never written to the target cluster, and those messages were lost. Kafka 0.9.0.0 includes a newly refactored MirrorMaker with a simpler design that prevents message loss by making sure message offsets are stored only when we are certain the messages were written safely to the target cluster (KAFKA-1997).
    3. Kafka application logs can be too chatty at INFO level but too quiet at WARN level. This makes it difficult to troubleshoot issues and sometimes causes false alarms. In Kafka 0.9.0.0 we cleaned up the logs, making them more manageable (see KAFKA-2504, KAFKA-2288, KAFKA-2251, KAFKA-2522, KAFKA-1461).
    4. Log Compaction is one of the most exciting Kafka features, enabling a variety of new use cases. Unfortunately, it also had some nasty bugs, so many users opted out even for use cases where compaction was a natural fit. For 0.9.0.0 we fixed a large number of log compaction bugs and limitations. The biggest improvement is the ability to compact topics with compressed messages (KAFKA-1374), but there were many additional improvements (KAFKA-2235, KAFKA-2163, KAFKA-2118, KAFKA-2660, KAFKA-1755).
    5. Connection leaks can be an issue in shared environments, where applications connecting to Kafka can’t be relied on to properly close their connections. Kafka 0.9.0.0 includes two patches that make the server much more efficient at detecting and cleaning up dead connections (KAFKA-1282, KAFKA-2096).
    6. Kafka broker metadata includes a list of leaders and replicas for each partition. This metadata is stored in ZooKeeper and is also cached in the memory of each broker. The 0.9.0.0 release includes multiple bug fixes for cases where the metadata cache falls out of sync (KAFKA-1867, KAFKA-1367, KAFKA-2722, KAFKA-972).
    7. The Request Purgatory, where client requests wait until they can be responded to, underwent a complete re-write into a far more efficient data structure in 0.9.0.0. In the process we also fixed a bug where the purgatory was growing out of control (KAFKA-2147).
    8. Producer timeouts for the new producer were not strictly enforced in 0.8.2, so some operations would block for much longer than the specified timeout. In 0.9.0.0 the tracking of timeouts was improved, and timeouts are now consistent and work as expected (KAFKA-2120).
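
To make the MirrorMaker fix in item 2 a little more concrete, here is a minimal Python sketch of the general idea: advance the stored offset only after the write to the target has been acknowledged. This is a simulation, not MirrorMaker's actual code. `InMemoryTarget`, `mirror`, and `fail_at_offset` are hypothetical names invented for this illustration, and the "clusters" are plain Python lists.

```python
class InMemoryTarget:
    """Hypothetical stand-in for the destination cluster (not a Kafka API)."""

    def __init__(self, fail_at_offset=None):
        self.written = []
        self.fail_at_offset = fail_at_offset

    def send(self, offset, value):
        # Simulate a crash or failed write at a chosen source offset.
        if offset == self.fail_at_offset:
            raise RuntimeError(f"write failed at offset {offset}")
        self.written.append((offset, value))


def mirror(source, target, committed=0):
    """Copy source[committed:] to target and return the new committed offset.

    The offset advances only after send() returns successfully, so a failure
    stops the run without marking the failed message as done.
    """
    for offset in range(committed, len(source)):
        try:
            target.send(offset, source[offset])
        except RuntimeError:
            break  # stop without committing the failed offset
        committed = offset + 1  # commit strictly after the acknowledged write
    return committed
```

If the target fails at offset 2, `mirror` returns 2, and a later call that resumes from that offset picks up the failed message instead of skipping it. The design in the 0.8 line, as described above, could mark buffered messages as done before they were written, which is exactly the ordering this sketch avoids.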

I expect some of these issues may ring an alarm bell, maybe even a loud and annoying bell, in which case the reason to upgrade to Kafka 0.9.0.0 should be clear. Even if you have not come across any of these issues yet, you don’t know when you will. It is much better to have time to plan for an upgrade, rather than have to upgrade under pressure because your production system just hit a bug that was fixed 8 months ago. 

To make things easier, Apache Kafka can be upgraded with no downtime by using rolling upgrades. Check our documentation to learn the exact process and start planning your upgrade.
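
As a rough illustration of what a rolling upgrade looks like, the sketch below generates a two-phase plan: first upgrade each broker's code while pinning `inter.broker.protocol.version` to the old version, then bump the protocol version and restart each broker again. This reflects our reading of the 0.9.0.0 upgrade notes, so treat the documentation as the authority. The function name `rolling_upgrade_plan` and the broker names are hypothetical.

```python
def rolling_upgrade_plan(brokers, current_protocol="0.8.2.X", new_protocol="0.9.0.0"):
    """Return an ordered list of per-broker steps for a two-phase rolling upgrade.

    This only builds a human-readable plan; it does not touch any broker.
    """
    plan = []
    # Phase 1: one broker at a time, install the new code while still
    # speaking the old inter-broker protocol.
    for broker in brokers:
        plan.append(
            f"{broker}: set inter.broker.protocol.version={current_protocol}, "
            "install new code, restart"
        )
    # Phase 2: once every broker runs the new code, switch protocol versions,
    # again one broker at a time.
    for broker in brokers:
        plan.append(
            f"{broker}: set inter.broker.protocol.version={new_protocol}, restart"
        )
    return plan
```

Calling `rolling_upgrade_plan(["broker-1", "broker-2"])` yields four steps: two code upgrades followed by two protocol bumps. Because only one broker is restarted at a time, the cluster stays available throughout.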
