Project Metamorphosis: Unveiling the next-gen event streaming platformLearn More

Monitoring Confluent Platform with Datadog

Earning customer love is a core value at Confluent, and like all relationships, listening makes the love flourish. When it comes to monitoring, we’ve heard you, and we are pleased to announce a new agent-based integration with Datadog for monitoring Confluent Platform. With this new integration, our mutual customers will be able to monitor Confluent Platform alongside the rest of their IT infrastructure components and leverage the flexible capabilities of Datadog’s monitoring solution.

Confluent Control Center continues to be a great option for the fully integrated view of Confluent Platform when you are looking to monitor and manage in a single UI experience. It provides a highly opinionated view of metrics for helping make administrative decisions, and we will continue our investment in this area. However, we also know that our customers need to have an integrated monitoring view across their entire IT stack to feel confident in production, so we’re partnering with Datadog to make that as easy as possible.

How it works

Datadog has had an Apache Kafka® integration for monitoring self-managed broker installations (and associated Apache ZooKeeper™ deployments) with their Datadog Agent for several years. The Confluent Platform integration adds several new capabilities:

  • Monitoring for Kafka Connect, ksqlDB, Confluent Schema Registry, and Confluent REST Proxy
  • Monitoring for Java-based Kafka clients
  • Default Confluent Platform dashboard with the most critical metrics
  • Optionally configured log collection

This integration comes pre-installed starting in Agent v6.19 and v7.19. Users unable to upgrade Agent versions can install the integration on the Agent with a simple command:

sudo -u dd-agent datadog-agent integration install datadog-confluent_platform==1.0.0

Once this integration is installed, start pulling in the JMX metrics from Confluent Platform services by going to the confluent_platform.d directory in the Agent’s conf.d directory, and take a look at the conf.yaml.example file located there. Then modify the instances stanza to add all of the hosts you want to monitor. Restart the Agent, and you will start to see metrics emitted to Datadog for all of the Confluent Platform services that you have configured. The Agent will also start collecting host-level system metrics related to CPU, memory, disk I/O, and network I/O. For more details, see the documentation.

Here’s an example of what the default dashboard that comes with the integration looks like once you have configured the Agent:

Datadog Dashboard

This dashboard gives you a starting point for showcasing the different components. We recommend that you copy this dashboard and then start slicing and dicing the data as you need to for your business.

Control Center 🤝 Datadog: Better together

If you use Datadog as your central monitoring tool, there is complementary value in utilizing Control Center’s management capabilities. Take a look at the Failed Task Count by Worker graph in the dashboard above. How do you know which steps to take next? By using Control Center’s capabilities to drill into Connect issues and fix them in the same UI, you can first see at a high level how many connectors are impacted by the failed tasks:

All Connect Clusters

Then you can get further details on the tasks from the degraded connector to get an idea of which hosts are impacted so you can determine if the problem is with the connector or with a particular set of tasks. In this case, you can see that the connector as a whole has failed, so you know the problem is likely with the configuration of the connector or with a downstream system. You can start debugging the connector configuration directly in the Control Center UI.file-sink9

You can take a similar approach when you see anomalies in the number of ksqlDB queries running. Most clusters will experience a steady rate of querying, but what if you start to see a spike in running queries in the Datadog dashboard? You can use Control Center to take a look at what queries are running to find out if there is a new, unexpected use case that is driving the load.

ksqlDB Queries

If you are using Datadog today, go try the integration! We’re excited to expand our integrations with popular monitoring tools to help our customers get the maximum value across their organization from Confluent Platform. If you have a monitoring tool that you’d like to see integrated in a more seamless way, please send us your feedback in #observability on the Community Slack!

Dustin Cote has spent over four years helping customers wrangle their data and infrastructure using a variety of open source technologies ranging from Apache™ Hadoop® to Kafka. At Confluent, he helps customers stabilize and scale their deployments of Apache Kafka.

Did you like this blog post? Share it now

Subscribe to the Confluent blog

More Articles Like This

Build Real-Time Observability Pipelines with Confluent Cloud and AppDynamics

Many organisations rely on commercial or open source monitoring tools to measure the performance and stability of business-critical applications. AppDynamics, Datadog, and Prometheus are widely used commercial and open source […]

Real-Time Fleet Management Using Confluent Cloud and MongoDB

Most organisations maintain fleets, a collection of vehicles put to use for day-to-day operations. Telcos use a variety of vehicles including cars, vans, and trucks for service, delivery, and maintenance. […]

Introducing Confluent Platform 5.5

We are pleased to announce the release of Confluent Platform 5.5. With this release, Confluent makes event streaming more broadly accessible to developers of all backgrounds, enhancing three categories of […]

Sign Up Now

Start your 3-month trial. Get up to $200 off on each of your first 3 Confluent Cloud monthly bills

新規登録のみ。

上の「新規登録」をクリックすることにより、当社がお客様の個人情報を以下に従い処理することを理解されたものとみなします : プライバシーポリシー

上記の「新規登録」をクリックすることにより、お客様は以下に同意するものとします。 サービス利用規約 Confluent からのマーケティングメールの随時受信にも同意するものとします。また、当社がお客様の個人情報を以下に従い処理することを理解されたものとみなします: プライバシーポリシー

単一の Kafka Broker の場合には永遠に無料
i

商用版の機能を単一の Kafka Broker で無期限で使用できるソフトウェアです。2番目の Broker を追加すると、30日間の商用版試用期間が自動で開始します。この制限を単一の Broker へ戻すことでリセットすることはできません。

デプロイのタイプを選択
手動デプロイ
  • tar
  • zip
  • deb
  • rpm
  • docker
または
自動デプロイ
  • kubernetes
  • ansible

上の「無料ダウンロード」をクリックすることにより、当社がお客様の個人情報をプライバシーポリシーに従い処理することを理解されたものとみなします。 プライバシーポリシー

以下の「ダウンロード」をクリックすることにより、お客様は以下に同意するものとします。 Confluent ライセンス契約 Confluent からのマーケティングメールの随時受信にも同意するものとします。また、お客様の個人データが以下に従い処理することにも同意するものとします: プライバシーポリシー

このウェブサイトでは、ユーザーエクスペリエンスの向上に加え、ウェブサイトのパフォーマンスとトラフィック分析のため、Cookie を使用しています。また、サイトの使用に関する情報をソーシャルメディア、広告、分析のパートナーと共有しています。