

At Aiven, we provide open source data platforms as managed services on multiple cloud platforms. That comes with a 99.99% uptime service-level agreement (SLA), which is a tall order. We need to continuously monitor our systems, so if things do go wrong, we have all the information we need to fix problems quickly.

M3 is the foundation of our monitoring platform, storing all our metrics, and we chose it because it was the best heavy-duty option. Before M3, we were running a single-node InfluxDB as our backend time series database (TSDB), but the increasing number of nodes and metrics we collected began to expose the limits of that setup, forcing us to find something new that could scale with our needs. Here’s how we decided to migrate from InfluxDB to M3, what we needed to achieve with M3, and how the migration went. Let’s get started.

Aiven’s platform generates a huge amount of data: we have tens of thousands of nodes running, and we collect thousands of metrics for each node. Metrics are sent to Kafka using Telegraf. Once the data is in Kafka, we then distribute it to various metrics backends, internal or external, again using Telegraf. The final piece of our observability platform is Grafana, which enables us to both visualize metric trends and set up alerts on top of the time series database.

InfluxDB used to have a reasonably open license, and it was deemed acceptable (and it was quite popular and familiar) back when Aiven started.
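
To make the shape of the metrics pipeline described above more concrete, here is a minimal sketch in Python. It is not Aiven's implementation and uses neither Telegraf nor a Kafka client: `Topic` is an in-memory stand-in for a Kafka topic, and `Metric`, `Backend`, `distribute`, the backend names, and the sample metric are all hypothetical names chosen for illustration. The only idea it tries to show is that producers write to one shared log, and each backend reads that log independently at its own offset.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metric:
    node: str
    name: str
    value: float


@dataclass
class Topic:
    """In-memory stand-in for a Kafka topic: an append-only log."""
    log: list = field(default_factory=list)

    def publish(self, metric):
        self.log.append(metric)


@dataclass
class Backend:
    """Hypothetical metrics backend that only records what it receives."""
    name: str
    received: list = field(default_factory=list)

    def write(self, metric):
        self.received.append(metric)


def distribute(topic, backends, offsets):
    """Deliver each unread log entry to every backend, tracking one offset per backend."""
    for backend in backends:
        start = offsets[backend.name]
        for metric in topic.log[start:]:
            backend.write(metric)
        offsets[backend.name] = len(topic.log)


# Collection side: each node publishes its metrics to the shared topic.
topic = Topic()
for node in ("node-1", "node-2"):
    topic.publish(Metric(node, "cpu_usage", 0.5))

# Distribution side: every backend gets its own full copy of the stream.
backends = [Backend("internal-tsdb"), Backend("external-service")]
offsets = defaultdict(int)
distribute(topic, backends, offsets)
```

Because each backend keeps its own offset, adding or replacing a backend in this sketch does not affect how nodes publish, which is the kind of decoupling a log placed between collection and storage is meant to provide.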
