Why self-hosted or client-hosted monitoring?
Cloud monitoring services work fine until your client balks at paying per host, retention caps, or raw-data queries that aren't fast enough.
The TIG stack is one of the most popular open-source monitoring combinations:
- Telegraf — a lightweight agent that collects metrics (CPU, RAM, disk, Docker stats, network, and hundreds of other inputs) and pushes them to a time-series database
- InfluxDB — a purpose-built time-series database that stores and queries metrics efficiently
- Grafana — a visualization platform that connects to InfluxDB and turns raw metrics into dashboards, charts, and alerts
Telegraf agents run on each host, push metrics to a central InfluxDB instance, and Grafana queries InfluxDB to render dashboards.
Architecture overview

```
 Host #1         Host #2         Host #3
 Telegraf        Telegraf        Telegraf
     \               |               /
      +--------------+--------------+
                     |
                     v
                InfluxDB 2
          (time-series database)
                     |
                     v
                  Grafana
           (dashboards & alerts)
```
In this guide, InfluxDB and Grafana run on one central server. Telegraf runs on every host you want to monitor — including the central server itself.
Prerequisites
- One Linux server for InfluxDB + Grafana (the monitoring server) — Docker and Docker Compose installed
- One or more hosts where you want Telegraf agents collecting metrics — Docker installed on each
- Basic familiarity with Docker Compose
1. Deploy InfluxDB
Create a project directory on your monitoring server:
```shell
mkdir -p ~/monitoring
cd ~/monitoring
```

Create a `compose.yaml`:
```yaml
services:
  influxdb:
    image: influxdb:2
    restart: unless-stopped
    ports:
      - "8086:8086"
    volumes:
      - influxdb-data:/var/lib/influxdb2
      - influxdb-config:/etc/influxdb2
    environment:
      DOCKER_INFLUXDB_INIT_MODE: setup
      DOCKER_INFLUXDB_INIT_USERNAME: admin
      DOCKER_INFLUXDB_INIT_PASSWORD: <your-influxdb-password>
      DOCKER_INFLUXDB_INIT_ORG: homelab
      DOCKER_INFLUXDB_INIT_BUCKET: telegraf
      DOCKER_INFLUXDB_INIT_RETENTION: 90d
      DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: <your-influxdb-token>

volumes:
  influxdb-data:
  influxdb-config:
```

Start InfluxDB:
```shell
cd ~/monitoring
sudo docker compose up -d
```

Verify InfluxDB is running by visiting http://<your-server-ip>:8086. You should see the InfluxDB UI. Log in with the username and password from your compose environment.
2. Deploy Grafana
Add Grafana to your existing compose.yaml or create a new compose:
```yaml
services:
  influxdb:
    # ... (keep existing InfluxDB service as-is)

  grafana:
    image: grafana/grafana:latest
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      GF_SECURITY_ADMIN_USER: admin
      GF_SECURITY_ADMIN_PASSWORD: <your-grafana-password>
    depends_on:
      - influxdb

volumes:
  influxdb-data:
  influxdb-config:
  grafana-data:
```

Deploy the updated stack:

```shell
cd ~/monitoring
sudo docker compose up -d
```

Open http://<your-server-ip>:3000 and log in with the admin credentials. Grafana will prompt you to change the password on first login.
Connect InfluxDB as a data source
- In Grafana, go to Connections → Data sources → Add data source
- Select InfluxDB
- Set the query language to Flux
- URL: `http://influxdb:8086` (container-to-container networking)
- Under InfluxDB Details:
  - Organization: `homelab`
  - Token: your InfluxDB admin token, or create a new granular API token in InfluxDB
  - Default Bucket: `telegraf`
- Click Save & Test — you should see a green success message
3. Deploy Telegraf agents
Deploy one Telegraf agent on each host you want to monitor.
Create the Telegraf configuration
On each host, create a config directory and the Telegraf config file:
```shell
mkdir -p ~/telegraf
```

Create `~/telegraf/telegraf.conf`:
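A minimal configuration that produces the metrics queried later in this guide (CPU, memory, disk, network, and Docker container stats) might look like the following sketch. The URL, token, organization, and bucket must match the values from your InfluxDB setup in step 1:

```toml
[agent]
  interval = "10s"

# Where to ship metrics: your central InfluxDB 2 instance
[[outputs.influxdb_v2]]
  urls = ["http://<your-server-ip>:8086"]
  token = "<your-influxdb-token>"
  organization = "homelab"
  bucket = "telegraf"

# Host metrics
[[inputs.cpu]]
  percpu = true
  totalcpu = true
[[inputs.mem]]
[[inputs.system]]
[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "overlay", "squashfs"]
[[inputs.net]]

# Docker container stats (needs read access to the Docker socket)
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
```

All of these are standard Telegraf plugins; trim or extend the inputs list to suit each host.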
Deploy Telegraf
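Telegraf runs fine as a container itself. One way to deploy it, assuming your config sits at `~/telegraf/telegraf.conf`, is a small `compose.yaml` in the same directory; the read-only mounts of the host filesystem and Docker socket, plus the `HOST_*` environment variables, let Telegraf report host-level rather than container-level metrics:

```yaml
services:
  telegraf:
    image: telegraf:latest
    restart: unless-stopped
    # report the real machine name, not the container ID
    # (HOSTNAME must be exported in the shell, hence the fallback)
    hostname: "${HOSTNAME:-telegraf-host}"
    volumes:
      - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /:/hostfs:ro
    environment:
      HOST_ETC: /hostfs/etc
      HOST_PROC: /hostfs/proc
      HOST_SYS: /hostfs/sys
      HOST_MOUNT_PREFIX: /hostfs
```

Then bring it up with `cd ~/telegraf && sudo docker compose up -d`.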
Verify data is flowing
After a minute, go back to InfluxDB (http://<your-server-ip>:8086), navigate to Data Explorer, and query the telegraf bucket. You should see metrics like cpu, mem, disk, and docker_container_cpu appearing.
4. Create your first dashboard
Import a community dashboard
The quickest path to a working dashboard:
- In Grafana, go to Dashboards → New → Import
- Enter dashboard ID 928 (the Telegraf System Dashboard)
- Select your InfluxDB data source
- Click Import
This gets you CPU, memory, disk, and network panels out of the box.
Build a custom panel
To create your own visualization:
- Go to Dashboards → New → New Dashboard → Add visualization
- Select your InfluxDB data source
- Enter a Flux query. For example, CPU usage per host:

```flux
from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> filter(fn: (r) => r._field == "usage_idle")
  |> filter(fn: (r) => r.cpu == "cpu-total")
  |> map(fn: (r) => ({r with _value: 100.0 - r._value}))
  |> aggregateWindow(every: 1m, fn: mean)
```

- Set the panel title to "CPU Usage (%)" and choose a Time series visualization
- Click Apply
Alerting
Grafana has built-in alerting — set thresholds (CPU > 90% for 5 minutes, disk < 10% free) and route notifications to email, Slack, Discord, or webhooks under Alerting → Contact points.
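For example, the CPU alert could be backed by a Flux query like this sketch (same measurement and fields as the dashboard query above; the 90% / 5-minute numbers are thresholds you then configure on the Grafana alert rule, not in the query):

```flux
from(bucket: "telegraf")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_idle")
  |> filter(fn: (r) => r.cpu == "cpu-total")
  // usage_idle is "percent idle"; invert it to get percent busy
  |> map(fn: (r) => ({r with _value: 100.0 - r._value}))
  |> aggregateWindow(every: 1m, fn: mean)
```

In the alert rule, attach a threshold condition (is above 90) to this query and set the pending period to 5 minutes.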
5. Secure remote access (optional)
If you need to access Grafana dashboards remotely, don’t expose port 3000 directly to the internet.
Recommended options:
- Reverse proxy with HTTPS — put Grafana behind Caddy or nginx with a TLS certificate
- Cloudflare Tunnel — publish Grafana without opening inbound ports (see our Hudu self-hosting guide for the tunnel setup pattern; the same approach works here)
- VPN / Tailscale — access your monitoring server over a private network
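If you go the reverse-proxy route, the Caddy side is only a few lines. `grafana.example.com` is a placeholder for your own domain; Caddy obtains and renews the TLS certificate automatically:

```
grafana.example.com {
    # forward HTTPS traffic to the Grafana container
    reverse_proxy localhost:3000
}
```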
Bonus: solar inverter monitoring with Grott Advanced
If you have a Growatt solar inverter, you can feed its production data into the same InfluxDB + Grafana stack using Grott.
Grott intercepts data from your Growatt inverter’s monitoring dongle (which normally phones home to Growatt’s cloud) and forwards it locally — to InfluxDB, MQTT, or Home Assistant.
The basic setup:
- Run Grott as a Docker container on the same network as InfluxDB
- Configure your Growatt datalogger to point to Grott’s IP instead of Growatt’s servers (or use Grott in proxy mode to forward data to both)
- Configure Grott's InfluxDB output to push to your `telegraf` bucket (or a dedicated `solar` bucket)
- Build a Grafana dashboard showing solar production, consumption, and grid export
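Once Grott is writing to InfluxDB, the Grafana side works exactly like the server metrics. The measurement and field names below are assumptions for illustration; check what Grott actually writes to your bucket in Data Explorer before copying this:

```flux
from(bucket: "solar")
  |> range(start: -24h)
  // measurement and field names are assumptions; verify in Data Explorer
  |> filter(fn: (r) => r._measurement == "growatt")
  |> filter(fn: (r) => r._field == "pvpowerout")
  |> aggregateWindow(every: 5m, fn: mean)
```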
This is a niche setup (although we have had some client requests for it), but if you run a Growatt system it's a great way to unify your monitoring: server infrastructure and energy production in one place. Search for "Growatt Grafana dashboard" in the Grafana dashboard library for community examples.
What’s next
- More Telegraf inputs — plugins for PostgreSQL, Redis, NGINX, SNMP, and hundreds more
- Custom dashboards — Docker container stats, per-application metrics, network monitoring
- Alerting rules — get notified before small problems become outages
- Longer retention — adjust InfluxDB retention policies based on your storage
If you’re managing Docker across multiple servers, Komodo can help deploy and manage Telegraf containers at scale — and automate the whole workflow with scheduled Procedures:
Managing Docker Across Multiple Servers with Komodo
Deploy Komodo Core with Docker Compose and install Periphery agents on remote servers to manage Docker containers, stacks, and builds from a single dashboard.
Automating Docker with Komodo — Builds, Syncs, and Procedures
Use Komodo's Resource Syncs for GitOps, Procedures for automated workflows, Builds for CI/CD pipelines, and the CLI for headless Docker management.
Secure remote access to dashboards via Cloudflare Tunnel:
Self-Hosting Hudu with Docker & Cloudflare Tunnel
A practical guide to installing Hudu with Docker Compose, maintaining it over time, and securely publishing it via Cloudflare Tunnel.
Monitoring also means watching your backup jobs. If Veeam backups are completing with warnings instead of success, this guide walks through a real troubleshooting session:
Fixing "SQL VSS Writer Is Missing" in Veeam Backup
A trailing space in a SQL Server database name can silently break VSS writer registration for every database. Here's how to find it, fix it, and prevent it.