I wanted nice graphs for the various metrics I collect in my homelab and from my Home Assistant server.
Here’s how I installed Prometheus and Grafana in my homelab to get them.
Prerequisites
- Server running docker, podman, or containerd
Install Instructions
For the sake of brevity, I’m not going to go into detail on how to make it accessible via https. If you do want to secure these containers with SSL, I’ve documented how to use Nginx Proxy Manager as an SSL proxy here.
Prometheus
Here’s the docker-compose.yaml I’m using to start Prometheus. We’re specifying a local directory to map in as /prometheus so we don’t lose all our history every time we restart the container. I prefer to use a local directory instead of a named volume to make it easier to back up data from my prometheus server to b2.
version: '3'
services:
prometheus:
image: prom/prometheus:latest
container_name: prometheus
# Uncomment if you want prometheus accessible outside the docker network
# created by docker-compose
# ports:
# - "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./prometheus/data:/prometheus
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
restart: unless-stopped
command:
- "--config.file=/etc/prometheus/prometheus.yml"
- '--web.enable-lifecycle'
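One gotcha with the bind-mounted data directory: the prom/prometheus image runs as an unprivileged user (nobody, uid 65534 in current images), so the host directory has to be writable by that uid or the container will exit with a permissions error. A minimal prep sketch, assuming that default uid:

mkdir -p ./prometheus/data
# prom/prometheus runs as "nobody" (uid 65534 in current images); adjust if your image differs
sudo chown -R 65534:65534 ./prometheus/data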
Don’t start prometheus yet - we want to get metric collection running on some servers first so it has something to ingest.
Collect metrics on your servers
If you don’t put data into Prometheus, Grafana will have nothing to graph.
All of my homelab machines run docker to provide services, so in addition to node_exporter, I run a cadvisor container to collect container data.
Here’s the docker-compose.yaml I run on all my homelab instances. I mount several of the host machine’s interesting directory trees as read-only to allow node_exporter and cadvisor to create useful metrics for prometheus & grafana.
version: '3'
services:
node_exporter:
image: quay.io/prometheus/node-exporter:latest
container_name: node_exporter
command:
- '--path.rootfs=/host'
- '--collector.textfile.directory=/promfiles'
pid: host
ports:
- 9100:9100
restart: unless-stopped
volumes:
- '/:/host:ro,rslave'
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
- /etc/node_exporter/promfiles:/promfiles:ro
- /run/udev/data:/run/udev/data:ro
- /sys:/sys:ro
cadvisor:
image: zcube/cadvisor:v0.45.0
container_name: cadvisor
ports:
- "9119:8080"
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
- /dev/disk/:/dev/disk:ro
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
devices:
- /dev/kmsg
restart: unless-stopped
I like to keep that in a separate docker-compose.yaml file from the other ones on a given server; it makes it easier to keep metrics collection consistent across my servers.
Put that file into a prometheus_collector directory, then run it with docker-compose up -d.
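The --collector.textfile.directory flag in the compose file above points node_exporter at /etc/node_exporter/promfiles on the host. Any file ending in .prom that you drop into that directory gets exposed alongside the built-in metrics, which is handy for cron jobs or backup scripts. A hypothetical example of such a file (nothing in the compose file creates this for you):

# /etc/node_exporter/promfiles/backup.prom
# HELP homelab_backup_last_success_timestamp_seconds Unix time of the last successful backup
# TYPE homelab_backup_last_success_timestamp_seconds gauge
homelab_backup_last_success_timestamp_seconds 1698813161

Write these files atomically (write to a temp file, then mv it into place) so node_exporter never scrapes a half-written file.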
You can confirm that the exporters are running correctly with
# node exporter
curl http://localhost:9100/metrics
# cadvisor
curl http://localhost:9119/metrics
You should see a lot of text that looks similar to this snippet:
# A lot of lines snipped
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 17
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.38641408e+08
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.69881316144e+09
# More snipped lines
Ingest server metrics into prometheus
Prometheus stores its configuration in prometheus.yml. This example configuration assumes you’re running node_exporter and cadvisor on the same server you’re running prometheus on.
global:
scrape_interval: 15s # Scrape targets every 15 seconds unless otherwise specified
query_log_file: /prometheus/data/query.log
# Attach these labels to any time series or alerts when communicating with
# external systems (federation, remote storage, Alertmanager).
# external_labels:
# monitor: 'codelab-monitor'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
# Override the global default and scrape targets from this job every 5 seconds.
scrape_interval: 5s
static_configs:
- targets: ['localhost:9090']
# Example job for node_exporter
- job_name: 'node_exporter'
scrape_interval: 10s
static_configs:
- targets:
- 'localhost:9100' # Assumes we're running node_exporter on our prometheus server
- 'foo.example.com:9100'
- 'bar.example.com:9100'
# Example job for cadvisor
- job_name: 'cadvisor'
scrape_interval: 10s
static_configs:
- targets:
- 'localhost:9119' # Assumes we're running cadvisor on our prometheus server
- 'foo.example.com:9119'
- 'bar.example.com:9119'
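Because we started prometheus with --web.enable-lifecycle, you can reload this file without restarting the container. The commands below assume port 9090 is reachable from wherever you run them (temporarily uncomment the ports stanza, or run them from a container on the same docker network):

# Ask prometheus to re-read prometheus.yml
curl -X POST http://localhost:9090/-/reload
# List every scrape target and its health ("up" or "down")
curl http://localhost:9090/api/v1/targets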
Grafana
Now that prometheus is working and collecting data, time to visualize the metrics with Grafana.
Add the following service stanza to your docker-compose.yaml
grafana:
image: grafana/grafana:latest
container_name: grafana
ports:
- "3000:3000"
volumes:
- ./grafana/data:/var/lib/grafana
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
restart: unless-stopped
environment:
- GF_INSTALL_PLUGINS=grafana-clock-panel,natel-discrete-panel,grafana-piechart-panel
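As with prometheus, the bind-mounted ./grafana/data directory has to be writable by the user inside the container; the grafana/grafana image runs as uid 472 by default. A quick prep sketch, assuming that default:

mkdir -p ./grafana/data
# grafana/grafana runs as uid 472 by default; adjust if your image differs
sudo chown -R 472:472 ./grafana/data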
Configure Grafana
You can now log into grafana at http://hostname:3000 with username admin and password admin. You’ll be prompted to set a new admin password.
Now, connect grafana to your prometheus. Click Connections -> Add new connection on the left side of the window, search for prometheus, and add the Prometheus data source (not Prometheus Alertmanager!).
Set the connection to http://prometheus:9090 - since you’re running prometheus in the same docker network created by docker-compose, you can refer to it by its container_name field. That’s also why we didn’t specify ports in the prometheus docker-compose stanza - we don’t want it accessed by anything but grafana, and any containers running in the same docker network can access any ports on other containers in that network.
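If you’d rather not click through the UI, Grafana can also pick up the data source from a provisioning file at startup. A minimal sketch, assuming you add a volume like ./grafana/provisioning:/etc/grafana/provisioning to the grafana service:

# ./grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true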
Add Dashboards
There are a multitude of dashboards to download at grafana.com/grafana/dashboards.
Let’s start by adding the node-exporter-full dashboard. Go to the page and copy its dashboard ID (1860 as of this post). Then go back to your local grafana, click the blue New button on the right-hand side, and select Import.
Enter the dashboard ID and click Load. In the dialog that appears, click the Prometheus box, select the default Prometheus, and click Import.
You’ll see a nice new dashboard.
Repeat for the docker dashboard.
Grafana will automatically populate the host menu with all the hosts your prometheus is scraping.
There are many, many dashboards you can experiment with, and of course you can create your own.
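If you do build your own panels, the queries are plain PromQL against the node_exporter and cadvisor metrics. A couple of hedged starting points (the metric names are the exporter defaults, so double-check them against your own /metrics output):

# CPU usage (%) per host, from node_exporter
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Resident memory per container, from cadvisor
sum by (name) (container_memory_rss{name!=""})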
Finally
Now you have a working prometheus and grafana stack you can use to monitor the machines in your homelab.
Combined YAML
Here’s the combined docker-compose.yaml to start prometheus and grafana.
version: '3'
services:
prometheus:
image: prom/prometheus:latest
container_name: prometheus
# Uncomment only if you want access without the SSL proxy
# ports:
# - "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- ./prometheus/data:/prometheus
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
restart: unless-stopped
command:
- "--config.file=/etc/prometheus/prometheus.yml"
- '--web.enable-lifecycle'
grafana:
image: grafana/grafana:latest
container_name: grafana
# Uncomment only if you want access without the SSL proxy
# ports:
# - "3000:3000"
volumes:
- ./grafana/data:/var/lib/grafana
- /etc/hostname:/etc/hostname:ro
- /etc/localtime:/etc/localtime:ro
- /etc/machine-id:/etc/machine-id:ro
- /etc/timezone:/etc/timezone:ro
restart: unless-stopped
environment:
- GF_INSTALL_PLUGINS=grafana-clock-panel,natel-discrete-panel,grafana-piechart-panel
# If you aren't using SSL you can remove this section
networks:
default:
external:
name: ssl_proxy_network
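For reference, this is the directory layout that combined file expects (the top-level directory name is up to you); bring everything up from that directory:

# .
# ├── docker-compose.yaml
# ├── prometheus.yml
# ├── prometheus/data/   <- prometheus TSDB (bind mount)
# └── grafana/data/      <- grafana database and plugins (bind mount)
docker-compose up -d
docker-compose logs -f prometheus grafana   # watch startup for errors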
PS - Enabling SSL
I go into a lot more detail on using nginx-proxy-manager as an SSL proxy here, but the TL;DR is:
- You need to own a domain
- If you haven’t already done so, create an nginx ssl proxy using the blog post instructions
- Create new CNAME entries in your domain for grafana.yourdomain.com and prometheus.yourdomain.com that point at the docker server you’re hosting prometheus and grafana on
- Add the ssl_proxy network to the end of your docker-compose.yaml file:
networks:
default:
external:
name: ssl_proxy_network
- Delete or comment out all the port stanzas from docker-compose.yaml. This will prevent the containers from being accessed except through the SSL proxy
- Add a proxy host to your nginx-proxy-manager for grafana.yourdomain.com and set its destination to http://grafana:3000
- Add a proxy host to nginx-proxy-manager for prometheus.yourdomain.com and set its destination to http://prometheus:9090. Prometheus doesn’t have authentication built in, so add a basic auth to it in nginx-proxy-manager (read the other blog post for details)