Saturday, 14 June 2025

Recording Rules in Prometheus

What are Recording Rules in Prometheus?

Recording Rules in Prometheus help you precompute frequently used or complex queries and store the results as new time series. This makes your dashboards and alerts much faster because Prometheus doesn’t need to recalculate everything every time.

 

You can think of Recording Rules as materialized views in databases such as BigQuery: they store the result of a query so it can be reused quickly.

 

Types of Rules in Prometheus

Prometheus supports two types of rules:

 

·      Recording Rules: Precompute and save the results of a PromQL query.

·      Alerting Rules: Trigger alerts based on PromQL expressions.

 

Both types are defined in YAML and evaluated at regular intervals.

 

How to Create and Use Recording Rules?

·      Create a YAML file, for example recording_rules.yaml.

·      In this file, define the PromQL expressions you want to precompute. Give each rule a unique name so you can refer to it later.

·      Add the rule file to your Prometheus config under the rule_files section.

·      Restart or reload Prometheus so it can pick up the new rules.

 

Example Rule

Here’s a simple rule that calculates, for each instance, the average per-second idle CPU rate across all of its CPUs over the last 5 minutes:

groups:
  - name: cpu_idle_rules
    rules:
      - record: idlemode:node_cpu_seconds_total:avg_rate5m
        expr: avg without(cpu) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

This will create a new time series with the name idlemode:node_cpu_seconds_total:avg_rate5m that you can query just like any other metric.
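
Because the recorded series behaves like any other metric, you can also use it inside an alerting rule. The sketch below is only an illustration: the group name, the alert name, the 0.1 threshold and the 10m duration are assumptions, not values from this post.

```yaml
groups:
  - name: cpu_idle_alerts          # hypothetical group name
    rules:
      - alert: LowCpuIdle          # hypothetical alert name
        # Reuses the recorded series instead of repeating the rate() query.
        # 0.1 (less than 10% idle) is an illustrative threshold.
        expr: idlemode:node_cpu_seconds_total:avg_rate5m < 0.1
        for: 10m                   # condition must hold for 10 minutes before firing
        labels:
          severity: warning
        annotations:
          summary: "Average idle CPU on {{ $labels.instance }} is below 10%"
```

The alert expression stays short and cheap to evaluate, since the expensive part was already computed by the recording rule.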

 

Naming Convention for Recording Rules

Use this format for rule names: <level>:<metric>:<operations>

 

·      level: Aggregation level, e.g., job, idlemode

·      metric: The original metric name, e.g., node_cpu_seconds_total

·      operations: What operations were applied, newest operation first, e.g., avg_rate5m

 

Example: idlemode:node_cpu_seconds_total:avg_rate5m

 

Why Use Recording Rules?

·      Speed: Reusing precomputed data is faster than recalculating on the fly.

·      Efficiency: Reduces the load on your Prometheus server during heavy dashboard queries or alert evaluations.

 

Step-by-Step Guide to Create and Use Recording Rules in Prometheus for Per-CPU Metrics

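Let's create 4 recording rules that calculate the 5-minute average rate of node_cpu_seconds_total for CPU 0, 1, 2, and 3. Note that each expression averages over every mode and instance that reports that CPU, so add a mode matcher (for example mode="idle") if you are interested in one specific kind of CPU time.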

 

Step 1: Create a Recording Rules YAML File

Create a new file named cpu_recording_rules.yaml.

 

Inside this file, define your rules like this:

 

cpu_recording_rules.yaml

groups:
  - name: per_cpu_avg_rate
    interval: 1m
    rules:
      - record: cpu0:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="0"}[5m]))
      - record: cpu1:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="1"}[5m]))
      - record: cpu2:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="2"}[5m]))
      - record: cpu3:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="3"}[5m]))

Save this file in the same directory as your Prometheus configuration (e.g., /etc/prometheus/ or the directory used in your Docker setup).
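
If you don't want to maintain one rule per CPU, a single rule that groups by the cpu label should give you the same information. This is a minimal sketch, not part of the original setup: the group name and the record name are assumptions chosen to match the naming used above.

```yaml
groups:
  - name: per_cpu_avg_rate_by_label   # hypothetical group name
    interval: 1m
    rules:
      # One rule produces one series per value of the cpu label,
      # so new CPUs are picked up without editing the file.
      - record: cpu:node_cpu_seconds_total:avg_rate5m   # hypothetical record name
        expr: avg by (cpu) (rate(node_cpu_seconds_total[5m]))
```

You can then select a single CPU at query time with a label matcher, for example cpu:node_cpu_seconds_total:avg_rate5m{cpu="0"}.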

 

Step 2: Add the Rules File to prometheus.yml

Open your prometheus.yml configuration file and add the rule file under the rule_files section.

 

See the template file here (https://github.com/prometheus/prometheus/blob/main/documentation/examples/prometheus.yml).

 

prometheus.yml

global:
  scrape_interval: 15s

rule_files:
  - "cpu_recording_rules.yaml"
  
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
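
When the number of rule files grows, you don't have to list each one. The rule_files entries accept glob patterns, so a sketch like the following should load every matching file. The rules/ directory here is an assumption for illustration, and relative paths are resolved against the directory of prometheus.yml.

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 1m          # how often rule groups are evaluated by default

rule_files:
  - "cpu_recording_rules.yaml"     # the file created in Step 1
  - "rules/*.yaml"                 # hypothetical directory holding more rule files
```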

 

Step 3: Restart Prometheus

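Now, restart Prometheus so that it can pick up the new rules. If you would rather not restart, Prometheus also reloads its configuration and rule files when it receives a SIGHUP signal, or an HTTP POST to /-/reload when it was started with the --web.enable-lifecycle flag.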

 

How to see the configured rules?

Navigate to the Prometheus UI (http://localhost:9090/).

 

Go to Status -> Rule health.


 

You will be taken to the rules page.

 


You’ll see the group name per_cpu_avg_rate and under it the 4 recording rules you created:

 

·      cpu0:node_cpu_seconds_total:avg_rate5m

·      cpu1:node_cpu_seconds_total:avg_rate5m

·      cpu2:node_cpu_seconds_total:avg_rate5m

·      cpu3:node_cpu_seconds_total:avg_rate5m

 

Click on any rule to expand it and see its expression.

 


Click the search button to the right of the rule expression.

 

You will be taken to the Query page, which also shows the result of the query.

 


 

How Frequently Are Recording Rules Evaluated?

By default, all rules (both recording and alerting) are evaluated at an interval defined in the Prometheus server config (prometheus.yml), but you can override it for each group of rules.

 

In prometheus.yml, you can set the global evaluation interval like this: 

global:
  evaluation_interval: 1m  # Default: 1 minute

 

This means all rules (recording and alerting) are evaluated every 1 minute unless overridden.

 

Custom Evaluation Interval per Group

You can set a custom evaluation interval for a specific rule group in your rule file. For example:

groups:
  - name: per_cpu_avg_rate
    interval: 2m  # Overrides global interval, runs every 2 minutes
    rules:
      - record: cpu0:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="0"}[5m]))

 

In this case, only this group (per_cpu_avg_rate) of rules will be evaluated every 2 minutes, regardless of the global setting.
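
To see both behaviours side by side, here is a small sketch that puts the two groups from this post in one rule file. Only the first group sets interval, so the second one should fall back to the global evaluation_interval. Combining them in a single file is my assumption for illustration.

```yaml
groups:
  - name: per_cpu_avg_rate
    interval: 2m                   # overrides the global interval for this group only
    rules:
      - record: cpu0:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="0"}[5m]))

  - name: cpu_idle_rules           # no interval set: uses global evaluation_interval
    rules:
      - record: idlemode:node_cpu_seconds_total:avg_rate5m
        expr: avg without(cpu) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
```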

Validate Your Prometheus Rules Using promtool on macOS

When working with Prometheus, you often write custom recording and alerting rules in YAML files. However, even a small formatting or syntax mistake in these rule files can prevent Prometheus from starting or evaluating your rules correctly. That’s where promtool comes in!

 

In this post, you’ll learn:

 

·      What promtool is

·      How to install it on macOS

·      How to use it to validate your Prometheus rule files

 

1. What is promtool?

promtool is a command-line utility that comes with Prometheus. It helps you:

 

·      Check if your rule files (YAML) are valid

·      Verify PromQL queries

·      Test alert rules

·      Debug time series data and more

 

In short, it’s your go-to tool to make sure your Prometheus configuration is solid before you reload or restart the server.

 

2. Installing promtool on macOS

There are two main ways to install promtool on macOS:

 

2.1 Using brew

If you have Homebrew installed, open a terminal and run the following command.

brew install prometheus

This installs both the Prometheus server and promtool.

 

You can verify the installation by running the command below.

promtool --version

 

You should see something like:

$ promtool --version
promtool, version 3.2.1 (branch: non-git, revision: non-git)
  build user:       reproducible@reproducible
  build date:       20250225-19:11:52
  go version:       go1.24.0
  platform:         darwin/arm64
  tags:             netgo,builtinassets,stringlabels

 

2.2 Download Binary from Prometheus Website

 

Step 1: Go to the Prometheus downloads page (https://prometheus.io/download/)

 

Step 2: Download the latest macOS tarball

 

Step 3: Extract it:

tar -xvzf prometheus-*.tar.gz

cd prometheus-*/   # trailing slash matches only the extracted directory, not the tarball

 

The extracted directory contains the promtool binary.

 

3. How to Use promtool to Validate Rule Files?

 

Example rule file: cpu_recording_rules.yaml

 

cpu_recording_rules.yaml

groups:
  - name: per_cpu_avg_rate
    interval: 1m
    rules:
      - record: cpu0:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="0"}[5m]))
      - record: cpu1:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="1"}[5m]))
      - record: cpu2:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="2"}[5m]))
      - record: cpu3:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="3"}[5m]))

Validate it by running the command below.

promtool check rules cpu_recording_rules.yaml

$ promtool check rules cpu_recording_rules.yaml
Checking cpu_recording_rules.yaml
  SUCCESS: 4 rules found

Since the file is a valid rule file, promtool reports no errors.

 

Let’s create a rule file with an indentation error (the rules key is indented one space too far).

 

improper_indentation.yml

groups:
  - name: per_cpu_avg_rate
    interval: 1m
     rules:
      - record: cpu0:node_cpu_seconds_total:avg_rate5m
        expr: avg(rate(node_cpu_seconds_total{cpu="0"}[5m]))

Now, when you check the rules, you will see the error below.

$ promtool check rules improper_indentation.yml
Checking improper_indentation.yml
  FAILED:
improper_indentation.yml: yaml: line 4: mapping values are not allowed in this context
improper_indentation.yml: yaml: line 4: mapping values are not allowed in this context
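
Besides syntax checks, promtool can also unit-test rules with the promtool test rules command. Below is a minimal sketch of such a test file for the first rule in cpu_recording_rules.yaml. The file name cpu_rules_test.yaml, the instance label and the input values are assumptions for illustration. The counter grows by 30 every minute, which is 0.5 per second, so that is the value the test expects.

```yaml
# cpu_rules_test.yaml (hypothetical file name)
rule_files:
  - cpu_recording_rules.yaml

evaluation_interval: 1m

tests:
  - interval: 1m                   # spacing between the input samples
    input_series:
      # Synthetic counter: starts at 0 and adds 30 per minute, for 10 steps.
      - series: 'node_cpu_seconds_total{cpu="0", mode="idle", instance="localhost:9100"}'
        values: '0+30x10'
    promql_expr_test:
      - expr: cpu0:node_cpu_seconds_total:avg_rate5m
        eval_time: 10m
        exp_samples:
          # avg() without grouping drops all labels, only the metric name remains.
          - labels: 'cpu0:node_cpu_seconds_total:avg_rate5m'
            value: 0.5
```

You would run it with promtool test rules cpu_rules_test.yaml from the directory that contains both files.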

That’s it. Happy learning!


 
