DNS as Code with Cloudflare, Terraform, and GitHub Actions

DNS records are one of those things that quietly accumulate. A handful of manual entries in a web UI turns into dozens over a few years, half of them undocumented, and when something breaks you are left trying to piece together what changed and when by staring at whatever Cloudflare’s dashboard is showing. Deleting anything feels risky because nobody is sure what still depends on it.

The fix is treating DNS the same as any other infrastructure: version controlled, reviewed, and applied automatically. This post covers how I manage all my Cloudflare DNS records through a single YAML file, a small Python script, Terraform, and a GitHub Actions pipeline that runs on every merge to main.


The Goal

  • DNS records defined in a human-readable dns.yaml file
  • A Python script converts that YAML into Terraform-compatible tfvars.json
  • Terraform applies changes to Cloudflare via the official provider
  • GitHub Actions runs validate and plan on PRs, and applies on merge to main
  • The Cloudflare API token is pulled from Vault at runtime with no static secrets stored in GitHub

Part 1 - The DNS Manifest

Rather than writing Terraform resources directly, all records live in dns.yaml. The format is intentional: a zone name, its Cloudflare zone ID, and a flat list of records.

zones:
  yourdomain.com:
    zone_id: "abc123..."
    records:
      - "@  A  203.0.113.10"
      - "www  CNAME  yourdomain.github.io"
      - "@  TXT  v=spf1 include:_spf.protonmail.ch mx ~all"
      - {r: "@  MX  mail.protonmail.ch", priority: 10}
      - {r: "@  MX  mailsec.protonmail.ch", priority: 20}

Each record is a string in name TYPE value format. Records that need extra fields like priority use a dict with the raw record under r and any overrides alongside it. Everything else defaults to ttl: 1 (Cloudflare automatic) and proxied: false.

Your zone ID is on the Cloudflare dashboard under the domain overview. You need one per zone.


Part 2 - Converting YAML to tfvars

Terraform does not speak YAML, so a small Python script bridges the gap. It reads dns.yaml, parses each record into a structured dict, and writes the result to production.auto.tfvars.json where Terraform picks it up automatically on the next run.

#!/usr/bin/env python3
import json
import sys
from pathlib import Path

try:
    import yaml
except ImportError:
    sys.exit("PyYAML is required: pip install pyyaml")

REPO_ROOT = Path(__file__).resolve().parents[1]
SOURCE = REPO_ROOT / "dns.yaml"
DEST = REPO_ROOT / "terraform" / "environment" / "production.auto.tfvars.json"

DEFAULTS = {"ttl": 1, "proxied": False}


def parse_record(entry: str | dict) -> dict:
    overrides = {}
    if isinstance(entry, dict):
        raw = entry.pop("r")
        overrides = entry
    else:
        raw = entry

    parts = raw.split(None, 2)
    if len(parts) != 3:
        raise ValueError(f"Invalid record (expected 'name TYPE value'): {raw!r}")

    name, rtype, value = parts
    return {"name": name, "type": rtype.upper(), "value": value, **DEFAULTS, **overrides}


def convert(source: Path, dest: Path) -> None:
    with source.open() as f:
        data = yaml.safe_load(f)

    cloudflare_records = {}
    for zone_name, zone in data["zones"].items():
        records = []
        for entry in zone["records"]:
            record = parse_record(entry)
            if record.get("priority") is None:
                record.pop("priority", None)
            records.append(record)

        cloudflare_records[zone_name] = {
            "zone_id": zone["zone_id"],
            "records": records,
        }

    output = {"cloudflare_records": cloudflare_records}
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("w") as f:
        json.dump(output, f, indent=2)
        f.write("\n")

    total = sum(len(z["records"]) for z in cloudflare_records.values())
    print(f"{total} records across {len(cloudflare_records)} zones → {dest.relative_to(REPO_ROOT)}")


if __name__ == "__main__":
    convert(SOURCE, DEST)

One thing to watch: Cloudflare’s Terraform provider throws an error if you pass priority on a non-MX record. The script strips it out when it is not explicitly set rather than passing a null through.
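
To make the priority handling concrete, here is a minimal sketch of the same parsing step with one extra guard. It is not the script above: the function name parse_entry and the MX-only check are assumptions added for illustration, and it copies dict entries instead of popping from them so the loaded YAML is left untouched.

```python
DEFAULTS = {"ttl": 1, "proxied": False}


def parse_entry(entry):
    """Parse 'name TYPE value', or a dict with the raw record under 'r'."""
    if isinstance(entry, dict):
        overrides = dict(entry)      # copy so the caller's data is not mutated
        raw = overrides.pop("r")
    else:
        raw, overrides = entry, {}

    parts = raw.split(None, 2)       # the value may itself contain spaces (TXT)
    if len(parts) != 3:
        raise ValueError(f"Invalid record (expected 'name TYPE value'): {raw!r}")
    name, rtype, value = parts

    record = {"name": name, "type": rtype.upper(), "value": value,
              **DEFAULTS, **overrides}

    # Hypothetical guard (not in the original script): fail early on a
    # priority attached to anything other than an MX record.
    if record.get("priority") is not None and record["type"] != "MX":
        raise ValueError(f"priority is only expected on MX records: {raw!r}")
    if record.get("priority") is None:
        record.pop("priority", None)  # omit the key rather than emit null
    return record


if __name__ == "__main__":
    print(parse_entry("@  TXT  v=spf1 include:_spf.protonmail.ch mx ~all"))
    print(parse_entry({"r": "@  MX  mail.protonmail.ch", "priority": 10}))
```

Failing in the conversion step means a misplaced priority shows up as a clear Python error in the pipeline rather than as a provider error during plan.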


Part 3 - Terraform

Provider and Backend

terraform {
  required_version = ">= 1.14.3"

  required_providers {
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0"
    }
  }

  backend "s3" {
    bucket  = "your-tfstate-bucket"
    key     = "infrastructure/terraform.tfstate"
    region  = "ap-southeast-2"
    encrypt = true
  }
}

provider "cloudflare" {
  api_token = var.cloudflare_api_token
}

State lives in S3 so it survives across runners and does not need to be committed to the repo. Enable versioning on the bucket so you can roll back if a bad apply slips through.

Variables

variable "cloudflare_api_token" {
  description = "Cloudflare API token"
  type        = string
  sensitive   = true
}

variable "cloudflare_records" {
  description = "Map of Cloudflare zones to DNS records, generated from dns.yaml."
  type = map(object({
    zone_id = string
    records = list(object({
      name     = string
      type     = string
      value    = string
      ttl      = number
      proxied  = optional(bool, false)
      priority = optional(number)
    }))
  }))
}

Main

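The main file flattens the nested structure into a single map keyed by a stable string that uniquely identifies each record. Terraform uses this key to track resources across applies, so reordering or inserting records in the YAML does not silently destroy and recreate everything. The key is built from the zone name, record name, type, and value, so changing any of those gives the record a new key and Terraform plans a destroy and create for it.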

locals {
  dns_records = {
    for item in flatten([
      for zone_name, zone in var.cloudflare_records : [
        for record in zone.records : {
          key      = "${zone_name}__${record.name}__${record.type}__${replace(record.value, ".", "_")}"
          zone_id  = zone.zone_id
          name     = record.name
          type     = record.type
          value    = record.value
          ttl      = record.ttl
          proxied  = record.proxied
          priority = record.priority
        }
      ]
    ]) : item.key => item
  }
}

resource "cloudflare_record" "dns" {
  for_each = local.dns_records

  zone_id  = each.value.zone_id
  name     = each.value.name
  type     = each.value.type
  value    = each.value.value
  ttl      = each.value.ttl
  proxied  = each.value.proxied
  priority = each.value.priority
}

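Replacing dots with underscores in the value part of the key is purely cosmetic. Terraform map keys can contain dots (the zone name part of the key still does); the underscores just make resource addresses a little easier to read in plan output. Two records with identical name, type, and value would collide, but that would be broken DNS anyway.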
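
As a rough illustration of the key scheme, the sketch below rebuilds the same key in Python and reports collisions before Terraform runs. The helper names record_key and duplicate_keys are hypothetical and not part of the original script. Terraform rejects a for expression that produces the same map key twice, so a check like this only surfaces the problem earlier, with a friendlier message.

```python
from collections import Counter


def record_key(zone_name, record):
    """Mirror of the HCL key expression: only the value has dots replaced."""
    value = record["value"].replace(".", "_")
    return f"{zone_name}__{record['name']}__{record['type']}__{value}"


def duplicate_keys(cloudflare_records):
    """Return the keys that appear more than once across all zones."""
    counts = Counter(
        record_key(zone_name, record)
        for zone_name, zone in cloudflare_records.items()
        for record in zone["records"]
    )
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = {
        "yourdomain.com": {
            "zone_id": "abc123...",
            "records": [
                {"name": "@", "type": "A", "value": "203.0.113.10"},
                {"name": "@", "type": "A", "value": "203.0.113.10"},
                {"name": "www", "type": "CNAME", "value": "yourdomain.github.io"},
            ],
        }
    }
    print(record_key("yourdomain.com", sample["yourdomain.com"]["records"][0]))
    print(duplicate_keys(sample))
```

If a check like this were wired into the conversion script, it would run on the same structure that gets written to production.auto.tfvars.json.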


Part 4 - GitHub Actions

The pipeline has three jobs: validate on every run, plan on pull requests with the output posted as a PR comment, and apply on merge to main.

The Cloudflare API token never touches GitHub Secrets. It is fetched from Vault at the start of each job using JWT/OIDC authentication, the same approach covered in my Vault post.

name: Terraform - DNS

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

concurrency:
  group: terraform-dns
  cancel-in-progress: false

jobs:
  validate:
    runs-on: self-hosted
    permissions:
      contents: read
      id-token: write

    steps:
      - name: 🛒 Checkout
        uses: actions/checkout@v6

      - name: 🔐 Import secrets from Vault
        uses: hashicorp/vault-action@v4
        with:
          url: https://vault.yourdomain.com:8200
          method: jwt
          role: cloudflare-dns
          secrets: |
            secret/data/cloudflare api_token | TF_VAR_cloudflare_api_token

      - name: 🐍 Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.14"

      - name: 📦 Install PyYAML
        run: pip install pyyaml

      - name: 🗂️ Generate tfvars from dns.yaml
        run: python3 scripts/yaml_to_tfvars.py

      - name: 🔐 Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v6
        with:
          role-to-assume: arn:aws:iam::123456789012:role/TerraformStateAccess
          aws-region: ap-southeast-2

      - name: 🧰 Setup Terraform
        uses: hashicorp/setup-terraform@v4
        with:
          terraform_wrapper: false

      - name: 🚀 Terraform Init
        run: terraform init -input=false
        working-directory: terraform

      - name: 🎨 Format check
        run: terraform fmt -check -recursive
        working-directory: terraform

      - name: 🧪 Validate
        run: terraform validate
        working-directory: terraform

The plan job runs only on pull requests and posts the plan output as a comment, updating it in place if one already exists. Reviewers can see exactly what will change before they approve. No surprises on merge.
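
The plan job itself is not reproduced here, but the update-in-place behaviour can be sketched independently of any API client. The following is an assumption about how such logic might look, not the workflow's actual implementation: the hidden marker string, the truncation limit, and both function names are hypothetical. Comments are modelled as dicts with id and body fields.

```python
# Hypothetical hidden marker used to recognise the pipeline's own comment.
MARKER = "<!-- terraform-dns-plan -->"


def build_comment_body(plan_text, limit=60000):
    """Prefix the plan with the marker and truncate very long output.

    The limit is an arbitrary illustrative value, not a documented one.
    """
    if len(plan_text) > limit:
        plan_text = plan_text[:limit] + "\n... (truncated)"
    return f"{MARKER}\n{plan_text}"


def choose_action(existing_comments):
    """Return ('update', id) for the first marked comment, else ('create', None)."""
    for comment in existing_comments:
        if MARKER in comment.get("body", ""):
            return ("update", comment["id"])
    return ("create", None)


if __name__ == "__main__":
    body = build_comment_body("Plan: 1 to add, 0 to change, 0 to destroy.")
    comments = [{"id": 7, "body": "LGTM"}, {"id": 9, "body": body}]
    print(choose_action(comments))
    print(choose_action([]))
```

Keying on a marker rather than on the comment author keeps the lookup working even if several automations post to the same PR under one bot account.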

The apply job runs only on push to main and targets a production environment so you can add required reviewers or deployment protection rules in GitHub if you want a manual gate before apply.

The concurrency group matters. It serializes runs so two applies never execute simultaneously against the same Cloudflare zone, and cancel-in-progress: false lets a running apply finish instead of being cancelled when a newer run is queued. A cancelled, half-finished apply could leave state and reality out of sync.


The Day-to-Day

Adding a new record is now:

# Create a branch, then edit dns.yaml and add your record
git switch -c feature/add-subdomain
vim dns.yaml

git add dns.yaml
git commit -m "feat: add subdomain record for new service"
git push origin feature/add-subdomain
# open a PR, review the plan, merge

The plan comment on the PR shows exactly which records will be created, modified, or destroyed. Merging applies it. The audit trail is the git log.

Deleting a record works the same way. Remove it from the YAML and Terraform will plan a destroy. No more records that outlive their purpose because nobody was confident enough to touch them.
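
For a quick local preview of which records an edit adds or removes, the key scheme from the Terraform locals can be compared across two versions of the generated data. This is only a sketch under that assumption, and record_keys and preview are hypothetical helpers. Because the value is part of the key, changing a record's value shows up as one destroy plus one create, while changes to ttl or proxied alone do not appear here at all; the real plan remains the source of truth.

```python
def record_keys(cloudflare_records):
    """Build the set of resource keys, mirroring the HCL key expression."""
    return {
        f"{zone}__{r['name']}__{r['type']}__{r['value'].replace('.', '_')}"
        for zone, data in cloudflare_records.items()
        for r in data["records"]
    }


def preview(old, new):
    """Compare two record maps and list keys to be created or destroyed."""
    old_keys, new_keys = record_keys(old), record_keys(new)
    return {
        "create": sorted(new_keys - old_keys),
        "destroy": sorted(old_keys - new_keys),
    }


if __name__ == "__main__":
    apex = {"name": "@", "type": "A", "value": "203.0.113.10"}
    www = {"name": "www", "type": "CNAME", "value": "yourdomain.github.io"}
    before = {"yourdomain.com": {"zone_id": "abc123...", "records": [apex, www]}}
    after = {"yourdomain.com": {"zone_id": "abc123...", "records": [apex]}}
    print(preview(before, after))
```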

This post is licensed under CC BY 4.0 by the author.