Two-NS setup: no auto backfill when offline DNS node comes back

visualweb

TL;DR: In a two-node DNS cluster, I created a new sub-zone while one DNS VM was offline. When that VM came back online, it didn't receive the zone automatically. It only synced after I made a tiny edit to that exact zone (editing other zones didn't help). Is this expected? Is there a way to backfill zones to a node that missed the original deploy without logging into the VM or touching records?

Setup / topology

Control panel (master): 1 VM (public IP: PANEL_IP)
DNS nodes: 2 dedicated VMs in different locations
DNS-1 (DNS1_IP) — Location A
DNS-2 (DNS2_IP) — Location B
Both DNS nodes run PowerDNS Authoritative via the Enhance DNS role.

UFW (or equivalent) on both DNS nodes:

ALLOW IN 53/udp, 53/tcp (Anywhere)
LIMIT IN SSH_PORT/tcp (custom SSH port)
ALLOW IN from PANEL_IP (all ports, or at least the Enhance ports)
ALLOW IN from peer DNS IP (all ports)
Provider firewall mirrors the same allows.
Enhance agents/services running (appd, appcd, filerd).

On DNS nodes, only TCP 50000 is listening (expected for DNS-only role):
ss -ltnp | egrep ':(50000|50003|50004)\s'
LISTEN 0 128 0.0.0.0:50000 0.0.0.0:* users:(("appcd",...))
50003–50004 show connection refused (no listener), which seems normal for DNS-only nodes.
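
For repeating this check on both nodes, here is a minimal sketch that reduces `ss -ltn` output to the subset of expected ports that actually have a listener. The function name `listening_ports` is my own, and it assumes the usual `ss` column layout where the fourth column is the local address:port.

```shell
#!/bin/sh
# Sketch only: listening_ports is a hypothetical helper, not an Enhance tool.
# Reads `ss -ltn` style output on stdin and prints each port from the
# comma-separated list in $1 that has a LISTEN socket.
listening_ports() {
  awk -v want="$1" '
    BEGIN { n = split(want, p, ","); for (i = 1; i <= n; i++) w[p[i]] = 1 }
    # Column 4 is "address:port"; the port is whatever follows the last colon.
    $1 == "LISTEN" { k = split($4, a, ":"); if (a[k] in w) seen[a[k]] = 1 }
    END { for (i = 1; i <= n; i++) if (p[i] in seen) print p[i] }
  '
}
```

Usage on a node: `ss -ltn | listening_ports 50000,50003,50004`.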

What I did

Took DNS-1 offline intentionally to test redundancy.
While DNS-1 was offline, I created a new sub-zone: dim.staging.example.tld.
Confirmed on DNS-2 that the zone deployed and answered authoritatively.
Brought DNS-1 back online.

What I expected
When DNS-1 comes back online, I expected it to automatically receive any zones created while it was down (some type of backfill/replay), without manual intervention or VM access.

What actually happened
On DNS-1, the new zone was missing:
pdnsutil list-all-zones | grep dim.staging.example.tld # (no result)
dig @127.0.0.1 dim.staging.example.tld SOA +norecurse +auth
;; -> NXDOMAIN (authoritative for parent only)

I intentionally did not restart agents or reload PDNS, because in a real incident I’d like recovery without logging into the VM.
The zone only appeared after I edited that specific zone from the panel (e.g., add/change a TXT).
Editing a different zone did not trigger a deploy for the missing zone.

Workaround that worked
“Touch” the exact zone (e.g., add/edit a TXT or change a TTL) to trigger a per-zone deploy to the node that missed it.

Questions for the Enhance team
Is this behavior expected, i.e., deploys are per-zone events only, with no automatic backfill to nodes that were offline during the original create/update?

Is there a recommended way to backfill all zones to a node that missed deploys without logging into the VM or modifying records (e.g., a “Resync all zones to this node” action in UI/CLI, or a “Redeploy zone” button that doesn’t change data)?

Docs clarification:
If the intended model is “panel pushes per zone; no peer-to-peer; no auto-backfill,” could this be stated explicitly, along with a suggested operational procedure after a DNS node returns (e.g., a bulk redeploy action)?

Nice-to-have: a built-in report for zone presence/serial mismatches across DNS nodes, so operators can quickly see which zones need redeploy after downtime.
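
Until something like that exists, a check along these lines can be run from any machine that can query both nodes. This is a rough sketch, not an Enhance feature: the function names and the zone list on stdin are my own assumptions, and it relies only on `dig +short` printing the SOA as seven fields with the serial third. An empty answer is treated as "zone missing on that node", which matches the NXDOMAIN case above.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# Extract the serial (third field) from one `dig +short ... SOA` answer line.
# Anything that is not a 7-field SOA line (e.g. a timeout message) yields nothing.
soa_serial() {
  printf '%s\n' "$1" | awk 'NF == 7 && $3 ~ /^[0-9]+$/ { print $3; exit }'
}

# Classify two serials; an empty serial means the node gave no SOA answer.
compare_serials() {
  if [ -z "$1" ] && [ -z "$2" ]; then echo "missing-on-both"
  elif [ -z "$1" ]; then echo "missing-on-first"
  elif [ -z "$2" ]; then echo "missing-on-second"
  elif [ "$1" = "$2" ]; then echo "in-sync"
  else echo "serial-mismatch"
  fi
}

# Ask one node for a zone's SOA without recursion.
query_soa() {
  dig +short +norecurse +time=2 +tries=1 @"$1" "$2" SOA
}

# Read zone names from stdin and print: zone, serial on node 1, serial on node 2, status.
report() (
  ns1=$1
  ns2=$2
  while read -r zone; do
    [ -z "$zone" ] && continue
    s1=$(soa_serial "$(query_soa "$ns1" "$zone")")
    s2=$(soa_serial "$(query_soa "$ns2" "$zone")")
    printf '%s\t%s\t%s\t%s\n' "$zone" "${s1:-none}" "${s2:-none}" "$(compare_serials "$s1" "$s2")"
  done
)
```

Usage: `report DNS1_IP DNS2_IP < zones.txt`, where `zones.txt` is a hypothetical file with one zone name per line. Any line not marked `in-sync` is a candidate for the "touch the zone" workaround.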

Extra (separate observation)

intoDNS flags SOA EXPIRE = 86400 as too low. I’m adjusting to 1209600 (14d) per zone via SOA edit. I couldn’t find a global SOA default in DNS templates; if/when that becomes configurable, it would save a lot of manual work.
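
For reference, 1209600 is simply 14 × 86400 seconds. A small sketch for spotting zones still below that value, assuming the `dig +short` SOA field order (EXPIRE is the sixth of seven fields); the function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: helper names are my own, not part of PowerDNS or Enhance.

# Convert a number of days to seconds.
days_to_seconds() {
  echo $(( $1 * 86400 ))
}

# Extract EXPIRE (sixth field) from one `dig +short ... SOA` answer line.
soa_expire() {
  printf '%s\n' "$1" | awk 'NF == 7 && $6 ~ /^[0-9]+$/ { print $6; exit }'
}

# Succeed only if the SOA line in $1 has an EXPIRE of at least $2 seconds.
expire_at_least() (
  value=$(soa_expire "$1")
  [ -n "$value" ] && [ "$value" -ge "$2" ]
)
```

Usage: `expire_at_least "$(dig +short @DNS2_IP example.tld SOA)" "$(days_to_seconds 14)" || echo "EXPIRE too low"`.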

Thanks! If there’s an existing “resync/backfill” mechanism I missed, I’d love pointers.

Kosta

XN-Matt

Have you raised this with support to ask?