HAProxy's native ACME requires a separate verification playbook
A founder's deep-dive into HAProxy's ACME certificate renewal reveals a critical gap. A successful API call does not guarantee the correct certificate is live in production.
A developer running their own infrastructure reported a successful HAProxy ACME certificate renewal. The logs showed success. The API calls completed. Yet the load balancer continued serving an old, or sometimes even a self-signed, certificate to users. The experience, detailed in a technical write-up, produced a critical lesson for any founder managing their own edge: a successful renewal process and a correctly configured production server are two different states that must be verified independently.
What they did
The playbook addresses a silent failure mode in HAProxy's native ACME client. The founder, posting under the handle SuccessFearless2102, identified that HAProxy can successfully fetch a new certificate from an ACME provider, but will silently discard it if the new certificate's public key does not match the private key used to generate the Certificate Signing Request (CSR). This can happen if an intermediate system, like a certificate vault, serves a pre-provisioned certificate instead of one freshly signed from the submitted CSR. The result is a system that appears healthy in logs but fails in production.
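The mismatch described above can be checked directly with OpenSSL by comparing the public key embedded in a certificate with the public key derived from a private key. The following is a minimal sketch, not the founder's own script: the function name key_matches_cert and the file arguments are hypothetical, and it assumes the certificate and key are PEM files readable by openssl.

```shell
#!/usr/bin/env bash
# Sketch: succeed only when a certificate's public key matches a private key.
# key_matches_cert CERT_PEM KEY_PEM   (both paths are supplied by the caller)
key_matches_cert() {
  local cert_pub key_pub
  # Public key as embedded in the certificate.
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  # Public key derived from the private key.
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  # Both must be non-empty and identical PEM text.
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}
```

A return status of 0 means the pair matches, 1 means a mismatch, and 2 means one of the files could not be read or parsed. Treating an unreadable file as a failure rather than a match avoids repeating the silent failure the playbook is meant to catch.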
Splitting roles across three PEM files
To manage state correctly, the founder reports using a three-file system for certificates and keys. Each file has a distinct role:
- bootstrap.pem: A placeholder certificate, potentially self-signed, that exists only to allow the HAProxy process to start and bind to port 443 before the real certificate is available.
- acme.pem: Contains the private key that HAProxy is configured to reuse for generating new CSRs. This ensures key consistency.
- live.pem: The production certificate and key bundle that is actually bound to the public-facing TLS listener.
This separation prevents conflicts between the initial startup requirements, the ACME renewal process, and the live serving configuration.
An explicit startup and promotion sequence
The process for deploying a new certificate is not a single command. It is an eight-step sequence that treats certificate promotion as a distinct, verifiable deployment step.
- Seed the ACME private key before HAProxy starts.
- Validate that the staged key and certificate public key match.
- Start HAProxy, pointing its public listener to live.pem.
- Trigger the ACME renewal process via HAProxy's admin socket.
- Allow HAProxy to complete the flow and stage the new certificate material.
- Independently verify the staged key and certificate match.
- Promote the verified bundle, copying it to become live.pem.
- Probe the public TLS endpoint to confirm it is serving the newly promoted certificate.
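The verify-then-promote steps near the end of that sequence can be sketched as one guarded function. This is a sketch under stated assumptions: the name promote_bundle is hypothetical, the verification command is whatever check the caller supplies, and the copy goes through a temporary file followed by mv. That last part is a swapped-in technique the write-up does not specify, chosen so that nothing reading live.pem ever sees a half-written file. How HAProxy is then told to pick up the new file is outside this sketch.

```shell
#!/usr/bin/env bash
# Sketch: promote a staged bundle to the live path only after a check passes.
# promote_bundle STAGED LIVE VERIFY_CMD [ARGS...]
# VERIFY_CMD is run with the staged path appended as its last argument
# and must exit 0 for the promotion to proceed.
promote_bundle() {
  if [ "$#" -lt 3 ]; then
    echo "usage: promote_bundle STAGED LIVE VERIFY_CMD [ARGS...]" >&2
    return 2
  fi
  local staged=$1 live=$2 tmp
  shift 2
  if ! "$@" "$staged"; then
    echo "verification failed; leaving $live untouched" >&2
    return 1
  fi
  # Create the temp file next to the target so mv is a same-filesystem rename.
  tmp=$(mktemp "${live}.XXXXXX") || return 1
  if cp "$staged" "$tmp" && chmod 600 "$tmp" && mv -f "$tmp" "$live"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}
```

For example, promote_bundle /path/to/staged.pem /path/to/live.pem test -s promotes only a non-empty staged file (the staged path is a placeholder). In practice the verification command would be a key and certificate match check rather than test -s.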
Verifying what is actually served
The core of the playbook is direct verification using command-line tools. The founder provided three specific commands to query the state of the system from different perspectives.
To check if the new certificate is loaded into HAProxy's memory store, they use socat to query the admin socket: echo "show ssl cert @cert-store/name" | socat stdio /var/run/haproxy/admin.sock.
To check the certificate file HAProxy has loaded from disk for a specific listener: echo "show ssl cert /path/to/live.pem" | socat stdio /var/run/haproxy/admin.sock.
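Finally, to verify what a client actually receives, they connect with OpenSSL's client: openssl s_client -connect 127.0.0.1:443 -servername your.domain.com. As written, the command targets 127.0.0.1, so it runs on the load balancer host itself; pointing it at the public address instead exercises the same path real users take. Either way, it inspects the certificate served over the TLS connection, which is the ultimate source of truth.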
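The admin-socket queries can be wrapped so that a script, rather than a person, reads the result. The sketch below is illustrative only: the helper names haproxy_cli and loaded_serial are hypothetical, the socket path is the one quoted above, and the parser assumes the show ssl cert output contains a line beginning with "Serial:", as in the examples in HAProxy's management guide.

```shell
#!/usr/bin/env bash
# Sketch: read the serial of a certificate as HAProxy reports it.
# ADMIN_SOCK defaults to the socket path used in the commands above.
ADMIN_SOCK=${ADMIN_SOCK:-/var/run/haproxy/admin.sock}

# Send one command to the admin socket and print the reply.
haproxy_cli() {
  echo "$1" | socat stdio "$ADMIN_SOCK"
}

# Extract the value of the first "Serial:" line from text on stdin.
# Assumption: "show ssl cert <name>" prints such a line.
loaded_serial() {
  sed -n 's/^Serial:[[:space:]]*//p' | head -n 1
}
```

A call such as haproxy_cli "show ssl cert /path/to/live.pem" | loaded_serial then yields a single value that can be compared with the serial seen over TLS.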
What we'd change
This is a robust playbook for a specific context: self-managed infrastructure at a scale where direct control over the load balancer is necessary. For most early-stage products, this complexity is a liability. Automated solutions like the Caddy web server, or platforms-as-a-service like Fly.io and Vercel, handle certificate issuance and renewal transparently. Adopting this HAProxy playbook is a decision to trade engineering hours for granular control, a trade-off that is rarely favorable in the first few years of a product's life.
The verification steps are presented as manual diagnostic commands. A production-grade implementation would not rely on an operator running shell commands. The process should be automated. A script should run the s_client command, parse its output for the certificate's serial number and expiration date, and compare it against the expected values. If there is a mismatch, or if the check fails, the script should trigger a high-priority alert. This transforms the founder's diagnostic tool into a reliable, automated monitoring system.
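A minimal sketch of such a check follows. It assumes the expected values come from running the same openssl x509 query against the certificate that should be live. The function names are hypothetical, and the "alert" is only a line on stderr plus a non-zero exit status, standing in for whatever paging or monitoring hook a team actually uses.

```shell
#!/usr/bin/env bash
# Sketch: compare the certificate a TLS endpoint serves with the expected one.

# Print "serial=..." and "notAfter=..." for the PEM certificate on stdin.
cert_info() {
  openssl x509 -noout -serial -enddate
}

# Print the same two lines for the leaf certificate served at HOST:PORT.
# served_cert_info HOST PORT SNI
served_cert_info() {
  openssl s_client -connect "$1:$2" -servername "$3" \
    </dev/null 2>/dev/null | cert_info
}

# Pull one field (serial or notAfter) out of cert_info-style text on stdin.
cert_field() {
  sed -n "s/^$1=//p" | head -n 1
}

# compare_cert_info SERVED EXPECTED
# Returns 0 when serial and expiry both match; otherwise prints an ALERT
# line per mismatching field to stderr and returns 1.
compare_cert_info() {
  local served=$1 expected=$2 name got want status=0
  for name in serial notAfter; do
    got=$(printf '%s\n' "$served" | cert_field "$name")
    want=$(printf '%s\n' "$expected" | cert_field "$name")
    if [ -z "$got" ] || [ "$got" != "$want" ]; then
      echo "ALERT: $name mismatch: served='$got' expected='$want'" >&2
      status=1
    fi
  done
  return "$status"
}

# Example wiring (paths and host are placeholders):
#   expected=$(cert_info < /path/to/live.pem)
#   served=$(served_cert_info 127.0.0.1 443 your.domain.com)
#   compare_cert_info "$served" "$expected" || exit 1
```

An empty result from the probe counts as a mismatch, so a listener that is down or refuses the handshake raises the same alert as one serving the wrong certificate.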
Landing
The file on disk is less important than what the process has loaded and what clients receive over TLS. This principle is the foundation of reliable infrastructure automation. Any system that modifies critical state, from TLS certificates to DNS records, requires an independent verification loop. A 'success' status from an internal process is an intermediate signal, not a final confirmation. The only reliable measure of success is a direct probe of the public-facing service, confirming it behaves exactly as expected.
The investor read
This level of infrastructure management signals deep technical capability within a team. It can also be a red flag for premature optimization. For an investor, the key question is whether this work builds a competitive moat or is a distraction from product development. For a high-performance API company or a business operating at massive scale, custom edge infrastructure can be a significant cost and performance advantage. For most SaaS startups, it represents engineering resources spent on a solved problem. When evaluating a company demonstrating this expertise, an investor should probe whether the strategy is a deliberate choice to build a defensible advantage or a default habit of over-engineering.