A DevOps agency hosted on a drag-and-drop website builder is a contradiction. So this site is the opposite: a small, real piece of infrastructure that’s provisioned, configured and deployed entirely as code. This post is the tour.
The shape of it
Three layers, each owning one job:
- Terraform creates the server, firewall and SSH key on Hetzner.
- Ansible turns a bare Ubuntu box into a hardened web server.
- GitHub Actions orchestrates both — infra on manual dispatch, the site on every push.
Cloudflare sits in front for TLS, CDN and WAF. The origin presents a Cloudflare Origin certificate so the edge can run in Full (Strict) mode.
Why split infra and deploy into two workflows
Provisioning is destructive and costs money; deploying a static site is cheap and frequent. Running them on the same trigger is asking for an accidental terraform destroy on a typo. So:
- infra.yml runs only on workflow_dispatch, with an apply/destroy choice.
- deploy.yml runs on push to main when anything under site/ changes.
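As a rough illustration, the two triggers could look something like the sketch below. Only the file names, the workflow_dispatch trigger, the apply/destroy choice, the main branch and the site/ path come from this post; the input name `action`, the job names and the step layout are assumptions, and the two documents stand for two separate workflow files.

```yaml
# .github/workflows/infra.yml (sketch; input and job names are hypothetical)
name: infra
on:
  workflow_dispatch:
    inputs:
      action:
        description: Terraform action to run
        type: choice
        options: [apply, destroy]
        default: apply
        required: true
jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      # Pass the choice through an env var instead of interpolating it
      # straight into the shell command.
      - run: terraform "$TF_ACTION" -auto-approve
        env:
          TF_ACTION: ${{ inputs.action }}
---
# .github/workflows/deploy.yml (sketch; a separate file)
name: deploy
on:
  push:
    branches: [main]
    paths: ["site/**"]   # only pushes that touch site/ trigger a deploy
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # The actual deploy steps (for example, running the Ansible playbook) go here.
```

Because infra.yml has no push trigger at all, no commit can start a destroy; someone has to pick it from the dropdown.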
State lives as a GitHub Actions artifact between runs. That’s deliberately simple — single operator, low change rate. The moment a second person touches this, it moves to a real remote backend.
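One way to carry state between runs is sketched below: look up the previous successful run, download its artifact, and upload a fresh one at the end. This is an assumption about how it could be wired, not necessarily how this repo does it; the artifact name `tfstate`, the `terraform/` directory and the step ids are hypothetical.

```yaml
# Sketch of a job in infra.yml; names and paths are hypothetical.
jobs:
  terraform:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      actions: read          # needed to list runs and fetch their artifacts
    steps:
      - uses: actions/checkout@v4

      - name: Find the last successful infra run
        id: last
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          run_id=$(gh run list --workflow infra.yml --status success \
            --limit 1 --json databaseId --jq '.[0].databaseId // empty')
          echo "run_id=$run_id" >> "$GITHUB_OUTPUT"

      - name: Restore state from that run
        if: steps.last.outputs.run_id != ''   # skipped on the very first run
        uses: actions/download-artifact@v4
        with:
          name: tfstate
          path: terraform
          run-id: ${{ steps.last.outputs.run_id }}
          github-token: ${{ github.token }}

      # terraform init and apply/destroy would run here, inside terraform/

      - name: Save state for the next run
        uses: actions/upload-artifact@v4
        with:
          name: tfstate
          path: terraform/terraform.tfstate
```

In this sketch a run that fails midway never hands its state to the next run, and artifacts expire after their retention period. Terraform state can also contain sensitive values, so anyone who can download the artifact can read them. Those limits are part of why this only suits a single operator.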
The SSH-port flip that bites everyone
The hardening role moves SSH off port 22. The catch: Ansible is connected on 22 when it does this. Restart sshd and the playbook loses its own connection mid-run.
The fix is to restart, then tell Ansible to reconnect on the new port for the rest of the play:
- name: Flush handlers so sshd restarts now
  ansible.builtin.meta: flush_handlers
- name: Switch connection to the new port
  ansible.builtin.set_fact:
    ansible_port: "{{ ssh_port }}"
- name: Wait for SSH on the new port
  ansible.builtin.wait_for_connection:
    delay: 3
    timeout: 60
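For context, the step that comes before those tasks might look roughly like this: a config change that notifies a handler, which is what flush_handlers then fires. The task and handler names and the regexp are assumptions; only the ssh_port variable appears in the tasks above. It also assumes the firewall already allows the new port.

```yaml
# tasks/main.yml of a hypothetical hardening role
- name: Set the SSH port in sshd_config
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?Port\s'
    line: "Port {{ ssh_port }}"
    validate: /usr/sbin/sshd -t -f %s   # refuse to write a config sshd cannot parse
  notify: Restart sshd
---
# handlers/main.yml of the same role
- name: Restart sshd
  ansible.builtin.service:
    name: ssh          # the service is called "ssh" on Ubuntu
    state: restarted
```

Without the flush, the handler would only run at the end of the play, so the port would still be 22 when the later tasks expect the new one. On newer Ubuntu releases where SSH is socket-activated, the restart step may need to differ.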
Secrets
Nothing sensitive lives in the repo. The Hetzner API token, the SSH private key, and the Cloudflare Origin cert + key are all GitHub secrets, passed to Terraform and Ansible at runtime. The Ansible tasks that write the cert use no_log: true so the key never lands in a log.
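A minimal sketch of that flow, with the caveat that every secret name, variable name and file path here is hypothetical: the workflow exposes secrets as environment variables, Terraform picks its token up through a TF_VAR_ variable, and Ansible reads the key from the environment and writes it with logging suppressed.

```yaml
# Workflow steps (sketch; secret and variable names are hypothetical)
- name: Terraform apply
  env:
    TF_VAR_hcloud_token: ${{ secrets.HCLOUD_TOKEN }}   # assumes a Terraform variable "hcloud_token"
  run: terraform apply -auto-approve

- name: Run the playbook
  env:
    ORIGIN_KEY: ${{ secrets.CF_ORIGIN_KEY }}
  run: ansible-playbook -i inventory.ini site.yml
---
# Ansible task (sketch) that writes the Origin key on the server
- name: Install the Cloudflare Origin key
  ansible.builtin.copy:
    content: "{{ lookup('ansible.builtin.env', 'ORIGIN_KEY') }}"
    dest: /etc/ssl/private/origin.key
    owner: root
    group: root
    mode: "0600"
  no_log: true   # keeps the key out of task output, including on failure
```

GitHub masks secret values in workflow logs, but Ansible prints task arguments on failure or at higher verbosity, which is the gap no_log closes.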
What I’d change at scale
This is right-sized for one operator and a static site. With a team I’d move Terraform state to object storage with locking, add a plan/apply approval gate, and put the deploy behind an environment protection rule. But shipping the simple version first — and writing it down — beats designing for a scale you don’t have yet.
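The environment protection rule is the smallest of those changes on the workflow side. A sketch, assuming an environment named `production` (hypothetical) has been created in the repository settings with required reviewers:

```yaml
# deploy.yml job (sketch); the environment name is hypothetical
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production   # the job waits until the environment's rules are satisfied
    steps:
      - uses: actions/checkout@v4
      # deploy steps as before
```

The rule itself lives in the repository settings, not in the YAML; the workflow only names the environment it runs in.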
If you want the role-by-role breakdown, the next posts go deeper on the Nginx + Cloudflare Origin setup and on fail2ban.