Promotional graphic: "From 5 Clouds to 2" with cloud and devices icon. Text: "How we cut infra costs by 70% without losing compliance."

From 5 Clouds to 2: How We Cut Infra Cost and Complexity by 70% Without Sacrificing Compliance

🛠 From 5 Clouds to 2: How We Cut Infra Cost and Complexity by 70% Without Sacrificing Compliance

We recently helped a customer who had built up a complex web of infrastructure across five different cloud environments — including Azure, AWS, Hetzner, and Open Telecom Cloud. Their stack supported an AI-powered chatbot platform using LLMs and vector databases, designed for regulated industries like finance and healthcare.

The stack worked — technically — but it came at a cost:
  • 💸 $20,000+ per month in infrastructure spend

  • 👷 A 4-person DevOps team stuck in constant maintenance mode

  • 🧱 A monolithic Terraform codebase with hardcoded conditions for each cloud

  • No real portability, no simplicity, and fragile reliability

By the end of our engagement, we had:
  • Consolidated from 5 clouds to just Azure and Hetzner

  • Dropped monthly cloud spend by over 70%

  • Cut the ops team in half — without losing capacity

  • Delivered a modular, cloud-agnostic Terraform setup

  • Preserved compliance and security without vendor lock-in

The Infra Tangle: When “Multicloud” Becomes Multi-Mess

If you’ve ever browsed threads like “AWS Azure or Hetzner reddit” or “Hetzner Cloud reddit vs managed services”, you’ll often find the same tension: cost vs control vs compliance.

Our customer was stuck in the middle of that triangle.
They had:
  • Open Telecom Cloud running a legacy deployment on Nomad and Consul

  • Azure with Application Gateway and blob storage

  • Hetzner Cloud, used primarily for cost-saving

  • AWS, which was eventually phased out

  • One more stack sitting on-prem at a banking client’s data center

This wasn’t a planned multicloud strategy — it was organic sprawl, driven by client needs, team preferences, and short-term workarounds.

The Terraform codebase reflected that chaos. Every cloud had its own set of conditions, variables, and exceptions, crammed into a single spaghetti-like repository. It took ages to run, was hard to debug, and nearly impossible to upgrade safely.
Step 1: Costs First, Then Cleanup

We started with cost trimming:

  • Reserved instances

  • Removed unused replicas and resources

  • Tuned the deployment to match actual usage, not contractual overprovisioning

This alone dropped infra costs from $20k to ~$6k/month.

Next, we tackled cloud consolidation.

We removed AWS first, migrating workloads into either Azure or Hetzner, depending on business needs.

  • Azure was kept for compliance and trust — essential for working with banking clients who value enterprise credentials and regional certifications.

  • Hetzner stayed because of its unbeatable cost/performance ratio, especially on AMD EPYC instances.

In fact, if Hetzner didn’t exist, the customer would have been forced into multi-tenancy to stay within budget — which might have been a good architectural evolution. But the availability of cheap and performant single-tenant nodes made consolidation viable.

Step 2: Rancher + AKS, Not Multicloud Kubernetes
We standardised Kubernetes across all environments — but without trying to run a true multicloud cluster.
  • On Hetzner, we used Rancher to orchestrate K8s clusters.

  • On Azure, we used AKS (Azure Kubernetes Service) for the managed experience.

This “split but unified” approach gave us the best of both worlds:
  • Portability where possible

  • Managed services where useful

  • Cost control and compliance by design

Most of the core stack — PostgreSQL, MySQL, Elasticsearch, Keycloak, vector DBs, and the app stack (Java, Node, Python) — was cloud-agnostic.

Only a few components were truly cloud-specific.

Step 3: Load Balancers, WAF, and Vendor Lock Workarounds

One big challenge came from Azure’s Application Gateway, which was used for WAF and SSL termination.

It worked — until it didn’t.

The number of supported rules and domains in App Gateway was too low for a single-tenant deployment model. To work around this:

  • We provisioned multiple Application Gateways, grouped by tenant batches.

  • Explored alternatives like Cloudflare in front of Nginx Ingress Controllers — especially in Hetzner.

This turned out to be the best combo:
  • App Gateway was still needed for Azure compliance and security baseline

  • But Cloudflare+Ingress gave us the flexibility and scalability we needed for other tenants

This is where vendor lock-in showed up in the real world.

It wasn’t just about APIs or formats — it was the fact that scaling Azure WAF meant scaling cost, not capability.

If you’ve ever read a “AWS Azure or Hetzner vendor lock reddit” thread, this is what they’re talking about.

Step 4: Modular Terraform FTW
The most underrated improvement?
We took the monolithic Terraform codebase and:
  • Modularized it into reusable, cloud-agnostic components

  • Removed hundreds of lines of cloud-specific conditionals

  • Enabled each module to deploy cleanly to either Azure or Hetzner

That made the entire system:
  • Easier to upgrade

  • Simpler to onboard new team members

  • Future-proof for adding new tenants or clouds

Whether you’re managing object storage, IP address allocations, or even network topologies, modular Terraform is the only sane way to handle diverse environments.

The Final Picture
  • 2 Clouds: Azure for compliance, Hetzner for cost efficiency

  • 💰 Costs down: $20k → $6k/month

  • 🔐 Security: Cloudflare + App Gateway hybrid WAF

  • 🚀 Faster rollout: Modular IaC and unified deployment patterns

  • 🧘 Less stress: Ops team went from 4 → 2, others returned to feature work

Lessons Learned
  • Don’t do multicloud unless you absolutely must. Portability > parallelism.

  • Hetzner Cloud is unbeatable if you need low-cost, high-performance single-tenant infra.

  • Azure brings trust for regulated industries — but watch for creeping complexity and cost.

  • Avoid vendor lock by using open-source compatible services (PostgreSQL, Kafka, etc.).

  • Modular Terraform is the difference between scale and suffering.

  • When you hit a limitation (like App Gateway rule caps), get creative — Cloudflare saved us.

  • Focus on cloud-agnostic code, not cloud-agnostic runtime — it’s faster, simpler, and safer.

Whether you’re weighing AWS, Azure, or Hetzner — or reading about vendor lock and total cost of ownership on Reddit — remember:

Your infra should grow with your product. Not against it.