From Data Center to ECS/Fargate in 12 Months

May 2026

In early 2021, Micro Focus's Control Tower was the operational backbone of the SaaS business. Order fulfillment, provisioning, tenant management — every customer transaction passed through it. It had been running on over-provisioned data center VMs for years, and the accepted wisdom was that it couldn't run in containers. The code was fragile, the queries were expensive, and the database team was already stretched thin.

This is the story of how we proved the accepted wisdom wrong.

The Starting Point

Control Tower was a heritage Java workload. It had grown organically over a decade. The deployment model was manual: SSH into a VM, deploy the WAR file, restart Tomcat, pray. Monitoring meant checking a dashboard that someone had built and forgotten about. Scaling meant provisioning a bigger VM and hoping the migration didn't break anything.

The database layer was Oracle RAC running on bare metal. Queries that should have taken milliseconds were taking seconds. The DBA team was fighting fires daily and had no bandwidth for optimization work.

The architecture was monolithic. No service boundaries. No tenancy model. No CI/CD pipeline.

Why Containers Seemed Impossible

The arguments against containerization were reasonable:

  • The Java codebase had hardcoded paths and configuration files that assumed a specific filesystem layout
  • Session state was stored in-memory with no distributed caching
  • Several components depended on native libraries installed on the host OS
  • The team had never used Docker, let alone orchestration
  • Container networking seemed like black magic to a team used to static IPs

These weren't bad-faith objections. They were real technical debt that had accumulated over years of "we'll fix it later."

    The Strategy: GitOps First, Containers Second

    Instead of trying to containerize everything at once, we took a two-phase approach.

    Phase 1: GitOps. Before touching the runtime, we put everything in version control. Configuration, deployment scripts, environment variables, database migration scripts: everything that had lived on individual engineers' laptops went into Git. We stood up GitHub Actions runners and built CI pipelines that compiled, tested, and packaged the application. There were no runtime changes yet, just the ability to reproduce the build deterministically.

    This phase alone caught three configuration drift issues that had been silently causing production bugs.
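
To make "configuration drift" concrete, here is a minimal sketch of the kind of check that version control makes possible: compare what Git declares against what a host actually runs. The function name, keys, and values are hypothetical and are not taken from the Control Tower pipeline.

```python
from typing import Dict, List


def find_drift(declared: Dict[str, str], deployed: Dict[str, str]) -> List[str]:
    """Return readable differences between Git-declared and deployed config."""
    findings = []
    for key in sorted(declared.keys() | deployed.keys()):
        if key not in deployed:
            findings.append(f"missing on host: {key}")
        elif key not in declared:
            findings.append(f"untracked on host: {key}")
        elif declared[key] != deployed[key]:
            findings.append(
                f"changed: {key} (git={declared[key]!r}, host={deployed[key]!r})"
            )
    return findings


# Hypothetical values, for illustration only.
declared = {"JVM_HEAP": "2g", "DB_POOL_SIZE": "20", "LOG_LEVEL": "INFO"}
deployed = {"JVM_HEAP": "4g", "DB_POOL_SIZE": "20", "TMP_DIR": "/opt/tmp"}

for finding in find_drift(declared, deployed):
    print(finding)
```

An empty result means the host matches Git. Anything else is a candidate for the "silently causing production bugs" category.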

    Phase 2: Containerization. With GitOps in place, we could iterate on the runtime without losing reproducibility. We built a Dockerfile that reproduced the Tomcat environment. We extracted hardcoded paths into environment variables. We moved session state to ElastiCache for Redis. We replaced native library dependencies with Alpine packages.
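
The path extraction follows a simple pattern: read the location from the environment and fall back to the old hardcoded value. The application itself is Java, so the Python below only sketches the idea. The variable names and default paths are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_path(env_var: str, default: str,
                 environ: Mapping[str, str] = os.environ) -> Path:
    """Prefer the environment variable; fall back to the legacy hardcoded path."""
    return Path(environ.get(env_var, default))


# Hypothetical names: on the old VM nothing is set, so the defaults apply.
# In a container, the task definition supplies the overrides.
config_dir = resolve_path("CT_CONFIG_DIR", "/opt/controltower/conf")
upload_dir = resolve_path("CT_UPLOAD_DIR", "/var/controltower/uploads")
print(config_dir, upload_dir)
```

Keeping the old path as the default means the same build can run on the VM and in the container while both still exist.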

    The key insight was not to optimize for the perfect container image, but for the fastest path to a working container. We could optimize later. First, we needed to prove it could run.

    The Architecture Decision: ECS Fargate over EKS

    We evaluated both ECS Fargate and EKS (Kubernetes on EC2). The team had strong opinions on both sides.

    Kubernetes advocates wanted the flexibility, the ecosystem, and the portability. The counterargument was operational complexity: we didn't have Kubernetes expertise on the team, and standing up an EKS cluster with all the supporting infrastructure (ingress controllers, service mesh, monitoring, logging) would take months.

    We chose ECS Fargate for three reasons:

    1. Serverless operations — no cluster management, no node patching, no capacity planning

    2. Familiar API — the task definition concept maps reasonably to how the team thought about application components

    3. Faster time-to-value — we could have a working container in production in weeks, not months

    The Firecracker microVM technology that powers Fargate was the secret sauce. Each task runs in its own lightweight VM, providing the isolation of a VM with the efficiency of a container. No noisy neighbors. No "but containers aren't secure enough" objections from security.

    The Migration

    The actual migration took about eight weeks from first container build to production traffic.

    Weeks 1-2: Containerized the application, built the CI/CD pipeline, created the ECS task definitions and service definitions in Terraform.
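
For readers who have not seen one, a Fargate task definition is roughly the shape below. This is a sketch built as a Python dictionary and printed as JSON, not the Terraform we used. The family name, image, port, sizes, and environment variable are placeholders, and a real definition would also need an execution role and logging configuration.

```python
import json

# Placeholder values throughout; only the field names follow the ECS task
# definition schema.
task_definition = {
    "family": "control-tower",
    "networkMode": "awsvpc",                # required for Fargate tasks
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "1024",                          # 1 vCPU
    "memory": "2048",                       # MiB, a valid pairing with 1 vCPU
    "containerDefinitions": [
        {
            "name": "tomcat",
            "image": "registry.example.invalid/control-tower:placeholder",
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            "environment": [
                {"name": "CT_CONFIG_DIR", "value": "/config"},
            ],
        }
    ],
}

print(json.dumps(task_definition, indent=2))
```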

    Weeks 3-4: Set up the networking: VPC, subnets, security groups, load balancer, service discovery. Migrated the database from Oracle to Aurora PostgreSQL (the database team found capacity for this).

    Weeks 5-6: Load testing and optimization. The first load test failed at 50 concurrent users. The bottleneck was database connection handling: the application was opening a new connection per request. We added HikariCP connection pooling and reran. 500 concurrent users passed. 1,000 concurrent users passed.
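
The difference between a connection per request and a pool is easiest to see in a toy. The sketch below is not HikariCP and not our code. It uses a fake connection class and a standard-library queue to show a fixed set of connections being reused across many requests.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for a database connection; counts how many were opened."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql: str) -> str:
        return f"ok: {sql}"


class ConnectionPool:
    """Fixed-size pool: connections are created once and handed out on demand."""

    def __init__(self, size: int, factory):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # blocks if the pool is exhausted
        try:
            yield conn
        finally:
            self._idle.put(conn)                # always return it to the pool


pool = ConnectionPool(size=5, factory=FakeConnection)
for _ in range(1000):
    with pool.connection() as conn:
        conn.execute("SELECT 1")

print(FakeConnection.opened)  # 5 connections served 1,000 requests
```

Without the pool, the same loop would open 1,000 connections, which is the behavior that failed the first load test.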

    Weeks 7-8: Cutover. We used a blue/green deployment pattern via CodeDeploy. Traffic shifted incrementally: 10% for one day, 50% for two days, 100% after a week of monitoring.
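
In our setup CodeDeploy and the load balancer handled the traffic split, so nothing like the following ran in production. As an illustration of what a percentage-based shift means, here is a sketch that assigns each request to blue or green by hashing a request ID. The ID format is hypothetical.

```python
import hashlib


def routes_to_green(request_id: str, green_percent: int) -> bool:
    """Deterministically place a request in one of 100 buckets."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < green_percent


# The three stages from the cutover: 10%, then 50%, then 100%.
for percent in (10, 50, 100):
    hits = sum(routes_to_green(f"req-{i}", percent) for i in range(10_000))
    print(f"{percent}% target -> {hits / 100:.1f}% observed")
```

Hashing keeps the decision stable for a given ID, so a retried request lands on the same side of the split.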

    The Result

    The migration transformed the operational profile of Control Tower:

    | Metric | Before (DC VM) | After (ECS Fargate) |
    |---|---|---|
    | Deployment time | 2-4 hours (manual) | 8 minutes (automated) |
    | Scaling | Manual (days) | Automatic (seconds) |
    | Cost | Fixed (over-provisioned) | Variable (pay-per-use) |
    | Availability | 99.5% | 99.95% |
    | MTTR | 4-6 hours | 30 minutes |
    | Test environment | Shared, frequently broken | Ephemeral per-PR |
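
The availability figures look close together until they are converted into hours. A quick calculation, assuming availability is measured over a 365-day year (the measurement window is an assumption here):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def downtime_hours_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into expected downtime per year."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


before = downtime_hours_per_year(99.5)
after = downtime_hours_per_year(99.95)
print(f"{before:.1f} h/yr -> {after:.1f} h/yr")
```

Under that assumption, 99.5% allows about 43.8 hours of downtime a year and 99.95% allows about 4.4 hours, a tenfold difference.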

    Cost dropped by approximately 40% because we were no longer paying for idle capacity. The application was now elastic — it scaled down to near-zero at night and scaled up during business hours automatically.

    What We Learned

    Start with GitOps, not containers. The version control discipline was more transformative than the container runtime. Once you have reproducible builds, containerization becomes an infrastructure detail.

    Serverless containers are underrated. The industry buzz was all about Kubernetes, but Fargate was the right choice for a team without Kubernetes expertise. The Firecracker microVM isolation gave us security without complexity.

    Database migration is the critical path. Containers are easy. Moving the database is hard. We should have started the database work in parallel with the containerization.

    Incremental wins build confidence. We could have designed a perfect architecture for six months and then migrated. Instead, we shipped a working container in eight weeks, and the team's confidence snowballed from there.

    Epilogue

    Control Tower ran on ECS Fargate for two years before the next team touched the architecture. It became the reference pattern for every subsequent cloud migration in the organization.

    The codebase that "couldn't run in containers" ran in production for 700+ days without a deployment-related incident. The team that had never used Docker was now coaching other teams on ECS task definitions and blue/green deployments.

    The accepted wisdom was wrong. And proving it was the most important thing the CCoE (Cloud Center of Excellence) ever did.

    Built on a home lab, powered by local models, and owned by Andrew Katana.

