Fargate + Lambda are better together

By Daniele Frasca
After many years working with Serverless at a certain scale, I am starting to wonder about some things. I have been fortunate enough to attend numerous conferences where I have learned about the great potential of serverless computing and the various options available. At the same time, I noticed a separation between ECS Fargate and Lambda that makes it difficult for many to make a choice.

If I ask around, I hear the same story:

  1. Fargate is cheaper at scale
  2. Lambda is better for spiky traffic

I often wonder why running both of them simultaneously with the same code base is not possible.
How difficult can it be? Underground rumours, now three years old, suggested that AWS was working on this. However, at each re:Invent little to nothing has been announced for the next generation of Serverless applications, so I decided to test it myself.

The AWS option is the AWS Lambda Web Adapter (awslabs/aws-lambda-web-adapter).

Features:

  • Run web applications on AWS Lambda
  • Supports Amazon API Gateway Rest API and HTTP API endpoints, Lambda Function URLs, and Application Load Balancer
  • Supports Lambda managed runtimes, custom runtimes and Docker OCI images
  • Supports any web frameworks and languages, with no new code dependency to include
  • Automatically encodes binary responses
  • Enables graceful shutdown
  • Supports response payload compression
  • Supports response streaming
  • Supports non-HTTP event triggers

With this, you have one codebase that runs on both services.
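
To give an idea of what wiring it up can look like, here is a rough sketch using AWS CDK (my illustration only; the layer ARN placeholders, the run.sh wrapper script and the construct names are assumptions based on the adapter's documented setup, not something taken from my stack):

Code:
import { Construct } from "constructs";
import { Duration } from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda";

declare const scope: Construct; // hypothetical stack scope

// Sketch: zip-packaged Node.js function fronted by the Lambda Web Adapter layer.
// run.sh (inside the asset) starts the same web server used on ECS, e.g.
// `node server.js`; the adapter proxies Lambda events to it over HTTP.
const webFn = new lambda.Function(scope, "WebFn", {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: "run.sh", // the adapter's exec wrapper launches this script
  code: lambda.Code.fromAsset("dist"),
  timeout: Duration.seconds(29),
  layers: [
    lambda.LayerVersion.fromLayerVersionArn(
      scope,
      "LambdaWebAdapter",
      // region- and version-specific; take the current ARN from the adapter's README
      "arn:aws:lambda:<region>:753240598075:layer:LambdaAdapterLayerX86:<version>",
    ),
  ],
  environment: {
    AWS_LAMBDA_EXEC_WRAPPER: "/opt/bootstrap", // hands the invocation to the adapter
    PORT: "3000",                              // port the web app listens on
  },
});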

If I do not want to use it for any reason, it is also easy to work around.

I still have my regular Lambda handler:


Code:
import { ALBEvent, ALBResult } from "aws-lambda/trigger/alb";

// init outside of the handler (SDK clients, config, etc.)

export const handler = async (event: ALBEvent): Promise<ALBResult> => {
   // code
}

Now you need your ECS entry point:


Code:
import { ALBEvent, ALBResult } from "aws-lambda/trigger/alb";
import express, { Request, Response } from "express";

// Import your existing handlers
import { handler as myLambdaHandler } from "./handlers/myLambda/index";

const app = express();
const port = parseInt(process.env.PORT || "3000", 10);

// Middleware
app.use(express.json());

app.get("/yourPath", async (req: Request, res: Response) => {
  try {
    const albEvent = createALBEvent(req);
    const result = await myLambdaHandler(albEvent);
    sendALBResponse(res, result);
  } catch (error) {
    // Narrow the unknown catch value so a stack trace can be logged safely
    const err = error instanceof Error ? error : new Error(String(error));
    console.log(
      JSON.stringify({
        level: "error",
        message: "Error in myLambdaHandler",
        error: err.message,
        stack: err.stack,
      }),
    );
    res.status(500).json({ error: "Internal server error" });
  }
});

// Helper function to convert Express req/res to ALB event format
function createALBEvent(req: Request): ALBEvent {
  return {
    requestContext: {
      elb: {
        targetGroupArn: "arn:aws:elasticloadbalancing:region:account:targetgroup/target-group-name",
      },
    },
    httpMethod: req.method,
    path: req.path,
    queryStringParameters: (req.query as { [key: string]: string }) || null,
    headers: (req.headers as { [key: string]: string }) || {},
    body: req.method === "POST" || req.method === "PUT" ? JSON.stringify(req.body) : null,
    isBase64Encoded: false,
  };
}

// Helper function to send ALB response through Express
function sendALBResponse(res: Response, albResult: ALBResult): void {
  res.status(albResult.statusCode);

  if (albResult.headers) {
    Object.entries(albResult.headers).forEach(([key, value]) => {
      if (typeof value === "string") {
        res.set(key, value);
      } else if (typeof value === "number") {
        res.set(key, value.toString());
      }
    });
  }

  if (albResult.body) {
    if (albResult.isBase64Encoded) {
      res.send(Buffer.from(albResult.body, "base64"));
    } else {
      // Handle both string and parsed JSON content
      try {
        // Try to parse as JSON first
        const parsed = JSON.parse(albResult.body);
        res.json(parsed);
      } catch {
        // If not JSON, send as plain text
        res.send(albResult.body);
      }
    }
  } else {
    res.end();
  }
}


// Start the server
app.listen(port, () => {
  console.log(
    JSON.stringify({
      level: "info",
      message: "Server running",
      port: port,
      runtime: process.env.AWS_LAMBDA_FUNCTION_NAME ? "lambda" : "ecs",
      region: process.env.REGION || "unknown",
      authType: process.env.AUTH_TYPE || "unknown",
    }),
  );
});

The wrapper above will be referenced from the Dockerfile used for ECS.

The cool part?


Imagine two applications that are complementary to each other:

  • AWS Fargate provides a steady, always‑ready core for predictable, high‑volume or latency‑sensitive requests,
  • AWS Lambda acts as an elastic reflex that expands instantly for sudden bursts or temporary overload.

Using Lambda only often means paying a higher per-request price during long, sustained traffic and occasionally experiencing cold start delays. Using Fargate only forces provisioning for peak usage (wasting money off-peak) and risks overloads and outages while autoscaling catches up.

Combining them removes that trade‑off because I have a unified codebase that can run in both environments, traffic can be smoothly balanced (for example 30/70 → 60/40 → 85/15) based on real demand, and protective "task‑aware" routing ensures a small number of container tasks is never flooded, while Lambda stands by as a safety net for any spike or ECS outage.
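
To show what that balancing can look like at the load balancer, here is a minimal CDK sketch of weighted forwarding behind a single ALB path (the construct names and the 70/30 split are hypothetical, and this is static weighting, not the adaptive, task‑aware routing described above):

Code:
import { Construct } from "constructs";
import * as elbv2 from "aws-cdk-lib/aws-elasticloadbalancingv2";
import * as targets from "aws-cdk-lib/aws-elasticloadbalancingv2-targets";
import * as lambda from "aws-cdk-lib/aws-lambda";

declare const scope: Construct;                                 // hypothetical stack
declare const listener: elbv2.ApplicationListener;              // existing ALB listener
declare const fargateTargetGroup: elbv2.ApplicationTargetGroup; // the ECS tasks
declare const webFn: lambda.IFunction;                          // the Lambda handler

// Register the Lambda function as a second target group behind the same ALB.
const lambdaTargetGroup = new elbv2.ApplicationTargetGroup(scope, "LambdaTg", {
  targets: [new targets.LambdaTarget(webFn)],
});

// Split one path between Fargate and Lambda, e.g. 70/30; adjusting these
// weights is how traffic moves from 30/70 to 60/40 to 85/15.
listener.addAction("HybridRoute", {
  priority: 10,
  conditions: [elbv2.ListenerCondition.pathPatterns(["/yourPath"])],
  action: elbv2.ListenerAction.weightedForward([
    { targetGroup: fargateTargetGroup, weight: 70 },
    { targetGroup: lambdaTargetGroup, weight: 30 },
  ]),
});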

This hybrid setup enables many things, like:

  • Move one endpoint or path at a time
  • Safer deployments (shift traffic away before replacing tasks), and more precise cost control
  • Resilience across services: what are the chances that both of them stop working at the same time?
  • Faster innovation: you can redirect part of the traffic to a service before it is fully hardened

Is Fargate really cheaper than Lambda?


On paper yes, but in reality, no.
Containers (Fargate in this case) appear more cost-effective because we have a steady vCPU/GB-hour rate compared to Lambda.
The reality check and the data that I have show the opposite. I have to choose task sizes (vCPU/memory ratios), tune autoscaling targets, decide on scale-in cooldowns, set alarm thresholds, pick concurrency buffers, warm new tasks before they receive load, and revisit all of this each time traffic shape, code efficiency, or dependency latency changes.
Each change is another way to under-provision (causing throttling, retries, and user latency) or over-provision (resulting in silent waste, which is the default solution in 99% of cases).
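
To make that concrete, here are a few of those knobs as they might appear in a CDK definition (a sketch with hypothetical numbers and names, not my production values):

Code:
import { Duration } from "aws-cdk-lib";
import * as ecs from "aws-cdk-lib/aws-ecs";

declare const service: ecs.FargateService; // hypothetical existing service

// Every one of these numbers is a guess that has to be revisited whenever
// traffic shape, code efficiency, or dependency latency changes.
const scaling = service.autoScaleTaskCount({
  minCapacity: 2,  // the baseline paid for even off-peak
  maxCapacity: 20, // the ceiling that hopefully covers the next peak
});

scaling.scaleOnCpuUtilization("CpuScaling", {
  targetUtilizationPercent: 60,          // too high -> throttling, too low -> waste
  scaleOutCooldown: Duration.minutes(1), // how quickly new tasks are added
  scaleInCooldown: Duration.minutes(5),  // how quickly capacity is taken away
});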

Every tech decision has an associated labour cost, including observability dashboards, runbooks, on-call pages, postmortems, tuning cycles, regression risks after refactors, and load testing. By contrast, with Lambda I pay the raw unit price, which I refer to as the enterprise fee; in return, the platform absorbs many failure modes that I would otherwise have to plan for and work around.



What's Next: When (Not If) Containers Misbehave


This article is the high‑level "why". In the next part, I will go deeper into the uncomfortable truth behind the simple phrase "just run it in a container", touch on the scenarios where container workloads actually fail in production, and then show how each of those failure modes maps to a traffic controller I built that proactively adapts traffic between Lambda and ECS based on real-time traffic patterns.

Continue reading...
 

