FormaTeX

\begin{article}

LaTeX PDFs in AWS Lambda via API

Compile LaTeX to PDF in AWS Lambda without the 250 MB layer limit — call a LaTeX compilation API and stay within the function size budget.


If you have ever tried to run pdflatex inside an AWS Lambda function, you already know the problem: TeX Live is roughly 4 GB installed, Lambda's deployment package limit is 250 MB (unzipped), and even the most aggressively trimmed TeX Live subset blows past that ceiling before you add your own code. The standard workaround, calling an external LaTeX compilation API, is not a compromise; it is the correct architecture. This post shows how to wire it up, with working Node.js and Python handlers plus SAM and CDK deployment templates.

Why TeX Live and Lambda Are Incompatible

Lambda enforces a hard 250 MB unzipped size limit for deployment packages, including layers. A minimal TeX Live installation weighs around 300 to 400 MB; a full installation is closer to 4 GB. Even the community-maintained texlive-lambda layers that circulate on GitHub are perpetually out of date and still hover near the limit, leaving almost no room for your actual function code.

Beyond size, there are runtime concerns:

  • Cold starts: Spawning a pdflatex child process inside a Lambda adds 500 ms to 2 s of latency on top of the standard container init time.
  • Maintenance: You own the TeX Live version. When a package changes upstream or a security patch drops, you rebuild and redeploy the layer.
  • Concurrency: Each Lambda invocation spins up its own process. At high concurrency, you are running many parallel pdflatex processes with no shared state or caching.

The clean solution is to treat LaTeX compilation as an external service — exactly what FormatEx provides. Your Lambda function stays a few kilobytes, cold starts are sub-100 ms, and TeX Live maintenance is someone else's problem.

How the FormatEx API Works

FormatEx exposes a single compilation endpoint:

text
POST https://api.formatex.io/api/v1/compile
X-API-Key: <your-api-key>
Content-Type: application/json

The request body carries your LaTeX source and compilation options:

json
{
  "latex": "\\documentclass{article}\\begin{document}Hello\\end{document}",
  "engine": "pdflatex",
  "options": {}
}

A successful response streams back the compiled PDF as application/pdf. Errors return JSON with an "error" field and a 4xx/5xx status code.
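As a rough sketch of the request described above, the following Python builds (but does not send) the compile request using only the standard library. The helper name build_compile_request and the placeholder key are hypothetical; the URL, header names, and body fields are the ones listed above. Note that json.dumps takes care of escaping the backslashes, so you pass ordinary LaTeX source.

```python
import json
import urllib.request

FORMATEX_URL = "https://api.formatex.io/api/v1/compile"


def build_compile_request(latex, api_key, engine="pdflatex"):
    """Build the POST request for the compile endpoint (hypothetical helper)."""
    # Body fields as described above: LaTeX source, engine, and options.
    payload = json.dumps({"latex": latex, "engine": engine, "options": {}})
    return urllib.request.Request(
        FORMATEX_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


source = r"\documentclass{article}\begin{document}Hello\end{document}"
req = build_compile_request(source, "fex_example_key")  # placeholder key
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))  # backslashes appear doubled in the JSON body
```

Passing the resulting object to urllib.request.urlopen would perform the actual call, as the Python handler later in this post does.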

Supported engines:

Engine      Available on Plans
pdflatex    Free, Developer, Pro, Scale
xelatex     Developer, Pro, Scale
lualatex    Developer, Pro, Scale
latexmk     Developer, Pro, Scale
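If you want to reject an unsupported engine before spending a request on it, a small lookup is enough. This is only a sketch: the mapping is transcribed from the table above, and the names ENGINE_PLANS and engine_allowed are made up for illustration. The API remains the source of truth for what your plan allows.

```python
# Engine availability per plan, transcribed from the table above.
ENGINE_PLANS = {
    "pdflatex": {"Free", "Developer", "Pro", "Scale"},
    "xelatex": {"Developer", "Pro", "Scale"},
    "lualatex": {"Developer", "Pro", "Scale"},
    "latexmk": {"Developer", "Pro", "Scale"},
}


def engine_allowed(engine, plan):
    """Return True if the table lists `engine` as available on `plan`."""
    return plan in ENGINE_PLANS.get(engine, set())


print(engine_allowed("pdflatex", "Free"))  # True
print(engine_allowed("xelatex", "Free"))   # False
```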

Get an API key at formatex.io — the free tier gives you 15 compilations per month with no credit card required.

Node.js Lambda Handler

The example below accepts a LaTeX string from the event payload, compiles it via FormatEx, and returns the PDF as a Base64-encoded body so API Gateway can forward it directly to the caller. For a deeper look at TypeScript patterns with error handling, see the full Node.js and TypeScript integration guide.

typescript
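import type { APIGatewayProxyHandlerV2 } from "aws-lambda";

const FORMATEX_URL = "https://api.formatex.io/api/v1/compile";

// HTTP APIs (used in the SAM template) deliver payload format 2.0 events.
export const handler: APIGatewayProxyHandlerV2 = async (event) => {
  let latex: string | undefined;
  try {
    latex = JSON.parse(event.body ?? "{}").latex;
  } catch {
    // Malformed JSON from the caller is a client error, not a crash.
    return { statusCode: 400, body: JSON.stringify({ error: "Invalid JSON" }) };
  }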

  if (!latex) {
    return {
      statusCode: 400,
      body: JSON.stringify({ error: "Missing latex field in request body" }),
    };
  }

  const response = await fetch(FORMATEX_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": process.env.FORMATEX_API_KEY!,
    },
    body: JSON.stringify({ engine: "pdflatex", latex }),
  });

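  if (!response.ok) {
    // Error bodies are expected to be JSON; fall back if parsing fails.
    const err = (await response.json().catch(() => ({}))) as { error?: string };
    return {
      statusCode: response.status,
      body: JSON.stringify({ error: err.error ?? "Compilation failed" }),
    };
  }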

  const pdfBuffer = Buffer.from(await response.arrayBuffer());

  return {
    statusCode: 200,
    headers: {
      "Content-Type": "application/pdf",
      "Content-Disposition": 'attachment; filename="document.pdf"',
    },
    body: pdfBuffer.toString("base64"),
    isBase64Encoded: true,
  };
};

A few things worth noting:

  • fetch is available natively in the Node.js 18+ Lambda runtime — no axios or node-fetch needed.
  • process.env.FORMATEX_API_KEY is injected at deploy time via the SAM template (shown below). The key never appears in source code.
  • isBase64Encoded: true tells API Gateway to decode the body before sending the HTTP response, so callers receive a valid binary PDF.

Python Lambda Handler

If your stack is Python, the equivalent handler uses urllib.request from the standard library — no third-party dependencies:

python
import json
import os
import urllib.request
import urllib.error
from base64 import b64encode


FORMATEX_URL = "https://api.formatex.io/api/v1/compile"


def handler(event, context):
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "Invalid JSON"})}

    latex = body.get("latex")
    if not latex:
        return {"statusCode": 400, "body": json.dumps({"error": "Missing latex field"})}

    payload = json.dumps({"engine": "pdflatex", "latex": latex}).encode()

    req = urllib.request.Request(
        FORMATEX_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "X-API-Key": os.environ["FORMATEX_API_KEY"],
        },
        method="POST",
    )

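    try:
        # Keep the HTTP timeout below the 30 s Lambda timeout used in the templates.
        with urllib.request.urlopen(req, timeout=25) as resp:
            pdf_bytes = resp.read()
    except urllib.error.HTTPError as exc:
        # Error bodies are expected to be JSON; tolerate anything else.
        try:
            error_body = json.loads(exc.read().decode())
        except ValueError:
            error_body = {}
        return {
            "statusCode": exc.code,
            "body": json.dumps({"error": error_body.get("error", "Compilation failed")}),
        }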

    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/pdf",
            "Content-Disposition": 'attachment; filename="document.pdf"',
        },
        "body": b64encode(pdf_bytes).decode(),
        "isBase64Encoded": True,
    }

Using only the standard library keeps the deployment package at a few kilobytes. The Python 3.12 Lambda runtime ships urllib — nothing to install.
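Both handlers return the PDF Base64-encoded, which makes the response roughly a third larger than the raw file. Lambda caps synchronous response payloads at 6 MB, so a large PDF can fail at this step even though compilation succeeded. The sketch below estimates the encoded size up front; the helper names and the overhead allowance are illustrative assumptions, not part of any AWS or FormatEx API.

```python
from base64 import b64encode

# Lambda's limit for synchronous response payloads (6 MB).
LAMBDA_RESPONSE_LIMIT = 6 * 1024 * 1024


def base64_size(raw_len):
    """Length of the padded Base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


def fits_in_lambda_response(pdf_bytes, overhead=1024):
    """Rough check; `overhead` is an assumed allowance for headers and framing."""
    return base64_size(len(pdf_bytes)) + overhead <= LAMBDA_RESPONSE_LIMIT


sample = b"%PDF-1.7" + b"\x00" * 1000  # stand-in for a real PDF
print(base64_size(len(sample)) == len(b64encode(sample)))  # True
print(fits_in_lambda_response(sample))                     # True
```

When the check fails, writing the PDF to S3 and returning a link, as described under Storing Generated PDFs, is one way around the limit.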

Deploying with AWS SAM

The SAM template below provisions an API Gateway HTTP API backed by the Node.js Lambda. The FormatEx API key is stored in SSM Parameter Store and injected as an environment variable at deploy time — it never touches your source repository. For broader API key management and rotation best practices, including secrets manager patterns, see the dedicated authentication guide.

yaml
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Runtime: nodejs22.x
    Timeout: 30
    MemorySize: 256
    Environment:
      Variables:
        FORMATEX_API_KEY: !Sub "{{resolve:ssm:/formatex/api-key}}"

Resources:
  LatexCompileFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: dist/
      Handler: handler.handler
      Events:
        CompileApi:
          Type: HttpApi
          Properties:
            Path: /compile
            Method: POST

Outputs:
  ApiUrl:
    Value: !Sub "https://${ServerlessHttpApi}.execute-api.${AWS::Region}.amazonaws.com/compile"

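Before deploying, store your key. The {{resolve:ssm:...}} dynamic reference used in the template only resolves String parameters, and CloudFormation does not support SecureString references in Lambda environment variables, so create the parameter as a plain String:

bash
aws ssm put-parameter \
  --name /formatex/api-key \
  --value "fex_your_actual_key_here" \
  --type String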

Then build and deploy:

bash
sam build
sam deploy --guided

--guided walks you through the stack name, region, and S3 bucket for artifacts on first run. Subsequent deploys skip the wizard.
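The dynamic reference in the template is resolved at deploy time, so the key ends up in the function's environment configuration. If you would rather keep it encrypted at rest and read it at runtime, one possible approach (a sketch, not part of the template above) is to call the SSM API on first use and cache the result for warm invocations. The function name get_api_key is hypothetical; get_parameter with WithDecryption=True is the standard boto3 SSM call, and the function's role would need ssm:GetParameter (plus kms:Decrypt for a customer-managed key).

```python
_cached_key = None  # reused across warm invocations of the same container


def get_api_key(ssm_client, name="/formatex/api-key"):
    """Fetch the decrypted parameter once and cache it.

    `ssm_client` is expected to be a boto3 SSM client, created outside
    the handler with boto3.client("ssm").
    """
    global _cached_key
    if _cached_key is None:
        resp = ssm_client.get_parameter(Name=name, WithDecryption=True)
        _cached_key = resp["Parameter"]["Value"]
    return _cached_key
```

In the Python handler, the X-API-Key header would then use get_api_key(ssm) instead of os.environ["FORMATEX_API_KEY"]. The trade-off is one extra API call per cold start.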

CDK Equivalent

If you prefer CDK, the same infrastructure in TypeScript:

typescript
import * as cdk from "aws-cdk-lib";
import * as lambda from "aws-cdk-lib/aws-lambda-nodejs";
import * as apigwv2 from "aws-cdk-lib/aws-apigatewayv2";
import * as integrations from "aws-cdk-lib/aws-apigatewayv2-integrations";
import * as ssm from "aws-cdk-lib/aws-ssm";
import { Construct } from "constructs";

export class LatexStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

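    // Resolved at deploy time. The parameter must be a plain String:
    // SecureString values cannot be injected into Lambda environment variables.
    const apiKey = ssm.StringParameter.valueForStringParameter(
      this,
      "/formatex/api-key"
    );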

    const fn = new lambda.NodejsFunction(this, "LatexCompile", {
      entry: "src/handler.ts",
      handler: "handler",
      runtime: cdk.aws_lambda.Runtime.NODEJS_22_X,
      timeout: cdk.Duration.seconds(30),
      memorySize: 256,
      environment: {
        FORMATEX_API_KEY: apiKey,
      },
    });

    const httpApi = new apigwv2.HttpApi(this, "LatexApi");

    httpApi.addRoutes({
      path: "/compile",
      methods: [apigwv2.HttpMethod.POST],
      integration: new integrations.HttpLambdaIntegration("Compile", fn),
    });

    new cdk.CfnOutput(this, "ApiUrl", { value: httpApi.url! });
  }
}

NodejsFunction bundles your TypeScript with esbuild at synth time. The final Lambda package is typically under 1 MB.

Handling Timeouts and Retries

LaTeX compilation time depends on document complexity. A simple one-page document compiles in under two seconds; a 50-page document with TikZ figures can take 20 to 30 seconds. Set your Lambda timeout to 30 seconds to match the API Gateway integration timeout, which is capped at 30 s for HTTP APIs. Documents that may take longer than that are better handled asynchronously.
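One way to keep the outbound call inside that budget is to derive the HTTP timeout from the time left in the current invocation. The sketch below assumes the Python runtime, whose context object provides get_remaining_time_in_millis(); the helper name, the two-second margin, and the FakeContext stand-in are illustrative choices.

```python
def http_timeout_seconds(context, safety_margin=2.0, ceiling=30.0):
    """Seconds to allow for the compile call, given the time left in this invocation."""
    remaining = context.get_remaining_time_in_millis() / 1000.0
    # Leave a margin so the handler can still return an error response.
    return max(0.1, min(ceiling, remaining - safety_margin))


class FakeContext:
    """Stand-in for the Lambda context object, for local experimentation."""

    def __init__(self, remaining_ms):
        self._remaining_ms = remaining_ms

    def get_remaining_time_in_millis(self):
        return self._remaining_ms


print(http_timeout_seconds(FakeContext(30_000)))  # 28.0
```

In the Python handler this would be passed as urllib.request.urlopen(req, timeout=http_timeout_seconds(context)).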

For retry logic, treat FormatEx errors by category — a full breakdown of every status code is available in the LaTeX API error codes reference:

  1. 400 Bad Request: invalid LaTeX or unsupported engine on your plan. Do not retry; fix the source or upgrade your plan.
  2. 401 Unauthorized: invalid or missing API key. Do not retry; check the key.
  3. 429 Too Many Requests: rate limit hit. Wait at least as long as the Retry-After header says, then retry with exponential backoff and jitter.
  4. 5xx Server Error: transient. Retry up to three times with jitter.
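The categories above can be sketched as a small retry loop. This is a minimal illustration rather than a drop-in client: send is a hypothetical callable that performs one compile request and returns (status, body, retry_after), and the sketch assumes Retry-After has already been parsed into seconds (the header can also carry an HTTP date). The base delay and cap are arbitrary starting points.

```python
import random
import time

# 429 and all 5xx codes are worth retrying; 400 and 401 are not.
RETRYABLE = {429} | set(range(500, 600))


def should_retry(status):
    return status in RETRYABLE


def backoff_delay(attempt, retry_after=None, base=0.5, cap=8.0):
    """Exponential backoff with full jitter, added on top of Retry-After if given."""
    jitter = random.uniform(0, min(cap, base * (2 ** attempt)))
    return (retry_after or 0.0) + jitter


def compile_with_retries(send, max_retries=3, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or retries run out."""
    for attempt in range(max_retries + 1):
        status, body, retry_after = send()
        if not should_retry(status) or attempt == max_retries:
            return status, body
        sleep(backoff_delay(attempt, retry_after))


# Local demonstration with a fake sender: two 503s, then success.
responses = iter([(503, b"", None), (503, b"", None), (200, b"%PDF", None)])
status, body = compile_with_retries(lambda: next(responses), sleep=lambda s: None)
print(status)  # 200
```

Keep the total of all delays and attempts within the Lambda timeout; with a 30-second budget, a slow document leaves little room for more than one retry.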

Storing Generated PDFs

The Lambda handler above returns the PDF directly to the API Gateway caller. For asynchronous workflows — where a background job generates a PDF and the user downloads it later — write the buffer to S3 instead:

typescript
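import { randomUUID } from "node:crypto";
import { S3Client, PutObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

// Call after receiving pdfBuffer from FormatEx; resolves to a download URL.
export async function storePdf(pdfBuffer: Buffer): Promise<string> {
  const Bucket = process.env.PDF_BUCKET!;
  const Key = `pdfs/${randomUUID()}.pdf`;

  await s3.send(
    new PutObjectCommand({ Bucket, Key, Body: pdfBuffer, ContentType: "application/pdf" })
  );

  // Presigned GET URL that expires after 15 minutes.
  return getSignedUrl(s3, new GetObjectCommand({ Bucket, Key }), { expiresIn: 900 });
}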

Generate a presigned URL with a short TTL (e.g., 15 minutes) and return it to the caller. This pattern works well for invoice generation, certificate issuance, and report pipelines where the client polls for completion. For more advanced async compilation patterns including webhooks and polling, see the dedicated guide.
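For the Python handler, the same upload-then-presign pattern might look like the sketch below. The function name store_pdf and its return shape are assumptions for illustration; put_object and generate_presigned_url are standard boto3 S3 client methods, and the client is passed in so the function can be exercised without AWS access.

```python
import uuid


def store_pdf(s3_client, bucket, pdf_bytes, ttl_seconds=900):
    """Upload the PDF and return (key, presigned GET URL).

    `s3_client` is expected to be a boto3 S3 client, created outside
    the handler with boto3.client("s3").
    """
    key = f"pdfs/{uuid.uuid4()}.pdf"
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=pdf_bytes,
        ContentType="application/pdf",
    )
    # The URL stops working after ttl_seconds (15 minutes by default).
    url = s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=ttl_seconds,
    )
    return key, url
```

The function's role needs s3:PutObject on the bucket, and s3:GetObject as well, because a presigned URL carries the permissions of the credentials that signed it.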

Summary

The approach covered here is:

  1. Keep your Lambda function small — no TeX Live layer, no binary dependencies.
  2. POST LaTeX source to https://api.formatex.io/api/v1/compile with X-API-Key in the header.
  3. Receive the compiled PDF as a binary response and return or store it.
  4. Store the API key in SSM Parameter Store, never in code.

This eliminates the 250 MB layer limit problem entirely, removes TeX Live version management from your ops backlog, and keeps cold starts fast since your function has nothing heavy to initialize.

Sign up at formatex.io to get your API key. The free tier is enough to test the integration end-to-end, and paid plans start at $12/month for 500 compilations with all four engines.

\end{article}
