FormaTeX

\usepackage{aws-lambda}

Serverless LaTeX PDF generation on Lambda

Compile LaTeX documents to PDF inside AWS Lambda functions using the FormaTeX API. No bulky TeX Live layer, no custom runtime: just an HTTP call and a PDF response.

\section{Why serverless + LaTeX}

Zero-infrastructure PDF generation

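A full TeX Live installation weighs over 4 GB, far more than the 250 MB (unzipped) that Lambda allows for a function and its layers combined. Bundling it means maintaining a stripped-down layer or a container image, which is slow, expensive, and fragile. The FormaTeX API offloads compilation entirely so your functions stay small and fast.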

No TeX Live layer

Keep your Lambda deployment package under 10 MB. FormaTeX handles the full TeX Live environment on managed infrastructure.

Pay per compilation

Lambda bills per request and execution time; FormaTeX bills per compilation. Scale to zero when idle and burst to thousands of PDFs under load.

Isolated execution

Every compilation runs in a sandboxed container on FormaTeX servers. Your Lambda function never executes arbitrary LaTeX locally.

\section{Python handler}

Python Lambda handler

Uses only the Python standard library plus boto3, which is pre-installed on the Lambda Python runtimes. The compiled PDF is stored in S3, and the bucket name and object key are returned.

  • Runtime: Python 3.12 or later
  • Dependencies: boto3 only (included in the Lambda runtime)
  • Timeout: set to at least 30 seconds for complex documents

import json
import os
import urllib.error
import urllib.request
import boto3


def lambda_handler(event, context):
    """Compile LaTeX to PDF via FormaTeX API and store the result in S3."""
    source = event.get("source", "")
    engine = event.get("engine", "pdflatex")

    if not source:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "source is required"}),
        }

    # API key stored as a Lambda environment variable (or fetched from
    # Secrets Manager at cold-start — see the IAM section below).
    api_key = os.environ["FORMATEX_API_KEY"]

    payload = json.dumps({"source": source, "engine": engine}).encode()
    req = urllib.request.Request(
        "https://api.formatex.io/v1/compile/sync",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

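    try:
        with urllib.request.urlopen(req, timeout=25) as resp:
            pdf_bytes = resp.read()
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError on 4xx/5xx responses; report the failure
        # to the caller instead of letting the invocation crash.
        return {
            "statusCode": 502,
            "body": json.dumps({"error": f"FormaTeX returned HTTP {exc.code}"}),
        }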

    s3 = boto3.client("s3")
    bucket = os.environ["OUTPUT_BUCKET"]
    key = f"pdfs/{context.aws_request_id}.pdf"
    s3.put_object(Bucket=bucket, Key=key, Body=pdf_bytes, ContentType="application/pdf")

    return {
        "statusCode": 200,
        "body": json.dumps({"bucket": bucket, "key": key}),
    }
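
To try the handler above with a test event, it helps to see the event and response shapes side by side. The sketch below builds an event with the two fields the handler reads and decodes the response it returns. The helper names make_event and parse_result are hypothetical and not part of any FormaTeX SDK.

```python
import json


def make_event(source: str, engine: str = "pdflatex") -> dict:
    """Build the event dict the handler reads: `source` plus an optional `engine`."""
    return {"source": source, "engine": engine}


def parse_result(response: dict) -> tuple:
    """Return (bucket, key) from a 200 response, or raise with the reported error."""
    body = json.loads(response["body"])
    if response["statusCode"] != 200:
        raise RuntimeError(body.get("error", "unknown error"))
    return body["bucket"], body["key"]


# Example event for the Lambda console or `aws lambda invoke`:
# json.dumps(make_event(r"\documentclass{article}\begin{document}Hi\end{document}"))
```

Because the handler returns the S3 location rather than the PDF itself, the caller still needs read access to the output bucket to download the file.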

\section{Node.js handler}

Node.js Lambda handler

A TypeScript handler using only Node.js built-ins and the AWS SDK v3 S3 client. The API key is cached at module scope so warm invocations reuse it; this matters most once the key is fetched from Secrets Manager rather than read from an environment variable.

  • Runtime: Node.js 20.x or later (ESM); compile the TypeScript to JavaScript before deploying
  • Dependencies: @aws-sdk/client-s3 (included in the Lambda runtime)
  • Timeout: set to at least 30 seconds

import https from "node:https"
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3"

const s3 = new S3Client({})

// Cache the API key across warm invocations
let cachedApiKey: string | undefined

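export const handler = async (
  event: {
    source: string
    engine?: string
    requestId?: string
  },
  // Lambda passes the invocation context as the second argument.
  context: { awsRequestId: string },
) => {
  const { source, engine = "pdflatex" } = event

  if (!source) {
    return { statusCode: 400, body: JSON.stringify({ error: "source is required" }) }
  }

  if (!cachedApiKey) {
    cachedApiKey = process.env.FORMATEX_API_KEY
  }
  if (!cachedApiKey) throw new Error("FORMATEX_API_KEY is not set")

  const pdfBuffer = await compilePdf(cachedApiKey, source, engine)

  const bucket = process.env.OUTPUT_BUCKET
  if (!bucket) throw new Error("OUTPUT_BUCKET is not set")
  // Fall back to the Lambda request ID, which is unique per invocation,
  // so concurrent runs cannot overwrite each other's PDFs.
  const key = `pdfs/${event.requestId ?? context.awsRequestId}.pdf`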

  await s3.send(
    new PutObjectCommand({
      Bucket: bucket,
      Key: key,
      Body: pdfBuffer,
      ContentType: "application/pdf",
    }),
  )

  return { statusCode: 200, body: JSON.stringify({ bucket, key }) }
}

function compilePdf(apiKey: string, source: string, engine: string): Promise<Buffer> {
  return new Promise((resolve, reject) => {
    const body = JSON.stringify({ source, engine })
    const options = {
      hostname: "api.formatex.io",
      path: "/v1/compile/sync",
      method: "POST",
      // Give up if the socket stays idle for 25 s, as in the Python handler.
      timeout: 25_000,
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(body),
      },
    }
    const req = https.request(options, (res) => {
      const chunks: Buffer[] = []
      res.on("data", (c: Buffer) => chunks.push(c))
      res.on("end", () => {
        if (res.statusCode !== 200) {
          reject(new Error(`FormaTeX error: HTTP ${res.statusCode}`))
        } else {
          resolve(Buffer.concat(chunks))
        }
      })
    })
    // "timeout" only signals inactivity; destroy the request so the promise rejects.
    req.on("timeout", () => req.destroy(new Error("FormaTeX request timed out")))
    req.on("error", reject)
    req.write(body)
    req.end()
  })
}

\section{IAM permissions}

Storing the API key in Secrets Manager

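For production workloads, store your FormaTeX API key in AWS Secrets Manager rather than a plain environment variable. Attach the policy below to your Lambda execution role and fetch the secret at cold-start. The policy grants read access to the secret and write access to the pdfs/ prefix of the output bucket; replace the region, account ID, and bucket name with your own.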

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadFormaTeXApiKey",
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:formatex/api-key-*"
    },
    {
      "Sid": "WriteOutputBucket",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::your-output-bucket/pdfs/*"
    }
  ]
}

# Store the secret in Secrets Manager (one-time setup):
aws secretsmanager create-secret \
  --name formatex/api-key \
  --secret-string '{"FORMATEX_API_KEY":"your-key-here"}'

# Then fetch it at cold-start in Python:
# import boto3, json
# secret = boto3.client("secretsmanager").get_secret_value(SecretId="formatex/api-key")
# api_key = json.loads(secret["SecretString"])["FORMATEX_API_KEY"]

\section{Cold start considerations}

Minimise latency on cold starts

Cache secrets at module scope

Fetch your API key from Secrets Manager once, outside the handler function. The value is reused across all warm invocations without an additional network call.
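
A minimal sketch of this caching pattern follows, assuming the secret layout from the IAM section (a JSON string with a FORMATEX_API_KEY field). It fetches lazily on first use rather than at import time, a small variation that keeps the module importable without AWS access. The get_api_key helper and the FORMATEX_SECRET_ID environment variable are hypothetical names introduced for illustration.

```python
import json
import os

_api_key = None  # module scope: the value survives across warm invocations


def get_api_key(client=None) -> str:
    """Return the FormaTeX API key, calling Secrets Manager only on first use."""
    global _api_key
    if _api_key is None:
        if client is None:
            import boto3  # imported lazily so the module loads without AWS access

            client = boto3.client("secretsmanager")
        # FORMATEX_SECRET_ID is an assumed variable name; it defaults to the
        # secret name used in the create-secret command.
        secret_id = os.environ.get("FORMATEX_SECRET_ID", "formatex/api-key")
        secret = client.get_secret_value(SecretId=secret_id)
        _api_key = json.loads(secret["SecretString"])["FORMATEX_API_KEY"]
    return _api_key
```

Inside the handler, calling get_api_key() in place of os.environ["FORMATEX_API_KEY"] would cost one Secrets Manager request per execution environment rather than one per invocation.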

Use provisioned concurrency for latency-sensitive paths

Enable provisioned concurrency on a published version or alias of the function to avoid cold starts on user-facing PDF generation endpoints, up to the number of execution environments you provision. Use on-demand concurrency for background batch jobs.
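
Provisioned concurrency can be set from the console, infrastructure-as-code tools, or the API. As a sketch of the API route, the function below wraps the boto3 Lambda client call put_provisioned_concurrency_config. The wrapper name, the function name "latex-to-pdf", and the alias "live" are placeholders, not values from this guide.

```python
def enable_provisioned_concurrency(lambda_client, function_name: str,
                                   qualifier: str, instances: int) -> dict:
    """Keep `instances` execution environments initialised for one version or alias."""
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=qualifier,  # a published version or alias, not $LATEST
        ProvisionedConcurrentExecutions=instances,
    )


# Usage (requires AWS credentials; the names are placeholders):
#   import boto3
#   enable_provisioned_concurrency(boto3.client("lambda"), "latex-to-pdf", "live", 2)
```

Provisioned environments are billed while they are configured, whether or not they serve requests, so this fits the user-facing path better than the batch path.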

Set a realistic timeout

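Complex LaTeX documents with TikZ, bibliographies, or multiple compilation passes can take 10–20 seconds. Set the Lambda timeout to at least 30 seconds, and check that the integration timeout of your API Gateway or ALB is long enough for the slowest compilation you expect. API Gateway REST APIs default to a 29-second integration timeout, so a synchronous endpoint behind one has to respond within that window.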
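
One way to keep the HTTP timeout consistent with the Lambda timeout is to derive it from the time the invocation has left. The sketch below uses the Lambda context method get_remaining_time_in_millis(). The helper name and the 5-second reserve for the S3 upload are assumptions; the 25-second cap mirrors the value used in the Python handler.

```python
def http_timeout_seconds(context, reserve_s: float = 5.0, cap_s: float = 25.0) -> float:
    """Seconds to allow for the FormaTeX call, keeping `reserve_s` for the S3 upload."""
    remaining_s = context.get_remaining_time_in_millis() / 1000.0
    # Never exceed the cap, and never go below one second.
    return max(1.0, min(cap_s, remaining_s - reserve_s))


# In the handler, pass the result to urlopen:
#   urllib.request.urlopen(req, timeout=http_timeout_seconds(context))
```

With a 30-second Lambda timeout this yields roughly 25 seconds at the start of an invocation, and it shrinks automatically if the function timeout is lowered.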

\section{Related guides}

More SDK and language guides

The FormaTeX API is a plain HTTP endpoint. Use it from any language or runtime.

Ready to generate PDFs on Lambda?

Free tier — 15 API compilations per month. No credit card required.
