\begin{article}

LaTeX PDF Generation in Python: From subprocess to REST API

The subprocess approach to LaTeX in Python is painful — version conflicts, temp file management, missing packages. Here is the clean REST API alternative with full code examples.

Apr 1, 2026 · 5 min read

tutorial api

Python is one of the most common languages for generating documents programmatically — reports, invoices, certificates, scientific output. LaTeX produces the highest-quality PDFs, but the traditional Python approach of calling subprocess.run(["pdflatex", ...]) is fragile and hard to deploy. This post shows both approaches so you can see the difference.

The subprocess Approach (And Its Pain)

The naive Python approach calls the system pdflatex binary:

python

import subprocess
import tempfile
import os

def compile_latex_subprocess(latex_source: str) -> bytes:
    with tempfile.TemporaryDirectory() as tmpdir:
        tex_path = os.path.join(tmpdir, "document.tex")
        pdf_path = os.path.join(tmpdir, "document.pdf")

        with open(tex_path, "w", encoding="utf-8") as f:
            f.write(latex_source)

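        result = subprocess.run(
            ["pdflatex", "-interaction=nonstopmode", "-output-directory", tmpdir, tex_path],
            capture_output=True,
            # TeX output is not guaranteed to be valid UTF-8; replace bad bytes
            # instead of raising UnicodeDecodeError.
            encoding="utf-8",
            errors="replace",
            timeout=60,  # raises subprocess.TimeoutExpired if exceeded
        )

        if result.returncode != 0:
            # pdflatex reports errors on stdout; include the tail of that output
            raise RuntimeError(f"pdflatex failed:\n{result.stdout[-2000:]}")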

        if not os.path.exists(pdf_path):
            raise RuntimeError("pdflatex produced no output")

        with open(pdf_path, "rb") as f:
            return f.read()

This works locally if pdflatex is installed. It breaks in production because:

pdflatex must be installed on the deployment server
The installed TeX Live version must include all packages your templates use
Temp file cleanup can fail and fill disk
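A single pdflatex run does not resolve bibliographies or cross-references; those need several passes, which you have to orchestrate yourself
Timeouts are tricky to enforce: subprocess.run(timeout=...) kills only the direct child, so any helper processes it started can outlive it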
Docker images balloon to 4 GB
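
The multi-pass point in the list above can be made concrete. Here is a minimal sketch, assuming the standard "Rerun" warnings in the LaTeX log are used as the signal to compile again. `run_once` is a hypothetical callable that performs one pdflatex pass and returns its log text, for example a thin wrapper around the subprocess.run call shown earlier. It does not cover bibliographies, which also need a bibtex or biber run between passes.

```python
from typing import Callable

# Phrases LaTeX prints when another pass is needed, e.g.
# "LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right."
RERUN_MARKERS = ("Rerun to get", "Rerun LaTeX")


def needs_rerun(log_text: str) -> bool:
    """Return True if the log asks for another compilation pass."""
    return any(marker in log_text for marker in RERUN_MARKERS)


def compile_until_stable(run_once: Callable[[], str], max_passes: int = 4) -> int:
    """Call run_once() until the log stops asking for a rerun.

    Returns the number of passes performed. Raises RuntimeError if the
    document still wants a rerun after max_passes.
    """
    for pass_number in range(1, max_passes + 1):
        log_text = run_once()
        if not needs_rerun(log_text):
            return pass_number
    raise RuntimeError(f"Still unresolved after {max_passes} passes")
```

Even this small loop is extra code to own and test, which is part of the maintenance cost the list describes.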

The REST API Approach

Replace the subprocess with an HTTP call:

python

import os
import requests

def compile_latex(latex_source: str, engine: str = "pdflatex") -> bytes:
    response = requests.post(
        "https://api.formatex.io/api/v1/compile",
        headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
        json={"content": latex_source, "engine": engine},
        timeout=130,  # requests waits indefinitely unless a timeout is set
    )

    if not response.ok:
        error = response.json()
        raise RuntimeError(error.get("log") or error.get("error") or "Unknown error")

    return response.content

That is the entire integration. No subprocess management, no temp files, no system dependencies.

python

# Save the result
pdf_bytes = compile_latex(r"""
\documentclass{article}
\begin{document}
Hello from Python!
\end{document}
""")

with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)

Error Handling

When compilation fails, the API returns HTTP 400 with a JSON body containing the TeX log. Parse it to surface useful errors:

python

import os
import requests

class LatexCompilationError(Exception):
    def __init__(self, message: str, log: str):
        super().__init__(message)
        self.log = log

    def first_error(self) -> str:
        """Extract the first error line from the TeX log."""
        for line in self.log.splitlines():
            if line.startswith("!"):
                return line
        return self.log[:200]


def compile_latex(source: str, engine: str = "pdflatex") -> bytes:
    response = requests.post(
        "https://api.formatex.io/api/v1/compile",
        headers={
            "X-API-Key": os.environ["FORMATEX_KEY"],
            "Content-Type": "application/json",
        },
        json={"content": source, "engine": engine},
        timeout=130,  # slightly above the Pro plan's 120s timeout
    )

    if response.status_code == 400:
        body = response.json()
        raise LatexCompilationError(
            message="LaTeX compilation failed",
            log=body.get("log", body.get("error", "")),
        )

    response.raise_for_status()
    return response.content


# Usage
try:
    pdf = compile_latex(my_template)
except LatexCompilationError as e:
    print(f"LaTeX error: {e.first_error()}")
    print(f"Full log:\n{e.log}")
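
The first_error method returns only the "!" line. TeX usually follows that with a line starting with `l.<number>`, naming the source line where it stopped. Here is a minimal sketch that pairs the two, assuming the log follows this standard layout. `TexError` and `extract_errors` are illustrative names, not part of the FormaTeX API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# TeX reports the offending source line as "l.<number> <context>".
LINE_RE = re.compile(r"^l\.(\d+)")


@dataclass
class TexError:
    message: str                # the "! ..." line without the leading "!"
    line: Optional[int] = None  # source line number, if the log gave one


def extract_errors(log: str) -> List[TexError]:
    """Collect every "!" error in a TeX log with its source line number."""
    errors: List[TexError] = []
    for raw in log.splitlines():
        if raw.startswith("!"):
            errors.append(TexError(message=raw[1:].strip()))
        elif errors and errors[-1].line is None:
            # Attach the first "l.<n>" line that follows the latest error.
            match = LINE_RE.match(raw)
            if match:
                errors[-1].line = int(match.group(1))
    return errors
```

Calling `extract_errors(e.log)` inside the except block would then give a message and line number for each error, which is easier to show to an end user than the raw log.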

Set your requests timeout slightly above the API's plan timeout. This prevents the HTTP connection from hanging indefinitely if the API is slow to respond, while still allowing the full compilation window to complete.
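
requests also accepts a (connect, read) tuple for timeout, which lets an unreachable host fail quickly while still giving compilation its full window. A small sketch follows. The 120-second figure is the Pro plan timeout mentioned above, while the 5-second connect timeout and 10-second margin are assumptions to tune for your setup.

```python
from typing import Tuple


def client_timeout(
    plan_timeout_s: float,
    margin_s: float = 10.0,
    connect_s: float = 5.0,
) -> Tuple[float, float]:
    """Build a (connect, read) timeout tuple for requests.

    connect_s bounds how long to wait for the connection to open;
    the read timeout is the plan's compile limit plus a small margin.
    """
    if plan_timeout_s <= 0 or margin_s < 0 or connect_s <= 0:
        raise ValueError("timeouts must be positive")
    return (connect_s, plan_timeout_s + margin_s)


# Intended use: requests.post(..., timeout=client_timeout(120))
PRO_TIMEOUT = client_timeout(120)
```

Note that the read timeout in requests limits the wait between bytes from the server, not the total duration of the response.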

Async with httpx

For async Python applications (FastAPI, async Django, async Flask), use httpx:

python

import os
import httpx

async def compile_latex_async(source: str, engine: str = "pdflatex") -> bytes:
    async with httpx.AsyncClient(timeout=130) as client:
        response = await client.post(
            "https://api.formatex.io/api/v1/compile",
            headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
            json={"content": source, "engine": engine},
        )

        if response.status_code == 400:
            body = response.json()
            raise RuntimeError(body.get("log") or body.get("error"))

        response.raise_for_status()
        return response.content


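# FastAPI endpoint
from fastapi import Body, FastAPI
from fastapi.responses import Response

app = FastAPI()

# Body(..., embed=True) reads {"latex_source": "..."} from the JSON request body.
# A bare `str` parameter would be treated as a query parameter instead.
@app.post("/generate-pdf")
async def generate_pdf(latex_source: str = Body(..., embed=True)):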
    pdf_bytes = await compile_latex_async(latex_source)
    return Response(
        content=pdf_bytes,
        media_type="application/pdf",
        headers={"Content-Disposition": "attachment; filename=document.pdf"},
    )
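
When an async service generates many PDFs at once, it can help to cap how many compile requests are in flight. Here is a minimal sketch using asyncio.Semaphore. `compile_fn` stands in for compile_latex_async above, and the default limit of 4 is an arbitrary assumption, not a documented API limit.

```python
import asyncio
from typing import Awaitable, Callable, List


async def compile_many(
    sources: List[str],
    compile_fn: Callable[[str], Awaitable[bytes]],
    max_concurrent: int = 4,
) -> List[bytes]:
    """Compile all sources, at most max_concurrent at a time.

    Results come back in the same order as sources.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def compile_one(source: str) -> bytes:
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await compile_fn(source)

    return list(await asyncio.gather(*(compile_one(s) for s in sources)))
```

With the default gather behavior, the first exception propagates to the caller; pass `return_exceptions=True` to gather if you would rather collect failures per document.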

Storing PDFs

Once you have the PDF bytes, storing them is straightforward:

python

import asyncio
import io
import os

import boto3

def store_pdf_s3(pdf_bytes: bytes, key: str) -> str:
    """Upload PDF to S3 and return the object URL."""
    s3 = boto3.client("s3")
    bucket = os.environ["PDF_BUCKET"]

    s3.upload_fileobj(
        io.BytesIO(pdf_bytes),
        bucket,
        key,
        ExtraArgs={"ContentType": "application/pdf"},
    )

    return f"https://{bucket}.s3.amazonaws.com/{key}"


# Complete flow: generate → store → return URL
async def generate_and_store(invoice_data: dict) -> str:
    # build_invoice_latex is your own function that renders the LaTeX source
    latex = build_invoice_latex(invoice_data)
    pdf_bytes = await compile_latex_async(latex)
    # boto3 is blocking; run the upload in a worker thread (Python 3.9+)
    # so the event loop stays free
    url = await asyncio.to_thread(
        store_pdf_s3, pdf_bytes, f"invoices/{invoice_data['id']}.pdf"
    )
    return url

FormaTeX does not store your PDFs — every compilation is ephemeral. The PDF is streamed directly in the HTTP response body and deleted from the worker immediately. You are responsible for storing the bytes wherever you need them.
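
If local disk is the destination rather than S3, a sketch like the following writes the bytes to a temporary file and renames it into place, so readers never see a half-written PDF. `store_pdf_local` and the directory layout are assumptions for illustration.

```python
import os
import tempfile
from pathlib import Path


def store_pdf_local(pdf_bytes: bytes, directory: str, name: str) -> Path:
    """Write pdf_bytes to directory/name atomically and return the path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name

    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(pdf_bytes)
        os.replace(tmp_path, target)
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

The temp file lives in the target directory on purpose: os.replace is only atomic when source and destination are on the same filesystem.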

Choosing the Engine from Python

The choice of engine matters for output quality and feature support. For a full breakdown, see the complete guide to LaTeX engines.

python

def compile_document(
    source: str,
    *,
    has_bibliography: bool = False,
    needs_custom_fonts: bool = False,
) -> bytes:
    if has_bibliography:
        engine = "latexmk"
    elif needs_custom_fonts:
        engine = "xelatex"
    else:
        engine = "pdflatex"

    return compile_latex(source, engine=engine)
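
Instead of passing the flags by hand, they could be inferred from the source. This is a rough heuristic sketch, assuming that loading `fontspec` signals custom fonts (the package requires XeLaTeX or LuaLaTeX) and that `\bibliography` or `\addbibresource` signals a bibliography. `guess_engine` is a hypothetical helper, and it will miss packages loaded indirectly through a custom class.

```python
import re

# Strip "%" comments (but not escaped "\%") so commented-out packages are ignored.
COMMENT_RE = re.compile(r"(?<!\\)%.*")
FONTSPEC_RE = re.compile(r"\\usepackage(\[[^\]]*\])?\{[^}]*\bfontspec\b[^}]*\}")
# The trailing \b keeps \bibliographystyle from matching.
BIB_RE = re.compile(r"\\(bibliography|addbibresource)\b")


def guess_engine(source: str) -> str:
    """Pick an engine name using the same priority as compile_document."""
    text = COMMENT_RE.sub("", source)
    if BIB_RE.search(text):
        return "latexmk"
    if FONTSPEC_RE.search(text):
        return "xelatex"
    return "pdflatex"
```

Treat the result as a default that callers can override, since a heuristic like this cannot see everything a template pulls in.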

Beyond Sync Compilation

The examples above use the synchronous POST /compile endpoint. FormaTeX also offers:

Smart Compile (POST /compile/smart): AI-powered error detection and auto-fix. If your LaTeX has errors, the AI pipeline fixes them automatically. See the Smart Compile guide.
Async Compilation (POST /compile/async): submit a job, get a job ID, then poll for completion or receive a webhook. Ideal for long documents and batch PDF generation. See the Async guide.
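
The polling side of async compilation can be sketched generically. This post does not document the status endpoint or its response shape, so everything API-specific here is a placeholder: `fetch_status` is a hypothetical callable you would implement against the real API documentation, and the "completed" and "failed" status strings are assumptions.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    fetch_status: Callable[[], Dict],
    *,
    timeout_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status() with exponential backoff until the job finishes.

    fetch_status is assumed to return a dict with a "status" key whose
    value becomes "completed" or "failed" when the job is done
    (hypothetical field names, check the API docs).
    """
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        job = fetch_status()
        if job.get("status") == "completed":
            return job
        if job.get("status") == "failed":
            raise RuntimeError(job.get("error", "compilation job failed"))
        if time.monotonic() + delay > deadline:
            raise TimeoutError("job did not finish in time")
        sleep(delay)
        # Double the wait each round, capped at max_delay_s.
        delay = min(delay * 2, max_delay_s)
```

The `sleep` parameter exists so the loop can be tested without real waiting. A webhook avoids polling entirely when your backend can accept inbound requests.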

Get Started

Sign up for free — 15 compilations/month, no card required
API documentation — full endpoint schema and error format
Dashboard — API key management and usage tracking

LaTeX PDF Generation in Node.js and TypeScript — The same REST API pattern implemented in TypeScript with full error handling and Next.js integration
The Complete Guide to LaTeX Engines — When to choose pdfLaTeX, XeLaTeX, LuaLaTeX, or latexmk for your Python-generated documents
Why TeX Live Docker Images Are 4 GB — Why bundling TeX Live into your Python container is costly and how the API alternative avoids it
LaTeX API Error Codes: Complete Guide — Every 400, 422, and 429 error code you may encounter when calling the compile endpoint from Python
Async LaTeX Compilation and Webhooks — How to submit long-running LaTeX jobs asynchronously and receive results via webhook in your Python backend

\end{article}


\related{posts}