\begin{article}
LaTeX PDF Generation in Python: From subprocess to REST API
The subprocess approach to LaTeX in Python is painful — version conflicts, temp file management, missing packages. Here is the clean REST API alternative with full code examples.

Python is one of the most common languages for generating documents programmatically — reports, invoices, certificates, scientific output. LaTeX produces the highest-quality PDFs, but the traditional Python approach of calling subprocess.run(["pdflatex", ...]) is fragile and hard to deploy. This post shows both approaches so you can see the difference.
The subprocess Approach (And Its Pain)
The naive Python approach calls the system pdflatex binary:
    import subprocess
    import tempfile
    import os

    def compile_latex_subprocess(latex_source: str) -> bytes:
        with tempfile.TemporaryDirectory() as tmpdir:
            tex_path = os.path.join(tmpdir, "document.tex")
            pdf_path = os.path.join(tmpdir, "document.pdf")
            with open(tex_path, "w", encoding="utf-8") as f:
                f.write(latex_source)
            result = subprocess.run(
                ["pdflatex", "-interaction=nonstopmode", "-output-directory", tmpdir, tex_path],
                capture_output=True,
                text=True,
                timeout=60,
            )
            if result.returncode != 0:
                # Parse the log for the actual error
                raise RuntimeError(f"pdflatex failed:\n{result.stdout[-2000:]}")
            if not os.path.exists(pdf_path):
                raise RuntimeError("pdflatex produced no output")
            with open(pdf_path, "rb") as f:
                return f.read()

This works locally if pdflatex is installed. It breaks in production because:
- pdflatex must be installed on the deployment server
- The installed TeX Live version must include all packages your templates use
- Temp file cleanup can fail and fill disk
- A single pdflatex run does not resolve bibliographies or cross-references; those need multiple passes, which the code above does not perform
- Timeouts are tricky to enforce at the subprocess level
- Docker images that bundle TeX Live can balloon to 4 GB
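The multi-pass and timeout points above can be made concrete. The following is a minimal sketch, assuming latexmk is installed alongside pdflatex; the helper names `latexmk_command` and `compile_multipass` are illustrative and not part of the original example.

```python
import os
import subprocess
import tempfile


def latexmk_command(tex_path: str, out_dir: str) -> list[str]:
    """Build a latexmk invocation that reruns pdflatex until references settle."""
    return [
        "latexmk",
        "-pdf",                      # produce the PDF via pdflatex
        "-interaction=nonstopmode",  # never stop to prompt on errors
        f"-outdir={out_dir}",
        tex_path,
    ]


def compile_multipass(latex_source: str, timeout_s: float = 120) -> bytes:
    """Compile with latexmk (assumed to be on PATH) and return the PDF bytes."""
    with tempfile.TemporaryDirectory() as tmpdir:
        tex_path = os.path.join(tmpdir, "document.tex")
        with open(tex_path, "w", encoding="utf-8") as f:
            f.write(latex_source)
        try:
            result = subprocess.run(
                latexmk_command(tex_path, tmpdir),
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired as exc:
            # subprocess.run kills the direct child on timeout; helper
            # processes that child spawned may be left running.
            raise RuntimeError(f"latexmk timed out after {timeout_s}s") from exc
        if result.returncode != 0:
            raise RuntimeError(f"latexmk failed:\n{result.stdout[-2000:]}")
        with open(os.path.join(tmpdir, "document.pdf"), "rb") as f:
            return f.read()
```

Even in this form, the server still needs a TeX distribution that ships every package the templates use, which is the deployment cost the rest of this post avoids.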
The REST API Approach
Replace the subprocess with an HTTP call:
    import os
    import requests

    def compile_latex(latex_source: str, engine: str = "pdflatex") -> bytes:
        response = requests.post(
            "https://api.formatex.io/api/v1/compile",
            headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
            json={"content": latex_source, "engine": engine},
        )
        if not response.ok:
            error = response.json()
            raise RuntimeError(error.get("log") or error.get("error") or "Unknown error")
        return response.content

That is the entire integration. No subprocess management, no temp files, no system dependencies.
    # Save the result
    pdf_bytes = compile_latex(r"""
    \documentclass{article}
    \begin{document}
    Hello from Python!
    \end{document}
    """)
    with open("output.pdf", "wb") as f:
        f.write(pdf_bytes)

Error Handling
LaTeX errors return HTTP 400 with a JSON body containing the TeX log. Parse it to surface useful errors:
    import os
    import requests

    class LatexCompilationError(Exception):
        def __init__(self, message: str, log: str):
            super().__init__(message)
            self.log = log

        def first_error(self) -> str:
            """Extract the first error line from the TeX log."""
            for line in self.log.splitlines():
                if line.startswith("!"):
                    return line
            return self.log[:200]

    def compile_latex(source: str, engine: str = "pdflatex") -> bytes:
        response = requests.post(
            "https://api.formatex.io/api/v1/compile",
            headers={
                "X-API-Key": os.environ["FORMATEX_KEY"],
                "Content-Type": "application/json",
            },
            json={"content": source, "engine": engine},
            timeout=130,  # slightly above the Pro plan's 120s timeout
        )
        if response.status_code == 400:
            body = response.json()
            raise LatexCompilationError(
                message="LaTeX compilation failed",
                log=body.get("log", body.get("error", "")),
            )
        response.raise_for_status()
        return response.content

    # Usage
    try:
        pdf = compile_latex(my_template)
    except LatexCompilationError as e:
        print(f"LaTeX error: {e.first_error()}")
        print(f"Full log:\n{e.log}")

Set your requests timeout slightly above the API's plan timeout. This prevents the HTTP connection from hanging indefinitely if the API is slow to respond, while still allowing the full compilation window to complete.
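The timeout advice above can be sketched as a small helper. This is a minimal illustration, assuming the 120-second plan limit mentioned in the code comment; `request_timeout`, its 10-second margin, and its 5-second connect budget are hypothetical choices, not FormaTeX recommendations. requests accepts a `(connect, read)` tuple for `timeout`, so connection failures can surface quickly while the read side still waits out the full compilation window.

```python
def request_timeout(
    plan_timeout_s: float,
    margin_s: float = 10.0,
    connect_s: float = 5.0,
) -> tuple[float, float]:
    """Return a (connect, read) timeout pair for requests.

    The read timeout sits slightly above the plan's compile timeout so the
    client does not give up before the API does.
    """
    if plan_timeout_s <= 0 or margin_s < 0 or connect_s <= 0:
        raise ValueError("timeouts must be positive (margin may be zero)")
    return (connect_s, plan_timeout_s + margin_s)


# Example: derive the client timeout from a 120 s plan limit.
print(request_timeout(120))
```

Pass the result as `timeout=request_timeout(120)` to `requests.post`, and catch `requests.exceptions.Timeout` if a slow compile should be reported separately from a LaTeX error.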
Async with httpx
For async Python applications (FastAPI, async Django, async Flask), use httpx:
    import os
    import httpx

    async def compile_latex_async(source: str, engine: str = "pdflatex") -> bytes:
        async with httpx.AsyncClient(timeout=130) as client:
            response = await client.post(
                "https://api.formatex.io/api/v1/compile",
                headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
                json={"content": source, "engine": engine},
            )
        if response.status_code == 400:
            body = response.json()
            raise RuntimeError(body.get("log") or body.get("error"))
        response.raise_for_status()
        return response.content

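    # FastAPI endpoint
    from fastapi import Body, FastAPI
    from fastapi.responses import Response

    app = FastAPI()

    # embed=True reads the source from a JSON body of the form
    # {"latex_source": "..."}; a bare `str` parameter would be parsed
    # as a query parameter instead.
    @app.post("/generate-pdf")
    async def generate_pdf(latex_source: str = Body(..., embed=True)):
        pdf_bytes = await compile_latex_async(latex_source)
        return Response(
            content=pdf_bytes,
            media_type="application/pdf",
            headers={"Content-Disposition": "attachment; filename=document.pdf"},
        )

Storing PDFs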
Once you have the PDF bytes, storing them is straightforward:
    import io
    import os

    import boto3

    def store_pdf_s3(pdf_bytes: bytes, key: str) -> str:
        """Upload PDF to S3 and return the object URL."""
        s3 = boto3.client("s3")
        bucket = os.environ["PDF_BUCKET"]
        s3.upload_fileobj(
            io.BytesIO(pdf_bytes),
            bucket,
            key,
            ExtraArgs={"ContentType": "application/pdf"},
        )
        return f"https://{bucket}.s3.amazonaws.com/{key}"

    # Complete flow: generate → store → return URL
    # (build_invoice_latex stands for your own template function)
    async def generate_and_store(invoice_data: dict) -> str:
        latex = build_invoice_latex(invoice_data)
        pdf_bytes = await compile_latex_async(latex)
        url = store_pdf_s3(pdf_bytes, f"invoices/{invoice_data['id']}.pdf")
        return url

FormaTeX does not store your PDFs; every compilation is ephemeral. The PDF is streamed directly in the HTTP response body and deleted from the worker immediately. You are responsible for storing the bytes wherever you need them.
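One caveat with the complete flow above: boto3 calls are blocking, so calling an uploader directly inside an `async` function stalls the event loop for the duration of the upload. The following is a minimal sketch of one way around that, using `asyncio.to_thread` (Python 3.9+). The names `store_off_loop` and `fake_upload` are hypothetical; `fake_upload` stands in for a real blocking uploader such as `store_pdf_s3`.

```python
import asyncio
from typing import Callable


async def store_off_loop(
    upload: Callable[[bytes, str], str],
    pdf_bytes: bytes,
    key: str,
) -> str:
    """Run a blocking upload function in a worker thread and return its URL."""
    return await asyncio.to_thread(upload, pdf_bytes, key)


def fake_upload(pdf_bytes: bytes, key: str) -> str:
    """Stand-in for a blocking uploader; returns a made-up URL."""
    return f"memory://{key}"


if __name__ == "__main__":
    url = asyncio.run(store_off_loop(fake_upload, b"%PDF-1.7", "invoices/42.pdf"))
    print(url)
```

Inside `generate_and_store`, the same idea would read `url = await store_off_loop(store_pdf_s3, pdf_bytes, key)`, which keeps other requests moving while the upload runs.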
Choosing the Engine from Python
    def compile_document(
        source: str,
        *,
        has_bibliography: bool = False,
        needs_custom_fonts: bool = False,
    ) -> bytes:
        if has_bibliography:
            engine = "latexmk"
        elif needs_custom_fonts:
            engine = "xelatex"
        else:
            engine = "pdflatex"
        return compile_latex(source, engine=engine)

Beyond Sync Compilation
The examples above use the synchronous POST /compile endpoint. FormaTeX also offers:
- Smart Compile (POST /compile/smart): AI-powered error detection and auto-fix. If your LaTeX has errors, the AI pipeline fixes them automatically. See Smart Compile guide.
- Async Compilation (POST /compile/async): submit a job, get a job ID, poll for completion or receive a webhook. Ideal for long documents and batch processing. See Async guide.
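The polling half of the async flow can be sketched generically. This is an illustration only: the `fetch_status` callable, the `"done"` and `"failed"` status strings, and the backoff numbers are assumptions made for the example, and the real job schema is whatever the Async guide documents.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    *,
    timeout_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() with exponential backoff until a terminal status.

    The status strings "done" and "failed" are hypothetical placeholders.
    """
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        status = fetch_status()
        if status in ("done", "failed"):
            return status
        if time.monotonic() + delay > deadline:
            raise TimeoutError("job did not finish before the polling deadline")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s


# Simulated job that finishes on the third poll (no real waiting).
_statuses = iter(["queued", "running", "done"])
print(poll_until_done(lambda: next(_statuses), sleep=lambda _s: None))
```

In practice `fetch_status` would wrap a request to the job-status endpoint the API documents; where a webhook is available, it removes the need to poll at all.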
Get Started
- Sign up for free — 15 compilations/month, no card required
- API documentation — full endpoint schema and error format
- Dashboard — API key management and usage tracking
\end{article}
\related{posts}




