FormaTeX

\begin{article}

Automating Academic Paper PDFs with the LaTeX API

Academic papers, theses, and scientific reports are commonly written in LaTeX. Here is how to automate PDF generation for academic documents using the FormaTeX API in your research pipeline.


Most major academic publishers accept or prefer LaTeX submissions. arXiv prefers TeX/LaTeX source. IEEE, ACM, Springer, and Elsevier distribute their journal templates as LaTeX classes. If you work in research, data science, or any scientific field, automating LaTeX PDF generation saves real time, and the FormaTeX API makes it accessible from any programming environment.

The Academic PDF Pipeline

A typical academic document generation workflow involves:

  1. Research and data collection
  2. Analysis (Python, R, Julia)
  3. Figure generation (matplotlib, ggplot2, pgfplots)
  4. LaTeX document composition
  5. Multi-pass compilation (for bibliography and cross-references)
  6. Review, revision, and re-compilation
  7. Submission

Steps 4–6 are where FormaTeX fits in. You write the LaTeX, we compile it.

Common Document Types and Templates

Conference Papers (IEEE/ACM)

latex
\documentclass[conference]{IEEEtran}
% For ACM venues, use acmart instead (its author markup differs):
% \documentclass{acmart}

\title{Efficient Compilation of Scientific Documents via REST API}
\author{
  \IEEEauthorblockN{Jane Smith}
  \IEEEauthorblockA{Department of Computer Science\\
    MIT, Cambridge MA\\
    [email protected]}
}

\begin{document}
\maketitle
\begin{abstract}
We present FormaTeX, a RESTful API for on-demand LaTeX compilation...
\end{abstract}
\end{document}

Theses (memoir or KOMA-Script)

latex
\documentclass[12pt,a4paper]{memoir}
\usepackage{amsmath,amssymb,amsthm}
\usepackage[backend=biber,style=authoryear]{biblatex}
\addbibresource{references.bib}

arXiv Preprints

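arXiv compiles submissions with pdflatex by default. Support for other engines such as XeLaTeX has varied, so check arXiv's current submission guidelines before relying on them. Most preprints use the standard article class or a custom style file such as arxiv.sty: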

latex
\documentclass{article}
\usepackage{arxiv}
\usepackage{amsmath,amssymb}
\usepackage[numbers]{natbib}

Engine Selection for Academics

| Document type | Engine | Reason |
| --- | --- | --- |
| IEEE/ACM conference | pdflatex | Required by most templates |
| arXiv preprint | pdflatex | Default arXiv engine |
| Custom fonts, Unicode | xelatex | Font flexibility |
| Documents with bibliography | latexmk | Handles multi-pass automatically |
| Programmatic content | lualatex | Lua scripting |

Use latexmk for any document with \bibliography{} or \addbibresource{}. It runs biber/bibtex automatically and repeats pdflatex until all references are stable. Without it, you get [?] where citations should be.
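
The selection rules above can be approximated in code. The following is a minimal sketch, not part of the FormaTeX API: the function name `choose_engine` and the regular-expression heuristics are assumptions for illustration, and only the engine names come from the table. It does not skip commented-out lines or handle comma-separated package lists, so treat its answer as a default that a human can override.

```python
import re

# Assumption: engine names mirror the table above; the detection heuristics
# are illustrative only and are not provided by FormaTeX.
BIB_PATTERN = re.compile(r"\\(bibliography|addbibresource)\s*\{")
FONTSPEC_PATTERN = re.compile(r"\\usepackage(\[[^\]]*\])?\{fontspec\}")
LUA_PATTERN = re.compile(r"\\directlua|\\usepackage(\[[^\]]*\])?\{luacode\}")


def choose_engine(source: str) -> str:
    """Pick an engine name for a LaTeX source string using the table's rules."""
    if BIB_PATTERN.search(source):
        return "latexmk"   # bibliography needs multi-pass compilation
    if LUA_PATTERN.search(source):
        return "lualatex"  # embedded Lua code
    if FONTSPEC_PATTERN.search(source):
        return "xelatex"   # system fonts / Unicode via fontspec
    return "pdflatex"      # default for most templates
```

The bibliography check runs first because a missing multi-pass build is the failure that is hardest to spot in the output.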

CI/CD for Researchers

Academic papers benefit from version-controlled, automatically compiled PDFs. The workflow below recompiles the paper on every push that touches its .tex, .bib, or figure files, and on every pull request:

yaml
# .github/workflows/compile-paper.yml
name: Compile Paper

on:
  push:
    paths:
      - "paper/**.tex"
      - "paper/**.bib"
      - "paper/figures/**"
  pull_request:

jobs:
  compile:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Compile paper with bibliography
        env:
          FORMATEX_KEY: ${{ secrets.FORMATEX_KEY }}
        run: |
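          # Build the JSON body with jq and stream it to curl on stdin.
          # --fail makes the step fail on an HTTP error instead of saving
          # the error body as paper.pdf.
          jq -Rs '{content: ., engine: "latexmk"}' paper/main.tex \
            | curl -sS --fail -X POST https://api.formatex.io/api/v1/compile \
                -H "X-API-Key: $FORMATEX_KEY" \
                -H "Content-Type: application/json" \
                --data-binary @- \
                --output paper/paper.pdf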

      - name: Upload compiled paper
        uses: actions/upload-artifact@v4
        with:
          name: paper-pdf
          path: paper/paper.pdf
          retention-days: 90

This gives every collaborator a downloadable PDF from every commit, without installing TeX Live anywhere.
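
As an extra safeguard, a CI step could confirm that the downloaded file really is a PDF before it is uploaded as an artifact, since an API error body saved under a .pdf name would otherwise go unnoticed. A minimal sketch follows; the helper names are hypothetical, and the only fact it relies on is that PDF files begin with the bytes `%PDF-`.

```python
from pathlib import Path


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes start with the PDF magic header."""
    return data.startswith(b"%PDF-")


def check_artifact(path: str) -> None:
    """Exit with an error if the file at `path` is not a PDF."""
    data = Path(path).read_bytes()
    if not looks_like_pdf(data):
        # Show the start of the file, which is often a JSON error message
        preview = data[:200].decode("utf-8", errors="replace")
        raise SystemExit(f"{path} is not a PDF; first bytes: {preview!r}")
```

Calling `check_artifact("paper/paper.pdf")` from a short script between the compile and upload steps would make the job fail loudly instead of publishing a broken artifact.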

Python Integration for Data-Driven Papers

Research papers often combine Python-generated figures and LaTeX:

python
import os

import matplotlib.pyplot as plt
import requests

def generate_paper_pdf(results: dict) -> bytes:
    """Generate a paper PDF with embedded figures from Python data."""

    # Generate figure
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(results["x"], results["y"])
    ax.set_xlabel("Input size $n$")
    ax.set_ylabel("Time (ms)")
    ax.set_title("Compilation time vs. document size")

    # Save as PDF (vector — LaTeX handles it better than PNG)
    fig.savefig("/tmp/figure.pdf", bbox_inches="tight")
    plt.close(fig)

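    # The figure file is not attached to this single-file request, so the
    # document below has no figure environment yet: \ref{fig:perf} and
    # \cite{smith2026} stay unresolved until the figure and a .bib file are supplied.
    # In a real setup, you would upload the figure or use pgfplots directly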
    latex = r"""
\documentclass{article}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage[backend=biber]{biblatex}

\title{Performance Analysis of the FormaTeX API}
\author{Research Team}
\date{\today}

\begin{document}
\maketitle

\begin{abstract}
We analyze the compilation performance of the FormaTeX REST API
across document types, engines, and document sizes.
\end{abstract}

\section{Introduction}
LaTeX compilation via REST API offers significant advantages
over self-hosted solutions~\cite{smith2026}.

\section{Results}
As shown in Figure~\ref{fig:perf}, compilation time scales
linearly with document size for pdflatex.

\section{Conclusion}
The FormaTeX API provides predictable, low-latency LaTeX
compilation suitable for production SaaS applications.

\printbibliography

\end{document}
"""

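    response = requests.post(
        "https://api.formatex.io/api/v1/compile",
        headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
        json={"content": latex, "engine": "latexmk"},
        timeout=120,  # avoid hanging forever on a stalled compile
    )

    if not response.ok:
        # Error responses are expected to be JSON with a "log" or "error" field;
        # fall back to the raw body if the response is not valid JSON.
        try:
            error = response.json()
            message = error.get("log", error.get("error"))
        except ValueError:
            message = response.text
        raise RuntimeError(message)

    # On success the response body is the PDF itself
    return response.content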

For figures, prefer vector formats over PNG. PDF works directly with pdflatex, xelatex, and lualatex, while EPS typically needs a conversion step under pdflatex. Vector figures stay sharp at any scale, whereas raster figures can look blurry in print. matplotlib writes PDF directly with fig.savefig("figure.pdf").
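
Once a vector figure exists, the LaTeX source still needs a figure environment that includes it and defines the label used by `\ref`. The helper below is a sketch of one way to generate that snippet from Python; `figure_block` is a hypothetical name, the document must load graphicx, and the figure file must be made available to the compiler alongside the .tex source. The caption is inserted as-is, so any special LaTeX characters in it must already be escaped.

```python
def figure_block(
    filename: str,
    caption: str,
    label: str,
    width: str = r"0.8\linewidth",
) -> str:
    """Return a LaTeX figure environment that includes `filename`."""
    return "\n".join([
        r"\begin{figure}[t]",
        r"  \centering",
        rf"  \includegraphics[width={width}]{{{filename}}}",
        rf"  \caption{{{caption}}}",
        # \label goes after \caption so that \ref picks up the figure number
        rf"  \label{{{label}}}",
        r"\end{figure}",
    ])
```

For example, `figure_block("figure.pdf", "Compilation time vs. document size", "fig:perf")` produces a block that can be spliced into the Results section of a template string.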

Handling Bibliography with the API

Documents with a bibliography need latexmk (or an equivalent multi-pass build) to resolve citations:

python
import os

import requests

# Single-file request: embed the bibliography inline
latex_with_bibliography = r"""
\documentclass{article}
\usepackage[backend=bibtex]{biblatex}

% For single-file API calls, embed the .bib content directly
\begin{filecontents*}{refs.bib}
@book{knuth1984tex,
  author    = {Knuth, Donald E.},
  title     = {The {\TeX}book},
  year      = {1984},
  publisher = {Addison-Wesley}
}
\end{filecontents*}

\addbibresource{refs.bib}

\begin{document}
The TeX typesetting system was created by \textcite{knuth1984tex}.
\printbibliography
\end{document}
"""

response = requests.post(
    "https://api.formatex.io/api/v1/compile",
    headers={"X-API-Key": os.environ["FORMATEX_KEY"]},
    json={"content": latex_with_bibliography, "engine": "latexmk"},
)

The \begin{filecontents*} trick embeds the .bib file inline within the document, making the entire compilation self-contained in a single API call.
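
If the .bib content lives in a separate file or is generated by a reference manager, the embedding can be done programmatically before the API call. The following sketch assumes a hypothetical helper named `embed_bib`; it places the filecontents* block in the preamble, immediately before `\begin{document}`, and the document is still expected to contain a matching `\addbibresource` line.

```python
def embed_bib(tex_source: str, bib_text: str, bib_name: str = "refs.bib") -> str:
    """Insert `bib_text` as a filecontents* block right before \\begin{document}."""
    marker = r"\begin{document}"
    if marker not in tex_source:
        raise ValueError("source has no \\begin{document}")
    block = (
        rf"\begin{{filecontents*}}{{{bib_name}}}" + "\n"
        + bib_text.strip() + "\n"
        + r"\end{filecontents*}" + "\n\n"
    )
    # Split only on the first occurrence so the document body is untouched
    head, tail = tex_source.split(marker, 1)
    return head + block + marker + tail
```

Keeping the bibliography in its own file and embedding it at request time avoids maintaining two copies of the references.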

File Uploads for Multi-File Documents

For more complex papers with separate .bib files, images, and style files, FormaTeX supports file uploads via multipart form data or base64-encoded files. For larger projects this is cleaner than the filecontents* approach: upload your .bib, figures, and custom .sty files alongside the main .tex source.
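
The base64 route might look roughly like the sketch below. This article does not specify the upload schema, so the `files`, `name`, and `data` keys are purely hypothetical placeholders, as is the helper name `build_payload`; only the `content` and `engine` fields appear in the examples above. Consult the FormaTeX API reference for the real field names before using anything like this.

```python
import base64


def build_payload(
    main_tex: str,
    attachments: dict[str, bytes],
    engine: str = "latexmk",
) -> dict:
    """Build a JSON-serializable request body with base64-encoded attachments.

    NOTE: the "files", "name", and "data" keys are hypothetical; check the
    FormaTeX API reference for the actual upload schema.
    """
    files = [
        # base64 keeps binary files (PDF figures, fonts) safe inside JSON
        {"name": name, "data": base64.b64encode(blob).decode("ascii")}
        for name, blob in sorted(attachments.items())
    ]
    return {"content": main_tex, "engine": engine, "files": files}
```

The resulting dictionary could be passed as the `json=` argument of `requests.post`, in the same way as the single-file examples.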


\end{article}
