FormaTeX

\documentclass{article}

LaTeX Research Paper Template

Production-ready LaTeX templates for journal articles and conference papers. Covers standard article, IEEEtran, and REVTeX4-2 — with figures, tables, equations, and bibliography pre-configured.

\section{Choosing a Document Class}

article vs IEEEtran vs REVTeX

The document class determines your paper's layout, citation format, and submission requirements. Match it to your target journal or conference.

Feature | article | IEEEtran | REVTeX4-2
Document class | \documentclass{article} | \documentclass[journal]{IEEEtran} | \documentclass[aps,prl]{revtex4-2}
Column layout | Single column | Two column | Two column
Font size | 10–12pt | 10pt (default) | 10pt (default)
Abstract env | \begin{abstract} | \begin{abstract} | \begin{abstract}
Citation style | Any (biblatex) | \cite → [1] | \cite → [1] or author–year
Best for | Preprints, arXiv, general journals | IEEE Transactions and IEEE conferences whose author kit is based on IEEEtran | Physical Review, Applied Physics
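
As a rough illustration of the comparison above, the skeleton below changes only the \documentclass line between targets. The title and author are placeholders, and the class options are the ones listed in the table; a real submission should start from the venue's own author kit.

```latex
% Minimal skeleton: uncomment exactly one \documentclass line.
\documentclass[11pt]{article}
% \documentclass[journal]{IEEEtran}
% \documentclass[aps,prl]{revtex4-2}

\begin{document}

\title{Placeholder Title}   % hypothetical title
\author{A. Author}          % hypothetical author

\maketitle

% With revtex4-2, move the abstract above \maketitle;
% article and IEEEtran accept it in this position.
\begin{abstract}
One-paragraph summary of the paper.
\end{abstract}

\section{Introduction}
Body text.

\end{document}
```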

\section{Paper Structure}

Full paper structure walkthrough

Every section of a research paper has its own LaTeX idioms. Here's what matters in each.

Abstract

Wrap in \begin{abstract}…\end{abstract}. Keep to 150–250 words. Avoid citations and equations — most journals process abstracts separately for their databases.

Introduction

Use \section{Introduction}. State the problem, summarise prior work with \cite, and list your contributions in an enumerate environment. Keep subsections shallow in short papers.

Methods

Present equations using the equation and align environments from amsmath. Label each numbered equation with \label and refer to it with \cref (from the cleveref package). Define notation in a paragraph before the equations that use it.
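
A small sketch of the pattern, assuming amsmath and cleveref are loaded in the preamble. The symbols and label names are illustrative only and are not taken from the template.

```latex
% Notation first, then the equations that use it.
Let $\mathbf{X}$ denote the input embeddings and $\mathbf{W}_Q$,
$\mathbf{W}_K$ the learned projection matrices.
\begin{align}
  \mathbf{Q} &= \mathbf{X}\mathbf{W}_Q, \label{eq:query} \\
  \mathbf{K} &= \mathbf{X}\mathbf{W}_K. \label{eq:key}
\end{align}
% \Cref capitalises the reference name at the start of a sentence.
\Cref{eq:query,eq:key} define the query and key projections.
```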

Tables

Use booktabs for publication-quality rules: \toprule, \midrule, \bottomrule. Never use vertical lines. Place tables with [H] (float package) or [t] for top-of-page.

Figures

Include with \includegraphics and caption. Use subcaption for panels (a), (b), (c). Export vector graphics as PDF for lossless scaling. Name files descriptively.

Bibliography

Use biblatex with \printbibliography or the journal's required BST file with \bibliography. Export references from Zotero or Google Scholar as BibTeX.
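
The two workflows look roughly like this. The file name references.bib matches the template on this page; the IEEEtran style in option B is only an example of a journal-supplied BST file.

```latex
% Option A: biblatex + Biber
% (compile: pdflatex, biber, pdflatex, pdflatex)
\usepackage[backend=biber, style=numeric-comp]{biblatex}
\addbibresource{references.bib}   % in the preamble, with the .bib extension
% ...
\printbibliography                % where the reference list should appear

% Option B: classic BibTeX with a journal-supplied .bst
% (compile: pdflatex, bibtex, pdflatex, pdflatex)
% \bibliographystyle{IEEEtran}    % loads IEEEtran.bst
% \bibliography{references}       % no .bib extension here
```

Use one option or the other; loading biblatex and calling \bibliography in the same document leads to errors.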

\section{Figures and Tables}

Publication-quality figures & tables

Copy these patterns for single figures, subfigure panels, and booktabs tables.

Figures — single and subfigure panels
% Single figure
\begin{figure}[t]
  \centering
  \includegraphics[width=0.9\linewidth]{figures/architecture.pdf}
  \caption{Overview of the proposed model architecture. The encoder
    processes input tokens through hierarchical attention blocks
    (\cref{sec:method}).}
  \label{fig:architecture}
\end{figure}

% Two-panel subfigures
\begin{figure}[t]
  \centering
  \begin{subfigure}[b]{0.48\linewidth}
    \includegraphics[width=\linewidth]{figures/train_loss.pdf}
    \caption{Training loss curve.}
  \end{subfigure}
  \hfill
  \begin{subfigure}[b]{0.48\linewidth}
    \includegraphics[width=\linewidth]{figures/eval_rouge.pdf}
    \caption{ROUGE-1 on validation set.}
  \end{subfigure}
  \caption{Training dynamics over 50k steps.}
  \label{fig:training}
\end{figure}
Tables — booktabs with siunitx alignment
\begin{table}[H]
  \centering
  \caption{Comparison of methods on the CNN/DailyMail test set.}
  \label{tab:main-results}
  \begin{tabular}{l S[table-format=2.1] S[table-format=2.1] S[table-format=2.1]}
    \toprule
    {Method} & {R-1} & {R-2} & {R-L} \\
    \midrule
    Lead-3        & 40.4 & 17.7 & 36.7 \\
    Longformer    & 44.2 & 21.3 & 41.0 \\
    Full Attention & 45.1 & 21.9 & 41.8 \\
    \midrule
    % Braces stop siunitx from trying to parse \textbf as part of the number.
    \textbf{Ours} & {\textbf{47.4}} & {\textbf{23.1}} & {\textbf{43.5}} \\
    \bottomrule
  \end{tabular}
\end{table}
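
The patterns above rely on several packages. A minimal preamble that should cover them is sketched below; note that siunitx is needed for the S columns and is not part of the full preamble shown further down this page.

```latex
\usepackage{graphicx}    % \includegraphics
\usepackage{subcaption}  % subfigure environment
\usepackage{booktabs}    % \toprule, \midrule, \bottomrule
\usepackage{siunitx}     % S column type for decimal alignment
\usepackage{float}       % [H] placement specifier
\usepackage{hyperref}
\usepackage{cleveref}    % \cref; load after hyperref
```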

\include{preamble}

Complete research paper preamble

A complete, compiling article with math, table, and bibliography. Works out of the box with FormaTeX using the pdfLaTeX + Biber pipeline.

paper.tex
\documentclass[12pt]{article}

% Encoding & fonts
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}

% Page layout
\usepackage[margin=2.5cm]{geometry}

% Mathematics
\usepackage{amsmath, amssymb, amsthm}
\usepackage{mathtools}

% Figures & tables
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{array}
\usepackage[labelfont=bf, font=small]{caption}
\usepackage{subcaption}
\usepackage{float}

% References & citations
\usepackage[style=numeric-comp, sorting=none, backend=biber]{biblatex}
\addbibresource{references.bib}

% Hyperlinks
\usepackage[hidelinks, colorlinks=false]{hyperref}
\usepackage{cleveref}

% Algorithms
\usepackage[ruled, vlined]{algorithm2e}

% Utilities
\usepackage{microtype}
\usepackage{xcolor}
\usepackage{lipsum}  % remove in production

% -------------------------------------------------------
% Document metadata
% -------------------------------------------------------
\title{%
  A Sub-Quadratic Attention Mechanism for\\
  Long-Document Summarisation
}
\author{%
  Jane Researcher$^{1}$\thanks{Corresponding author: [email protected]} \and
  John Co-Author$^{2}$
}
\date{%
  $^{1}$Department of Computer Science, University of Example\\
  $^{2}$AI Research Lab, Tech Institute\\[6pt]
  \today
}

\begin{document}

\maketitle

\begin{abstract}
We present an efficient attention mechanism that reduces the computational
complexity of transformer self-attention from $\mathcal{O}(n^2)$ to
$\mathcal{O}(n \log n)$, enabling processing of documents with up to
64{,}000 tokens on a single GPU. Experiments on the CNN/DailyMail and
arXiv summarisation benchmarks show a ROUGE-1 improvement of 2.3 points
over the full-attention baseline with 40\% lower memory consumption.
\end{abstract}

\textbf{Keywords:} natural language processing, attention mechanism,
transformers, document summarisation

\section{Introduction}
Long-document understanding remains a core challenge in NLP.
The quadratic memory complexity of standard self-attention
($\mathcal{O}(n^2 d)$ for sequence length $n$ and dimension $d$)
limits practical sequence lengths to 2{,}048–4{,}096 tokens on
commodity hardware~\cite{vaswani2017}.

\subsection{Contributions}
\begin{enumerate}
  \item A hierarchical attention scheme operating at sentence and paragraph levels.
  \item An open-source implementation evaluated on three public benchmarks.
  \item A theoretical analysis proving the $\mathcal{O}(n \log n)$ bound.
\end{enumerate}

\section{Related Work}
Sparse attention patterns were introduced by~\textcite{child2019} and extended
by Longformer~\cite{beltagy2020} with sliding-window plus global tokens.
Linear attention approximations~\cite{katharopoulos2020} achieve $\mathcal{O}(n)$
complexity but sacrifice expressiveness on local patterns.

\section{Methodology}
\subsection{Hierarchical Attention}
Let $\mathbf{X} \in \mathbb{R}^{n \times d}$ be the token embeddings.
We partition $\mathbf{X}$ into $k$ sentence blocks and compute:
\begin{equation}
  \mathbf{H}_i = \text{Attention}\bigl(\mathbf{Q}_i,\, \mathbf{K}_i,\, \mathbf{V}_i\bigr),
  \quad i = 1, \ldots, k
  \label{eq:local-attn}
\end{equation}
where each block attends only within its sentence boundary for local features,
followed by a cross-block pooling step for global context.

\subsection{Complexity Analysis}
For $k$ blocks each of size $m = n/k$:
\begin{equation}
  \mathcal{C} = k \cdot \mathcal{O}(m^2) + \mathcal{O}(k^2)
               = \mathcal{O}\!\left(\frac{n^2}{k}\right) + \mathcal{O}(k^2)
\end{equation}
Setting $k = n^{2/3}$ balances the two terms and minimises the total cost at
$\mathcal{O}(n^{4/3})$; the coarser choice $k = \sqrt{n}$ gives $\mathcal{O}(n^{3/2})$.

\section{Experiments}
\subsection{Datasets and Metrics}
We evaluate on CNN/DailyMail~\cite{see2017} and arXiv~\cite{cohan2018}.
Summaries are scored with ROUGE-1, ROUGE-2, and ROUGE-L.

\begin{table}[H]
  \centering
  \caption{ROUGE scores on CNN/DailyMail test set.}
  \label{tab:results}
  \begin{tabular}{lccc}
    \toprule
    Model              & R-1  & R-2  & R-L  \\
    \midrule
    Lead-3 baseline    & 40.4 & 17.7 & 36.7 \\
    Longformer         & 44.2 & 21.3 & 41.0 \\
    Full Attention     & 45.1 & 21.9 & 41.8 \\
    \textbf{Ours}     & \textbf{47.4} & \textbf{23.1} & \textbf{43.5} \\
    \bottomrule
  \end{tabular}
\end{table}

\section{Conclusion}
We demonstrated that hierarchical attention matches or exceeds full-attention
performance at substantially lower computational cost. Future work will
explore integration with retrieval-augmented generation pipelines.

\printbibliography

\end{document}

\section{FAQ}

Frequently asked questions

Should I use article, IEEEtran, or REVTeX for my paper?

Use article for arXiv preprints and journals without a required class. Use IEEEtran for IEEE journals (Transactions, Letters) and for IEEE conferences whose author kit is based on it; some conferences, such as CVPR, ship their own style files instead. Use revtex4-2 for American Physical Society journals (Physical Review, PRL). Always check the journal's author guidelines first.

How do I format equations in a two-column layout?

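In two-column classes (IEEEtran, REVTeX), an equation wider than the column overflows into the neighbouring column or the margin. First try breaking it across lines with the multline, split, or aligned environments from amsmath. If it must span both columns, REVTeX provides the widetext environment; the strip environment comes from the separate cuted package, not from REVTeX. The starred figure* and table* environments give full-width floats for figures and tables. Inline math needs no special treatment: microtype refines text justification and does not change math spacing.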
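
Two common ways to handle an equation that is too wide for one column are sketched below. The polynomial is a placeholder; multline needs amsmath, and widetext exists only in REVTeX.

```latex
% Option 1 (any class, needs amsmath): break the equation across lines.
\begin{multline}
  f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 \\
       + a_5 x^5 + a_6 x^6 + a_7 x^7
\end{multline}

% Option 2 (REVTeX only): let the equation span both columns.
\begin{widetext}
\begin{equation}
  f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4
       + a_5 x^5 + a_6 x^6 + a_7 x^7
\end{equation}
\end{widetext}
```

Breaking the equation is usually the less intrusive choice, since a full-width block interrupts the text flow in both columns.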

What is the best way to manage references for a paper?

Use biblatex with numeric-comp style for a clean [1, 2, 3] citation format. Export your library from Zotero or Mendeley as .bib. For IEEE submissions, use IEEEtran.bst with the classic BibTeX workflow, since the journal's LaTeX template usually ships with it.

How do I make publication-quality tables in LaTeX?

Use the booktabs package and replace all \hline calls with \toprule, \midrule, and \bottomrule. Remove all vertical lines. Add column alignment via the siunitx S column type to align numbers at the decimal point.

Which engine should I use — pdfLaTeX, XeLaTeX, or LuaLaTeX?

pdfLaTeX is the fastest of the three and the engine most journal submission systems expect. XeLaTeX is best when you need custom OpenType fonts. LuaLaTeX offers the most flexibility (including Lua scripting) but compiles slowest. FormaTeX supports all three; see /engines for a full comparison.
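
If the same source should compile under more than one engine, a preamble can branch on the engine with the iftex package. This is a sketch rather than part of the template above, and the commented font name is only an example.

```latex
\usepackage{iftex}
\ifPDFTeX
  % pdfLaTeX: 8-bit font encoding with Latin Modern
  \usepackage[T1]{fontenc}
  \usepackage{lmodern}
\else
  % XeLaTeX or LuaLaTeX: OpenType fonts through fontspec
  \usepackage{fontspec}
  % \setmainfont{TeX Gyre Termes}  % example: any installed OpenType font
\fi
```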

How do I submit to arXiv?

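arXiv compiles your source itself, so upload the .tex file, the generated .bbl file, and all figure files. Subdirectories are allowed as long as the paths in your source match the uploaded layout. Run pdflatex → biber → pdflatex × 2 locally first to generate the .bbl, then upload the .tex + .bbl + figures. Because the .bbl format written by biblatex is version-specific, check that your local biblatex version is compatible with the TeX installation arXiv uses. FormaTeX produces compliant PDFs automatically.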
