A Guide to Scientific Writing
Author: Quentin Fournier (edited with LLMs)
What follows is a guide to scientific writing, one I put together to help prevent the mistakes I made and often see students make. None of the advice here is my own discovery. I drew it from the writers, designers, and scientists who showed me what good looks like: Strunk & White’s writing manual; Simon Peyton Jones, Michael Black, and Neel Nanda’s blogs on writing scientific papers; Nature’s style guide and abstract recipe; the notation from The Deep Learning Book; and Edward Tufte and Dona Wong’s books on graphics. What is mine is the curation, a few twists, and the experience of getting papers in and out of review. The first part combines a self-contained overview, readable in under 30 minutes, with a pre-submission checklist.
Part I - The Overview
Writing is part of the research
The single most useful belief is that writing is part of your research, not the activity that comes after it. Simon Peyton Jones puts this first in his suggestions for a reason: the act of writing forces a clarity that experiments alone cannot, and it reveals holes in your story while you can still fix them. Michael Black makes the same point from a different angle when he tells his students that the paper is the contribution: your readers do not consume your intuition or training runs, they consume sentences and figures, and so your research is only as good as your writing. The draft is a thinking tool, and the paper is your contribution to the field.
The implication is to start early. Most advisors recommend writing your abstract and introduction long before your experiments are finished, updating them as your understanding evolves. Allocate real time to writing, not the leftover hours after experiments. Rewrite the abstract five or ten times. Treat figure design as a research activity. And do not consider a result done until you can explain it clearly to someone who is not in your subfield.
None of this works if you wait for the first draft to be good. Anne Lamott’s “shitty first draft” is the right model: the first draft’s job is to exist, not to be good. Perfectionism freezes you, and a draft you can react to is worth more than one you have only imagined. With a draft on the page, revise in passes: structure first, claims and evidence second, figures third, sentences last. Line editing too early is a beautiful way to avoid the real work.
A final word of caution: the same urgency applies to the people. Have the author-order conversation in the first weeks, not the last. Mismatched expectations, where one person assumes they will be first author and another assumes the same, are among the most uncomfortable situations in research, and the conversation only gets harder as the deadline approaches. If you are leading the project, raise it openly: who is first author, who is last, who falls in the middle, and what each person needs to contribute to hold their position. The project lead, usually the prospective first author, owns this conversation. The awkwardness of starting it is small compared to the awkwardness of not starting it.
The story
Scientific papers are not technical reports: they are stories you want to pass on to the reader. They are what you leave to the field. Michael Black takes the parallel literally. As he writes, “There is a goal to be achieved (true love, the Ring, or defeating the villain) but something stands in the way.” The goal is what your reader should want, the problem is what stands in the way, and the solution is what you did. Hold these three words (goal, problem, solution) over every paragraph of your draft and ask whether it earns its place.
The same logic applies at the level of the whole paper. Every paper that lands well follows the same narrative: question, gap, approach, result, implication. Five steps, each resolving the previous and setting up the next. The question is what the field cares about, framed at whatever level of generality is appropriate to the venue. The gap is what was tried before and what was missing, not a literature dump but the precise obstacle your work addresses. The approach is what you did, in the most stripped-down form that lets a reader follow the rest. The result is the empirical or theoretical outcome, with appropriate confidence. The implication is what changes about how the reader should think. ML conferences allow you to make the question fairly narrow and technical, whereas Nature and Science demand that your question matter to a non-specialist scientist within the first sentence or two. The shape is the same; only the granularity of the framing differs. Well-organized units of thought do not require connectives like however, consequently, additionally, moreover, furthermore, or therefore to hold together, though I am guilty of leaning on them anyway.
The introduction is where this rhythm matters most. Two patterns work. Black recommends strict alternation: problem, solution, problem, solution. The other is a four-paragraph funnel, close to Swales’ Create A Research Space (CARS) model: establish a territory, establish a niche, occupy it. Paragraph 1 establishes the field and why it matters, paragraph 2 surveys the relevant literature and names the gap, paragraph 3 introduces your proposed approach, and paragraph 4 states what you actually did and what you found. Either way, the reader rides the structure down into the paper.
A useful test before drafting: write your paper in three sentences. The first names the question and why it is interesting. The second states what you did. The third states what you found and what it implies. If you cannot do this in three sentences, you have not yet carved the story out of the research.
Anatomy of a paper
ML conference papers and Nature articles may look nothing alike, but their anatomy is the same: title, abstract, introduction, methods, results, discussion, references, supplementary material. What changes are order, length, voice, and audience. Use this anatomy as a guide, not a template to fill in. The work should shape the paper, not the other way around.
An ML paper at NeurIPS, ICML, or ICLR is sectioned and structured. The page limit is around nine pages of main text plus references and an unbounded appendix. The abstract is a compact paragraph of about 150 to 250 words. The introduction ends with an explicit list of contributions, though I argue this is not always necessary when writing is well done. The related work appears either after the introduction or just before the conclusion, methods sit in the middle of the paper with full technical detail, and the experiments take up real estate proportional to the empirical claims. Reviewers are domain experts, so you can use technical vocabulary freely as long as you define non-standard terms.
A Nature or Science article is a narrative journey. The summary paragraph replaces the abstract: a single bold paragraph of roughly 150 to 200 words built to a strict five-block recipe. The main text reads like a story told to a generalist scientist, kept to a few thousand words and a handful of display items, and structured mostly as continuous prose, with sparing subheadings, rather than the ML paper’s labeled Introduction, Methods, Results, and Discussion. Figures appear inline and carry the argumentative weight. Methods, full references, and most of the technical detail live at the end, in a stand-alone Methods section plus Extended Data and Supplementary Information. Your audience is broader (a chemist must understand a biology paper), and the prose reflects that: first-person plural (“we show”, “we find”) is encouraged, and jargon is avoided or defined inline.
Section-level rules that matter most
A handful of section-level decisions account for most of the difference between a paper that is read and one that is skimmed. You cannot prevent skimming, but your writing should make the reader want to slow down and read. The most common reason students lose readers early is that they write a paper to show how much they have accomplished, not the paper their audience needs.
The title needs to say what you actually did, in words a reviewer can remember after seeing six other titles. A title that names the contribution directly works almost everywhere, and the rule has stood the test of time. My own first paper was titled Empirical Comparison Between Autoencoders and Traditional Dimensionality Reduction Methods, on my advisor’s recommendation, and I am now passing the lesson along. Catchy titles have always existed and some have stuck: Attention Is All You Need was presumptuous but turned out to be both true and fitting. Another personal favorite is Optimal Brain Damage. Acronyms work for similar reasons: BERT is short, memorable, and honest about what the model does. Yet many catchy titles oversell the work, and many acronyms have been so overused and distorted that they have become a parody of themselves.
The abstract or summary paragraph is the most-read paragraph in the paper. Treat it as such: rewrite it many times, restrict yourself to fewer than 200 words, and make sure it answers the three-sentence test above. Its structure mirrors the five-step narrative: a sentence or two on the question and why it matters, a sentence on the gap, a sentence on the approach, a sentence on the result with a concrete number where possible, and a closing line on what it means for the field. Open with substance, not a generic field claim, and define every acronym on first use. I am a perfectionist about this, sometimes to a fault, but the abstract is one place where rewriting the same sentence ten times is genuinely time well spent.
Your introduction has one job: convince the reader to read your paper. Open with the question and why it matters, sketch what was known and what was missing, state your approach, and end with an explicit list of contributions for ML conferences or a paragraph of dense narrative for Nature. The introduction may also forward-reference your headline figure, the teaser, which appears on the first page in ML papers and earns its place there. One thing I would cut without hesitation: the “the remainder of this paper is organized as follows” paragraph. The section headings already announce the structure, the paragraph adds no information or scientific value, and most readers skim past it anyway.
Related work positioning is venue-dependent. Simon Peyton Jones argues for placing related work near the end because it lets the reader engage with your idea on its own terms first. ML papers often place it after the introduction. Both are accepted and valid. The substantive rule is that related work must be honest and positioned: characterize prior work fairly, then say what is different about yours. A wall of citations without any positioning is uninteresting to read and a missed opportunity.
Your methods need to be reproducible. ML reviewers expect enough detail that someone with access to the same compute could re-run your experiments. The NeurIPS Paper Checklist and the ML Reproducibility Checklist now formalize this. Nature methods are even more demanding because they are read as a stand-alone protocol. Either way, push hyperparameter sweeps, infrastructure details, and proofs into the appendix or supplementary material, but make sure your main text contains everything a reader needs to evaluate your claims.
A results section is a series of claim-evidence pairs. Every claim in your text must be backed by a number, a figure, or a citation, and every figure or table must serve a claim. The most common failure mode in ML results is reporting single-seed numbers. Report mean and variation across at least three seeds for any claim that matters, and explain in the text how error bars were computed. The most common failure mode in Nature results is over-narrating individual figures rather than letting the figures do the work.
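One way to make the error-bar computation explicit is to write it down once in the text or appendix. The block below is a minimal sketch, assuming $n$ independent seeds with one score per seed. The symbols $a_i$, $\bar{a}$, $s$, and $\mathrm{SEM}$ are introduced here for illustration and are not notation from this guide.

```latex
% Requires amsmath for the align environment.
% a_i : value of the metric obtained with seed i (illustrative symbol)
% n   : number of seeds
\begin{align}
  \bar{a} &= \frac{1}{n} \sum_{i=1}^{n} a_i
    && \text{(sample mean)} \\
  s &= \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(a_i - \bar{a}\right)^2}
    && \text{(sample standard deviation)} \\
  \mathrm{SEM} &= \frac{s}{\sqrt{n}}
    && \text{(standard error of the mean)}
\end{align}
```

Whichever quantity the error bars show, say so in the caption: $\bar{a} \pm s$ describes the spread across seeds, while $\bar{a} \pm \mathrm{SEM}$ describes uncertainty in the mean, and the two differ by a factor of $\sqrt{n}$.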
The discussion or conclusion places your result in context, names your limitations honestly, and gestures toward what comes next. Reviewers should reward you for naming limitations clearly: the NeurIPS Reviewer Guidelines explicitly say so. The instinct to hide limitations is misplaced: readers find them anyway, and naming them yourself is a credibility move. The same goes for receiving reviewer comments on your work: they sting at first, but with time and a few accepted papers you start to read them as the gift they often are.
Appendices and supplementary material exist to satisfy the obsessively curious reader. Anything needed to evaluate your claims belongs in the main text. Everything else belongs here. Every appendix section should be referenced from the main text: an unreferenced section is almost always one that does not need to exist. Nature distinguishes between Extended Data (figures and tables that almost made it into the main text) and Supplementary Information (proofs, full datasets, video, code descriptions).
Prose rules that matter most
The rules in this section are defaults, not absolutes. Break them deliberately, and only when you are confident the break improves the prose. Strunk gave us three that do most of the work: omit needless words, be specific, and use the active voice.
As Strunk wrote, “Omit needless words. Vigorous writing is concise.” This is the single most impactful rule in the book. After your draft is done, read it once with the explicit goal of cutting thirty percent without losing a thought. You will find phrases like “in order to” (write “to”), “due to the fact that” (write “because”), “it should be noted that” (delete), and the entire breed of academic puffery that thickens prose without adding information. Compression makes a paper feel smarter even when nothing else changes.
Another high-impact rule is to use definite, specific, concrete language. “The performance improved” is weak. “Accuracy on ImageNet rose from 72% to 81%” tells the reader something. Wherever you can, name the dataset, the metric, the number, the comparison. Specifics give the reader clarity.
Use the active voice as the default. “We trained the model” is shorter, clearer, and more honest than “the model was trained”. Nature explicitly prefers first-person plural in modern usage. Older guides that forbade it have been overruled by the journal itself. The passive is not banned. It is appropriate when the agent is unimportant or unknown, when you want to keep the topic in subject position across sentences, or when you genuinely do not want to assign credit. Use it deliberately, not by default. For years I wrote in passive voice on the assumption that older reviewers expect it, but I have come to realize that active voice can be just as formal.
Beyond Strunk’s three, a few smaller habits earn their keep. Verb tense follows a small set of rules. Use the past tense to report what happened: what you did, what an experiment showed, what a prior author wrote. Use the present tense for general truths, established facts, and statements about the paper itself: Figure 2 shows, the loss is convex, attention computes a weighted sum. Reserve future tense for genuine future work in the conclusion. Mixing tenses arbitrarily is a tell of a hurried draft.
Acronyms are friction, and your paper exists to communicate. Define every non-standard acronym on first use, then use it consistently. An acronym that appears only two or three times is not worth defining. Spell the words out instead, and reserve acronyms for terms that recur many times. Be especially ruthless about field-specific abbreviations: what is obvious to your subgroup is foreign to a reviewer outside it.
Make claims with calibration. Hedge when you are uncertain (our results suggest, we find evidence that), and do not hedge when you are not (we prove, we show). The over-hedger and the over-claimer both lose credibility, in opposite ways. I have been both at different times, and I still catch myself drifting one way or the other in late drafts. Reviewers can tell the difference between a careful sentence and a defensive one.
One last pass before you submit. When several co-authors have contributed paragraphs, the paper often reads as a Frankenstein draft with five voices instead of one. Plan a re-voicing pass at the very end: as the first author, read the whole paper through and rewrite for a single voice. The reader should not be able to tell who wrote which paragraph.
Figures earn their place
A figure that does not support an argument is a distraction. The two questions to ask of every figure are: what claim does this figure support? and can a reader who skims only the figures and captions follow the paper? The second question is a hard test, but it is the test that high-impact venues implicitly impose, because many readers will skim. Captions should therefore be self-contained: state what is plotted, define every symbol, and tell the reader what the figure shows. A caption is not a label.
Tufte’s data-ink ratio gives a practical test for the same question, and is one of my favorite principles in design. Every line, dot, gridline, or number should earn its place by adding information, and what does not should go. Pushed too far, the rule produces sparse, unreadable figures, so the question is really how much you can remove without burdening the reader. Change a figure only when the new form is clearly easier to read, not just cleaner. The Visual Display of Quantitative Information is the long version, with the examples to back it up.
Choose the chart type for the data, not the data for the chart. Bars compare categories. Lines show change over an ordered axis. Scatter plots reveal relationships between two variables. Heatmaps reveal structure in a matrix. Box plots and violins show distributions. Pie charts almost never beat a bar chart. 3D bars almost always lose to 2D. If you cannot defend the chart type in one sentence, change it.
Color carries meaning, so use it semantically. Use a categorical palette like Bang Wong’s eight-color set for discrete groups, and a perceptually uniform colormap like viridis, magma, or cividis for continuous data. Avoid the jet and rainbow palettes, which both distort perception and fail colorblind readers. About eight percent of men have red-green color vision deficiency. Do not encode meaning in color alone. Pair it with shape, line style, or labels so the figure survives a black-and-white print.
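If figures are drawn directly in LaTeX (for example with TikZ or pgfplots), the categorical palette can be defined once and reused. The sketch below assumes the xcolor package, which this guide does not otherwise mention, and uses the hex values commonly cited for Wong’s palette. The color names are illustrative labels, not an official naming.

```latex
% Assumption: xcolor is available; the color names below are illustrative.
\usepackage{xcolor}

% Eight-color categorical palette attributed to Bang Wong
\definecolor{wongblack}{HTML}{000000}
\definecolor{wongorange}{HTML}{E69F00}
\definecolor{wongskyblue}{HTML}{56B4E9}
\definecolor{wonggreen}{HTML}{009E73}
\definecolor{wongyellow}{HTML}{F0E442}
\definecolor{wongblue}{HTML}{0072B2}
\definecolor{wongvermillion}{HTML}{D55E00}
\definecolor{wongpurple}{HTML}{CC79A7}

% Color alone should not carry the meaning: give each series a distinct
% line style or marker as well, so the figure survives grayscale printing.
```

The same hex codes can be copied into whatever plotting tool produces the figures, which keeps colors consistent across every plot in the paper.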
Typography in figures should match the body text in size and family. Tiny axis labels and inconsistent fonts are the most common signs of a rushed figure pass. Every figure should be readable at the size it appears in print, and every panel in a compound figure should carry a label (a, b, c) referenced in the caption.
The teaser figure, Figure 1 in most ML papers, deserves disproportionate effort. It should communicate the core idea of your paper in a single image, before the reader has read a word of the introduction. Treat it as a visual abstract.
The technical side
After the story comes production. You can produce a clean paper with a small, well-chosen package set. Load microtype for typography (it reduces overfull-box warnings and improves justification almost invisibly), booktabs for tables (no vertical rules, three horizontal rules), siunitx for numbers and units that align on the decimal, cleveref for cross-references that say Figure 3 and Equation (4) automatically, hyperref for clickable links, amsmath and amssymb for math, algorithm and algpseudocode (or algorithm2e) for pseudocode, and subcaption for compound figures. Load cleveref after hyperref, and microtype after your font package.
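The load order described above can be sketched as a minimal preamble. This is an illustration under assumptions, not a venue template: the article class, fontenc, and lmodern lines are placeholders for whatever class and font the venue style file provides, and the cleveref options are one way to get unabbreviated names such as Figure 3 and Equation (4).

```latex
\documentclass{article}       % placeholder: venues ship their own class or style

% Font first, then microtype so it can adjust the loaded font
\usepackage[T1]{fontenc}
\usepackage{lmodern}          % placeholder font package
\usepackage{microtype}

% Math
\usepackage{amsmath,amssymb}

% Tables, numbers, and units
\usepackage{booktabs}
\usepackage{siunitx}

% Compound figures and pseudocode
\usepackage{subcaption}
\usepackage{algorithm}
\usepackage{algpseudocode}

% Links, then cross-references: cleveref must come after hyperref
\usepackage{hyperref}
\usepackage[capitalize,noabbrev]{cleveref}

\begin{document}

\begin{equation}\label{eq:example}
  y = x^2
\end{equation}

% \Cref supplies the name and number, so the body never hard-codes "Equation"
\Cref{eq:example} defines a parabola.

\end{document}
```

Venue style files may already load some of these packages, so check the template before adding them a second time with different options.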
For notation, follow the Goodfellow, Bengio, and Courville conventions: italic lowercase for scalars ($x$), bold lowercase for vectors ($\boldsymbol{x}$), bold uppercase for matrices ($\boldsymbol{X}$), and bold sans serif or calligraphic for tensors. Sets are uppercase calligraphic ($\mathcal{S}$), and probability distributions use uppercase $P$. Define a small set of macros once at the top of the file (\vx for a vector, \mX for a matrix, \E for expectation, \R for the reals) and never write a raw bold command in the body. That way, notation stays consistent across drafts and co-authors.
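A minimal sketch of such a macro block follows. The macro names \vx, \mX, \E, and \R come from the text above; \vy and the choice of \boldsymbol for bold are assumptions made for illustration, and a project may prefer the bm package instead.

```latex
% Assumes amsmath (for \boldsymbol) and amssymb (for \mathbb) are loaded.
\usepackage{amsmath,amssymb}

% Vectors: bold lowercase
\newcommand{\vx}{\boldsymbol{x}}
\newcommand{\vy}{\boldsymbol{y}}   % illustrative addition

% Matrices: bold uppercase
\newcommand{\mX}{\boldsymbol{X}}

% Expectation and the reals
\newcommand{\E}{\mathbb{E}}
\newcommand{\R}{\mathbb{R}}

% Usage in the body: no raw bold commands
% Let $\vx \in \R^{d}$ and $\vy = \mX \vx$, with loss $\E_{\vx \sim P}[f(\vx)]$.
```

If the bold style ever needs to change, editing the macro definitions updates every occurrence in the paper at once.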
A handful of tools save real time. arxiv-latex-cleaner (Google Research) flattens your project into a single submittable directory, strips comments, removes unused files, and silently fixes most of the things that arXiv refuses. bibtex-tidy (Flaming Tempura) cleans bibliographies in one click. The latex-tables web app builds clean tables you can paste into a tabular environment. capitalizemytitle.com enforces title case without you having to remember whether “with” is capitalized in AP style (it is, as a word of four or more letters) or Chicago style (it is not, unless it is the first or last word). Vale or LanguageTool catches the prose mistakes that pass spell-check. latexdiff produces a diff PDF between two versions of a paper and is essential for revisions and rebuttals.
For AI-assisted writing, policies are evolving and inconsistent, so check the call for papers for your specific venue. A few rules hold across them: never list an LLM as an author, disclose any substantive use, and verify every generated claim, reference, and figure against a primary source. Treat an LLM the way you would treat a fast copy editor: useful for tightening sentences and catching grammar, dangerous for generating substance you have not personally verified. Many a hallucinated reference has slipped into a submission this way. Be open with your supervisor about how you are using these tools. We are here to help you use them well, not to catch you using them.
The pre-submission checklist
Work through this list before every submission. Bracketed tags mark venue-specific items: [ML] for NeurIPS/ICML/ICLR and [Nature] for Nature/Science and similar journals. Unmarked items apply broadly. Skip what does not apply.
Story
- I can write the paper’s central claim in one sentence: we show X by doing Y, with consequence Z
- The abstract, introduction, and conclusion all state the same central claim
- The introduction names the question, the gap, the approach, the main result, and the implication
- Each contribution in the introduction is supported later by a specific figure, table, theorem, or experiment
- No claim in the abstract or conclusion is broader than the evidence shown in the paper
- The limitations section names concrete limits: datasets, settings, assumptions, scale, population, theory, or failure modes
Title and abstract
- The title says what the paper does
- The abstract is within the venue limit and is a single readable paragraph unless the venue requires structure
- The abstract contains the question, gap, approach, result, and implication
- The main result in the abstract includes a number, theorem, or concrete qualitative finding where possible
- Every acronym in the abstract is defined on first use, and avoidable acronyms have been removed
- [Nature] The summary paragraph follows the Nature recipe and contains a clear “Here we show” sentence or equivalent
Main text
- The first page tells the reader what is new and why it matters
- The introduction does not contain a “the remainder of this paper is organized as follows” paragraph
- Related work is positioned: for each important cluster of prior work, the text says what it did and how this paper differs
- The method section contains enough detail for a reader to understand the approach without reading the appendix first
- Results are written as claim-evidence pairs: each result paragraph points to the figure, table, theorem, or citation that supports it
- Any statement using words like state-of-the-art, robust, general, efficient, or significant is backed by a defined comparison or test
Evidence and methods
- The paper states the datasets, splits, baselines, metrics, and evaluation protocol
- The strongest relevant baselines are included, or the text explains why they are not
- Random seeds, number of runs, and variation/error bars are reported for headline empirical claims
- Hyperparameters and selection criteria are reported in the main text or appendix
- Hardware, software versions, and approximate compute or runtime are reported where relevant
- Statistical tests define the test used, exact sample size or number of runs, error-bar meaning, and effect size or confidence interval where appropriate
- Any exclusions, failed runs, filtered data, or omitted conditions are disclosed or explicitly stated as none
Figures and tables
- Every figure and table is cited in the text
- Every figure caption states what is plotted, defines symbols/error bars, and names the takeaway
- A reader skimming only figures, tables, and captions can recover the paper’s main argument
- Axes, units, legends, panel labels, and sample sizes are readable at final print size
- Colors are colorblind-safe, and are not the only way meaning is encoded
- Tables use readable precision: no more decimal places than the uncertainty supports
- No plot uses 3D effects, misleading axis truncation, or visual decoration that changes interpretation
Prose and LaTeX
- I read the abstract, introduction, and conclusion aloud and rewrote every sentence that caused a stumble
- A concision pass removed filler phrases such as “it should be noted that”, “due to the fact that”, and “in order to”
- Tense is consistent: past for what was done, present for what the paper shows or what is generally true
- Notation is defined on first use and used consistently across text, equations, figures, and appendix
- All references compile: no citations, equations, figures, or sections show ??
- The PDF has no serious overfull boxes, unreadable tables, or figures that spill past margins
- The bibliography has no duplicate entries, broken capitalization for acronyms, or misspelled author names
Submission package
- Page limit, template, margins, font size, and file-size requirements match the venue
- All co-authors approved the submitted PDF, author order, affiliations, funding, and acknowledgments
- Data, code, models, and supplementary files are available to reviewers
- Ethics, competing-interest, funding, author-contribution, and AI-use disclosures follow the current venue policy
- [ML] The NeurIPS/ICML/ICLR checklist or reproducibility statement is completed honestly, with pointers to the relevant sections
- [ML] Names, acknowledgments, PDF metadata, repository links, video links, and self-citations have been anonymized according to venue rules
- [Nature] Cover letter, reporting summary, data/code availability, author contributions, competing interests, and suggested reviewers are ready if required
Personal note
Writing this guide was fun. It reminded me of the years I spent sitting down with my advisors, getting their feedback, being frustrated when I disagreed, and almost always realizing days later that they had been right. They usually are. There were rare times, though, when I turned out to be right, and my advisor’s stance shifted because of it. That two-way conversation, more than any rule in this guide, is what science actually is.
If you take one thing from this guide, take this: disagree, and bring your reasons. Your advisor is not trying to be right. They are trying to help you do better work, and they have things to learn from you too. We are figuring this out together.