How to Write Up a Ph.D. Dissertation
(for computer scientists and the like)

by Jason Eisner (2006)

This page is about how to turn your research (once it's done) into a readable multi-chapter document. You need to figure out what to include, how to organize it, and how to present it.

Following this advice will make me happier about reading your submitted or draft dissertation. You may find it useful even if I'm not going to read your dissertation.

Many others have written usefully on this subject, including someone in the Annals of Improbable Research. There's also advice on writing a thesis proposal. However, this page focuses on what a finished dissertation should look like. You could also skim good dissertations on the web.



What Goes Into a Dissertation?

A typical thesis will motivate why a new idea is needed, present the cool new idea, convince the reader that it's cool and new and might apply to the reader's own problems, and evaluate how well it worked. Just like a paper!

The result must be a substantial, original contribution to scientific knowledge. It signals your official entrance into the community of scholars. Treat it as a chance to make a mark, not as a 900-page-tall memorial to your graduate student life.

Beyond stapling

The cynical view is that if you've written several related papers, you staple them together to get a dissertation. That's a good first-order approximation -- you should incorporate ideas and text from your papers. But what is it missing?

First, a thesis should cohere -- ideally, it should feel like one long paper. Second, it should provide added value: there should be people who would prefer reading it to simply reading your papers. Otherwise writing it would be a meaningless exercise.

Here's what to do after stapling:

Taking Responsibility

Don't expect your advisor to be your co-author. It's your Ph.D.: you are sole author this time and the responsibility is on your shoulders. If your prose is turgid or thoughtless, misspelled or ungrammatical, oblivious or rude to related research, you're the one who looks bad.

You can do it! Your advisor and committee are basically on your side -- they're probably willing to make suggestions about content and style -- but they are not obligated to fix problems for you. They may send your dissertation back and tell you to fix it.

In the following sections, I'll start with advice about the thesis as a whole, and work downward, eventually reaching small details such as typography and citations.


Know Your Audience

First, choose your target audience. That crucial early decision will tell you what to explain, what to emphasize, and how to phrase and organize it. Checking it with your advisor might be wise.

Pretty much everything in your thesis should be relevant to your chosen audience. Think about them as you write. Ask yourself:

What does your audience already know?

A computer science thesis can freely invoke basic ideas like hash tables and computational complexity without defining or even citing them. (After all, do biologists read a computer science thesis? Not unless they are pretty comfortable with computer science.)

You can also safely assume that your readers have some prior familiarity with your research area. Just how much familiarity, and with which topics, is a judgment call -- again, you have to decide who your intended audience is.

In practice, your audience will be somewhat mixed. Up to a point, it is possible to please both beginners and experts -- by covering background material crisply and in the service of your own story. How does that work? As you lay out the motivation for your own work, and provide notation, you'll naturally have to discuss background concepts and related work. But don't give a generic review that someone else could have written! Discuss the background in a way that motivates and clarifies your ideas. Present your detailed perspective on the intellectual landscape and where your own work sits in it -- a fresh (even opinionated) take that keeps tying back to your main themes and will be useful for both experts and beginners.

In short, be as considerate as you can to beginners without interrupting the flow of your main argument to your established colleagues. A good rule of thumb is to write at the level of the most accessible papers in the journals or conference proceedings that you read.

What do you want your audience to learn from the thesis?

You should set clear goals here. Just like a paper or a talk, your dissertation needs a point: it should tell a story. Writing the abstract and chapter 1 at the start will help you work out what that story is.

You may find that you have to do further work to really support your chosen story: more experiments, more theorems, reading more literature, etc.

What does your audience hope to get out of the thesis?

Why does anyone crack open a dissertation, anyway? I sometimes do. Especially for areas that I know less well, a dissertation is often more accessible than shorter, denser papers. It takes a more leisurely pace, provides more explicit motivation and background, and answers more of the questions that I might have.

There are other reasons I might look at your dissertation:

For students, reading high-quality dissertations is a good way to learn an area and to see what a comprehensive treatment of a problem looks like. Noah A. Smith once ran a graduate CS seminar in which the students read 8 dissertations together. Each student was also required to select and summarize yet another dissertation and write a novel research proposal based on it.

Readers with different motivations may read your thesis in different ways. The strong convention is that it's a single document that must read well from start to finish -- your committee will read it that way. But it's worth keeping other readers in mind, too. Some will skim from start to finish. Some will read only the introductory and concluding chapters (so make sure those give a strong impression of what you've done and why it's important). Some will read a single chapter in the middle, going back for definitions as needed. Some will scan or search for what they need: a definition, example, table of results, or literature review. Some will flip through to get a general sense of your work or of how you think, reading whatever catches their eye.


High-Level Organization

Once you've chosen your target audience, you should outline the structure of the thesis. Again, the convention is that the document must read well from start to finish.

The "canonical organization" is sketched by Douglas Comer near the end of his advice. Read that: you'll probably want something like it. A few further tips:

Keep your focus

Keep your focus. Length is not a virtue unless the content is actually interesting. You do have as much space as you need, but the reader doesn't have unlimited time and neither do you.

Use space as needed for clarity and to flesh out and support your story. If you feel like your thesis is too short, it may need more ideas or thoughtful discussion or experiments (talk to your advisor), but it doesn't need more padding.

Get to the good stuff

A newspaper, like a dissertation, is a hefty chunk of reading. So it puts the most important news on page one, and leads each article with the most important part. You should try to do the same when reasonable.

Get to the interesting ideas as soon as possible. A good strategy is to make Chapter 1 an overview of your main arguments and findings. Tell your story there in a compelling way, including a taste of your results. Refer the reader to specific sections in later chapters for the pesky details. Chapter 1 should be especially accessible (use examples): make it the one chapter that everyone should read.

The same strategy works within a chapter. Start by telling your readers what the chapter is about and why they should read it. Then unfold your ideas and results. The order of your presentation should be natural and logical (e.g., motivation before experimental design before results), but try to keep the reader turning pages; seek reasonable ways to move the boring bits to later sections or later chapters.

Include a road map

Chapter 1 traditionally ends with a "road map" to the rest of the thesis, which rapidly summarizes what the remaining chapters or sections will contain. That's useful guidance for readers who are looking for something specific and also for those who will read the whole thesis. It also exhibits in one place what an awful lot of work you've done. Here's a detailed example.

Where to put the literature review

I recommend against writing "Chapter 2: Literature Review." Such chapters are usually boring: they're plonked down like the author's obligatory list of what he or she was "supposed" to cite. They block the reader from getting to the new ideas, and can't even be contrasted with the new ideas because those haven't been presented yet.

A better plan is to discuss related literature in conjunction with your own ideas. As you motivate and present your ideas, you'll want to refer to some related work anyway.

Related work that didn't meld naturally into that presentation can be acknowledged soon afterwards in its own section -- where you should still focus on how it relates to your ideas and fits into your framework, which you have already presented.

Each chapter might have its own related work section or sections, covering work that connects to yours in different ways.

Where to define terminology and notation

Basic terminology, concepts, and notation have to be defined somewhere. But where? You can mix the following strategies:

Retail. You can define some terms or notation individually, when the reader first needs them. Then they will be well-motivated and fresh in the reader's mind. If you use them again later, you can refer back to the section where you first defined them.

Wholesale. On the other hand, there are advantages to aggregating some of your fundamental definitions into a "Definitions" section near the start of the chapter, or a chapter near the start of the dissertation:

The downside is that such sections or chapters can seem boring and full of not-yet-motivated concepts. Unless your definitions are novel and interesting in themselves, they block the reader from getting to the new and interesting ideas. So if you write something like "Chapter 2: Preliminaries," keep it relatively concise -- the point is to get the reader oriented.

Thrift shop. Use well-known notation and terminology whenever you can, either with or without a formal definition in your thesis. The point of your thesis is not to re-invent notation or to re-present well-known material, although sometimes you may find it helpful to do so.


Make Things Easy on Your Poor Readers

Now we get down to the actual writing. A dissertation is a lot to write. But it's also an awful lot to read and digest at once! You can keep us readers turning pages and following your argument. But it's a bigger and more complicated argument than usual, so you have to be more disciplined than usual.

Break it down

Long swaths of text are like quicksand for readers (and writers!). To keep us moving without sinking, use all the devices at your disposal to break the text down into short chunks. Ironically, short chunks are more helpful in a longer document. They keep your argument tightly organized and keep the reader focused and oriented.

If a section or subsection is longer than 1 double-spaced page, consider whether you could break it down further. I'm not joking! This 1-page threshold may seem surprisingly short, but it really makes writing and reading easier. Some devices you can use:

subsectioning
Split your section into subsections (or subsubsections) with meaningful titles that keep the reader oriented.

lists
If you're writing a paragraph and feel like you're listing anything (e.g., advantages or disadvantages of some approach), then use an explicit bulleted list. Sometimes this might yield a list with only 2 or 3 rather long bullet points, but that's fine -- it breaks things down. (Note: To replace the bullets with short labels, roughly as in the list you're now reading, LaTeX's itemize environment lets you write \item[my label].)

labeled paragraphs
Label a series of paragraphs within the section, as a kind of lightweight subsectioning. Your experimental design section might look like this (using the LaTeX \paragraph command):

Participants. The participants were 32 undergraduates enrolled in ...

Apparatus. Each participant wore a Star Trek suit equipped with a Hasbro-brand Galactic Translator, belt model 3A ...

Procedure. The subjects were seated in pairs throughout the laboratory and subjected to Vogon poetry broadcast at 3-minute intervals ...

Dataset. The Vogon poetry corpus (available on request) was obtained by passing the later works of T. S. Eliot through the Systran translation system ...

footnotes
Move inessential points to footnotes. If they're too long for that, you could move them into appendices or chapters near the end of the thesis. (Here's my take on footnotes.)

captions
Move some discussion of figures and tables into their captions. Figures and tables should be clearly structured in the first place: e.g., graphs should have labeled axes with units. But a helpful caption provides guidance on how to interpret the figure or table and what interesting conclusions to draw from it.

(In LaTeX, you can write \caption[short version]{long version}. The optional short version argument will be used for the "List of Tables" or "List of Figures" at the start of the thesis.)

theorems
Even simple formal results can be stated as a theorem or lemma. The theorem (and proof, if included) form a nice little chunk, which you can typeset with the LaTeX theorem environment.
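
Several of the devices above can be combined in one small LaTeX file. The following is only a minimal sketch: the article class, the amsthm package, the theorem counter, the list labels, and the table contents are placeholder choices for illustration, not requirements.

```latex
\documentclass{article}
\usepackage{amsthm}             % supplies the proof environment
\newtheorem{theorem}{Theorem}   % hypothetical theorem counter for this sketch

\begin{document}

\section{Experimental Design}

% Labeled paragraphs: lightweight subsectioning
\paragraph{Participants.} The participants were 32 undergraduates enrolled in \ldots

\paragraph{Apparatus.} Each participant wore \ldots

% A list whose bullets are replaced by short labels
\begin{itemize}
  \item[pro] The approach is simple to describe.
  \item[con] It may need more data.
\end{itemize}

% The optional short caption goes into the List of Tables;
% the long caption tells the reader how to interpret the table.
\begin{table}
  \centering
  \begin{tabular}{lr}
    Condition & Score \\
    A         & --    \\   % placeholder row, no real data
    B         & --    \\
  \end{tabular}
  \caption[Placeholder results]{Placeholder results. The long version
    can say what to look for and what conclusion to draw.}
\end{table}

% Even a simple formal result makes a self-contained chunk
\begin{theorem}
  The sum of two even integers is even.
\end{theorem}
\begin{proof}
  If $a = 2m$ and $b = 2n$, then $a + b = 2(m + n)$.
\end{proof}

\end{document}
```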

Breaking down equations

Long blocks of equations are even more intimidating than long swaths of text. You can break those apart, too:

Now tie it back together

Now that you've chopped your prose into bite-sized chunks, what binds it together?

Coherent and explicit structure

Your paragraphs and chunks have to tie together into a coherent argument. Do everything you can to highlight the structure of this argument. The structure should jump out at the reader, making it possible to read straight through your text, or skim it. Else the reader will get stuck puzzling out what you meant and lose momentum.

Make sure your readers are never perplexed about the point of the paragraph they're reading. Make them want to keep turning the page because you've set up questions to which they want to know the answers. Don't make them rub their eyes in frustration or boredom and wander off to the fridge or the web browser.

So how exactly do you "highlight the structure" and "set up questions"?

Lots of internal cross-references

A thesis deals with a lot of ideas at once. Readers can easily lose track. Help them out:

Be concrete

As I read a thesis, or a long argument or construction within a thesis, I often start worrying whether I am keeping the pieces together correctly in my head. Something that has become deeply familiar and natural to you (the world expert) may be rougher going for me. If I can see some concrete demonstration of how your idea works, it helps me check and deepen my understanding.

Placing these concrete elements early is best, other things equal. Either embed them early in the section or just tell the reader early on to go look at Figure X. (If you continue the section by discussing Figure X, the reader is more likely to actually go look. Figure X or its caption can refer back to the text in turn.)

For example, consider pseudocode. Some readers prefer code to prose, and it's concise. So you may want to give pseudocode early in the section, before you ramble on about why it works. An alternative is to intersperse fragments of pseudocode with your prose explanation, as in literate programming. Of course, the pseudocode itself should also include some brief comments; where necessary these can just point to the text, as in "implements equation (5)" or "see section 3.2."
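
One possible way to typeset commented pseudocode is with the algorithm and algpseudocode packages; that choice is an assumption of this sketch, and other packages would serve as well. The binary search is a stand-in algorithm chosen only for illustration, and the comments reuse the pointer phrases quoted above. In a real thesis those pointers would normally be written with \ref so that they stay correct.

```latex
\documentclass{article}
\usepackage{algorithm}       % floating Algorithm environment with a caption
\usepackage{algpseudocode}   % pseudocode commands such as \State and \Comment

\begin{document}

\begin{algorithm}
  \caption{Binary search (illustrative stand-in)}
  \begin{algorithmic}[1]     % [1] numbers every line
    \Function{Search}{$A, x$} \Comment{$A$ is sorted; see section 3.2}
      \State $lo \gets 1$
      \State $hi \gets |A|$
      \While{$lo \le hi$}
        \State $mid \gets \lfloor (lo + hi)/2 \rfloor$ \Comment{implements equation (5)}
        \If{$A[mid] = x$}
          \State \Return $mid$
        \ElsIf{$A[mid] < x$}
          \State $lo \gets mid + 1$
        \Else
          \State $hi \gets mid - 1$
        \EndIf
      \EndWhile
      \State \Return \textsc{NotFound}
    \EndFunction
  \end{algorithmic}
\end{algorithm}

\end{document}
```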


Mechanics

Sentences. The previous section dealt with sections and paragraphs, but how about sentences? Yours should read well. The best advice in The Elements of Style: "Omit needless words. Vigorous writing is concise." To learn how to improve your sentences, read Style: Lessons in Clarity and Grace, by Joseph M. Williams, and do the exercises. Another classic is On Writing Well, by William Zinsser.

Typography. It's nice to get the typography right. This might be a good time to read a LaTeX tutorial or book, if you don't know

Margins, spacing, title page, etc. Read JHU's submission and formatting requirements, then use these LaTeX style files provided by the library.

Citations. BibTeX is definitely worth using to manage your bibliographic database. Then I recommend formatting your citations with \usepackage[longnamesfirst]{natbib} (accompanied by \bibliographystyle{plainnat} to format the actual bibliography). The natbib package ordinarily produces reader-friendly citations such as

Computers are getting exponentially faster (Moore, 1965). However, Biddle (1971) showed ...
and is blessedly flexible enough to handle more complex forms that you'll probably need somewhere in your thesis:
Bandura's (1977) theory ...
... (e.g., Butcher, 1954; Baker, 1955; Candlestick-Maker, 1957, and others).
The work of Minor (2001, pp. 50-75; but see also Adams, 1999; Storandt, 1997) ...
According to Manning and Schütze, 1999 (henceforth M&S), ...
(It can also switch to numerical citations like [34] if you really want.)
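
As a rough sketch, the citation forms listed above could be produced with natbib commands along the following lines. Every citation key (moore65, biddle71, and so on) and the file name thesis.bib are hypothetical; each key has to match an entry in your own BibTeX database before this will compile cleanly.

```latex
\documentclass{article}
\usepackage[longnamesfirst]{natbib}
\bibliographystyle{plainnat}

% All keys below are hypothetical placeholders for entries in thesis.bib.
\defcitealias{manning99}{M\&S}   % alias for the "henceforth M&S" form

\begin{document}

% Parenthetical and textual citations
Computers are getting exponentially faster \citep{moore65}.
However, \citet{biddle71} showed \ldots

% Possessive author with the year in parentheses
\citeauthor{bandura77}'s \citeyearpar{bandura77} theory \ldots

% Pre-note and post-note around several citations
\ldots\ \citep[e.g.,][and others]{butcher54,baker55,candlestick57}.

% Free-form parenthetical text mixing a year, pages, and further citations
The work of \citeauthor{minor01}
\citetext{\citeyear{minor01}, pp.~50--75; but see also
  \citealp{adams99,storandt97}} \ldots

% Citation without parentheses, followed by the alias
According to \citealp{manning99} (henceforth \citetalias{manning99}), \ldots

\bibliography{thesis}   % hypothetical database file thesis.bib
\end{document}
```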

(Another option is the apacite package, which precisely follows the style manual of the American Psychological Association. It is nearly as flexible in its citation format, but APA style has some oddities, including lowercasing the titles of proceedings volumes. One nice thing about APA style is that if you have multiple Smiths in your bibliography, it will distinguish them where necessary, using first and middle initials. Another nice thing is the use of "&" rather than "and" in author lists; however, you can easily hack plainnat.bst to mimic this behavior.)

Hyperlinks within your PDF file. I recommend including this in the LaTeX preamble:

\usepackage[colorlinks]{hyperref}
\usepackage{url}

Notes to yourself. I like to use !!! to mark something that I have to come back and finish or fill in. For longer "to do" notes to yourself, try using the cool todonotes LaTeX package. Or for a lightweight alternative, define a \todo macro so your note appears as highlighted text in the document:

\usepackage[usenames,dvipsnames,svgnames,table]{xcolor}
\usepackage{soul}
\newcommand{\todo}[1]{\hl{[TODO: #1]}}

\todo{Either prove this or back away from the claim.  I think
   Fermat's Last Theorem might be the key ...}
To suppress all notes, change the definition to
\newcommand{\todo}[1]{}
Not all notes to yourself are to-do items that should jump out at you. You may also want to include TeX comments as documentation for your own use:
... only 58 words in the dictionary have this property.
% to get that count:
%    perl -ne 'print if blah blah' /usr/share/dict/words | wc -l
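
For comparison, here is a minimal sketch of the todonotes route mentioned above. It is an alternative to the hand-rolled \todo macro, not an addition to it, since the package defines its own \todo command. The note texts are made up for illustration.

```latex
\documentclass{article}
% To hide every note in a clean draft, load the package as
% \usepackage[disable]{todonotes} instead.
\usepackage{todonotes}

\begin{document}

\listoftodos   % optional index of all outstanding notes

Only 58 words in the dictionary have this property.%
\todo{Recount after updating the word list.}   % appears as a margin note

% A longer note set inside the text block rather than the margin
\todo[inline]{Either prove this or back away from the claim.}

\end{document}
```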

Version control. It's probably wise to use git (or CVS or RCS or Subversion or Mercurial or darcs) to keep the revision history of your dissertation files. This lets you roll back to an earlier version in case of disaster. Furthermore, if you host the repository on your cs.jhu.edu account, it will be backed up by the department.

Sharing your thesis. When you're willing to open up for comments from fellow students, your advisor, or your committee, give them a secret URL from which they can always download the latest, up-to-date release of your thesis, as well as earlier versions. (This is probably friendlier than just pointing them to your git repository.)

Keep this URL up to date with your changes. Each distinct version should bear a visible date or version number, to avoid confusion. For each new version (or on request), you should probably also supply a PDF that marks up the differences from an appropriate earlier version, using the wonderful latexdiff program (available here or as a Linux package; plays nicely with git via latexdiff-git or other scripts) or a similar technique. (Note: If you use a makefile to build your document by running latex, gnuplot, etc., then you can also make it run latexdiff and update the URL for you.)

If you use Overleaf, just give your committee a view URL for your project. They will be able to see the PDF, visit different versions, and leave comments in the source file.


Planning Your Dissertation

Every dissertation is a little different. Talk to your advisor to draft a specific, written plan for what the thesis will contain, how it will be organized, and whom it will address. Discuss the plan with each of your committee members, who may suggest changes. They might disagree with advice on this page; find out.

As the dissertation takes shape, your plan may need some revision. Your advisor and committee may be willing to provide early feedback. But no one will want to slog through more than a version or two in detail. So ask them each how many drafts of each chapter they're willing to read, and in what state and on what schedule. Some of them may prefer to influence your writeup while it's still in an early, outline form. Others may prefer to wait until your prose is fairly polished and easy to read.

In addition to your advisor's goals and your committee's goals, you may have some goals of your own, e.g.,

GOOD LUCK!!! Now, download that LaTeX template, and take the first step toward filling it in today ...


Time Management

A little helpful advice from PhD Comics:


This page online: http://cs.jhu.edu/~jason/advice/how-to-write-a-thesis.html
Jason Eisner - jason@cs.jhu.edu (suggestions welcome) Last Mod $Date: 2019/04/23 00:46:39 $