How to Write a Literature Review


A literature review (“lit review”) is a point* in the research process where you decide how to approach a problem by combing through other people’s work. This process is used to gain inspiration, design experiments, and curate baselines and comparative models. A portion of the lit review goes towards the Related Works section of a paper.

*More like a continuous task done throughout the process, but it’s mostly front-loaded during project design

There are many guides on how to read papers (shoutout to Prof. Jason Eisner); the suggestions here are geared specifically towards reading papers for a lit review. I propose a question-focused approach, similar to how you would test for reading comprehension. In my experience, it was easier to search through my notes than to view all my PDF annotations. These questions should be adjusted depending on the project, since some experiments need more detail on dataset curation, model architecture, etc. Depending on where you are in the process, you might not be able to answer all the questions (e.g., you can’t differentiate between approaches if you don’t have an approach planned yet, but you can still determine whether a model is a good candidate for comparison).

Paper Questions

  1. Summarize the paper
    • Very high-level overview like “Introduced a new BERT encoder model to represent multilingual tweets”
  2. What did they do?
    • This is a purposefully broad question that can be narrowed down depending on the project.
    • Which model did they use? BART, GPT-3, T5?
    • Which datasets did they train on?
    • How did they evaluate their methods? F1, human evaluation, precision@k?
  3. How is this different from our work?
    • Focus on the technical aspects, almost as if a reviewer asked “How’s your approach different from X?”
  4. How is this relevant to our work?
    • This question is similar to the above, but should focus on how the paper’s contributions can be directly applied to your project
    • Can it be used as a baseline/comparative method? Is this a dataset paper to train or evaluate on?
    • Is it only a mention in the introduction for motivation? i.e., is there an issue with the proposed method, or something we improve upon, like speed or accuracy?
    • Or only a mention in the related work, without a direct comparison?

Questions 1-2 are useful for summarizing the work for later reference, and questions 3-4 directly prepare you for the related works section. A good related works section should leave the reader understanding where your work sits in the field and how it differs from other, similar work.
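
If you keep the answers as plain-text notes, a tiny script makes them easy to search later. This is a minimal sketch, assuming one Markdown file per paper in a `notes/` folder with the four questions as headers; the folder layout and file names are hypothetical, not part of my actual workflow.

```python
import re
from pathlib import Path

# Hypothetical layout: notes/<paper-id>.md, with the four lit-review
# questions ("Summarize the paper", "What did they do?", etc.) as headers.
NOTES_DIR = Path("notes")

def search_notes(keyword: str) -> list[tuple[str, str]]:
    """Return (paper id, matching line) pairs for a case-insensitive keyword search."""
    hits = []
    for note_file in sorted(NOTES_DIR.glob("*.md")):
        for line in note_file.read_text(encoding="utf-8").splitlines():
            if re.search(keyword, line, flags=re.IGNORECASE):
                hits.append((note_file.stem, line.strip()))
    return hits

if __name__ == "__main__":
    # e.g., find every paper whose notes mention a tokenizer choice
    for paper, line in search_notes("tokenizer"):
        print(f"{paper}: {line}")
```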

Tips on Finding Papers

A few tips on how to find papers for the lit review:

Literature Management Tool: Zotero

Researchers all use different tools for storing and annotating papers; I prefer keeping all my notes in Zotero. It has great features like cloud storage and group libraries for collaborative note-taking. The free tier’s cloud storage is small, but it is enough if you store only the notes and not the large PDF files. You can also connect specific Zotero libraries to Overleaf and import them directly as a BibTeX file!
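
Zotero also exposes your library through a web API, so notes and tags can be pulled into scripts. Below is a minimal sketch using the third-party pyzotero client; the library ID, API key, and tag name are placeholders, not values from my setup.

```python
from pyzotero import zotero

# Placeholders: supply your own Zotero library ID and API key.
# library_type is "user" for a personal library, "group" for a shared one.
LIBRARY_ID = "1234567"
API_KEY = "your-zotero-api-key"

zot = zotero.Zotero(LIBRARY_ID, "user", API_KEY)

# Pull items carrying a given tag (the tag name here is just an example)
items = zot.items(tag="baseline model")
for item in items:
    data = item["data"]
    print(f"{data.get('title', '(untitled)')} [{data['itemType']}]")
```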

A useful feature in Zotero is paper tags, along with the ability to link related papers. I am still refining my workflow, but I have started to use a few custom tags, such as dataset, baseline model, and related work (see the example below).

For the related-papers feature, I typically only link papers when one uses a model introduced by another work or is trained on that work’s dataset. These connections make it much easier to write a sentence like “Other work has trained on dataset X (paper1, paper2)”.
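
As a sketch of why the tags pay off, the snippet below groups papers by tag and drafts the kind of sentence quoted above. The tag names and citation keys are made up for illustration.

```python
from collections import defaultdict

# Hypothetical mapping from citation key to the custom tags attached in Zotero
paper_tags = {
    "paper1": {"dataset X", "baseline model"},
    "paper2": {"dataset X"},
    "paper3": {"related work"},
}

# Invert the mapping: tag -> papers carrying that tag
by_tag = defaultdict(list)
for paper, tags in paper_tags.items():
    for tag in tags:
        by_tag[tag].append(paper)

# Draft a related-work-style sentence for one tag
tag = "dataset X"
citations = ", ".join(sorted(by_tag[tag]))
print(f"Other work has trained on {tag} ({citations}).")
# -> Other work has trained on dataset X (paper1, paper2).
```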

Example

Below is an example from my lit review for the Bernice: A Multilingual Pre-trained Encoder for Twitter paper. Since this was a model paper, the lit review mostly consisted of identifying the design of other models people use to represent Twitter data. The closest model to Bernice is XLM-T.

  1. Summarize the paper
    • Introduced XLM-Twitter (XLM-T), a pretrained encoder model to represent multilingual tweets
  2. What did they do?
    • Start from XLM-R (base) with secondary pre-training on 198M multilingual tweets
    • Same tokenizer/vocabulary as XLM-R
    • Tweets collected from the Twitter API from May ’18 to March ’20; only kept tweets with at least 3 tokens (not counting URLs)
    • “we did not perform language filtering, aiming at capturing a general distribution”
    • Evaluated on TweetEval benchmark and the (newly created) Unified Multilingual Sentiment Analysis Benchmark (UMSAB)
  3. How is this different from our work?
    • Does not use a Twitter-specific tokenizer
    • Not pre-trained exclusively on tweets
    • Keeps the original 512 XLM-R sequence length vs our shortened 128
  4. How is this relevant to our work?
    • Model is directly comparable to ours and should be used as a comparative model
    • Evaluate on their UMSAB
    • Tags: dataset, related work, baseline model

Comments? Questions? Let me know! @Alexir563