Tanay Biradar

🌳 Actually Reading AI Research Papers

A while ago, I wrote this post about reading AI research papers. Those were just some notes I made public; I had hardly read any papers then.

Since last September, however, things have changed. I'm taking two paper-based classes (Adversarial Robustness in Machine Learning, LLMs + Conversational AI). I'm also leading GAIA, where we read yet more papers. With these experiences under my belt, I can finally say a little more.

Reading Wide vs. Reading Deep

Harvard's CS197 suggests splitting reading into two modes: reading wide and reading deep. I talk more about this in my previous post.

  1. Reading wide: Use this to "learn what you don't know" and build a mental model of the topic.
  2. Reading deep: Use this to understand the techniques researchers use, learn which benchmarks are popular, and gain context in the field.

Know Thy Field

Research papers are written by experts for experts; they're dense and full of jargon. That's why we have to "read wide" to understand the key terms—find the papers that are foundational to the field.

But reading wide is a lot more effective when you know what to read in the first place.

If you're trying to get into a field, find someone with experience who can direct your reading. In my Adversarial Robustness in ML course, I'm fortunate to have a professor who sends out a sequence of papers to read each week.

If you don't have that luxury, it may be worth emailing someone who studies that area and asking for good papers. Alternatively, when reading wide, note which authors are cited most often and look up their papers. There will often be a paper trail to some works that provide excellent background! Tools like Connected Papers and Google Scholar can also be really useful for finding related work.

Read Deep

When reading a paper deeply, try to understand as much as possible: the equations, the figures, the surrounding work. That said, don't bang your head against a wall unnecessarily.

The first few papers take a very long time to read—but as you get more familiar with the structure of research papers and the jargon of your field, the reading time drops. After reading both widely and deeply, you start to realize a few things:

  1. After reading wide, the introduction and background sections often contain information you're already familiar with. After some time, you can just skim these.
  2. The parts with the highest ROI on time are the motivation, the architecture, and (maybe) the experiments.
  3. The papers you read start to feel like a conversation. Each paper examines the state of the field, finds a flaw or limitation of existing work, and proposes something better. The authors then defend their work with experiments and highlight some places for future researchers to improve upon.
  4. You can start formulating your own questions and criticisms. Maybe you even find ideas for future work!

For a while, I couldn't read papers very well. I just read whatever seemed interesting at the time. Even after taking the advice of the CS197 lecture notes (which have definitely been useful!), I still wasn't comfortable reading papers.

But with a reading strategy, following research suddenly became a lot more enjoyable. By knowing the field and reading deep, I've been able to skim papers for the main ideas, come up with my own ideas for research, and follow talks by researchers.

2024-02-19