If you’re associated with a domain of science (especially computer science, statistics, or machine learning/AI), building a deep and unbiased understanding of that domain not only educates you in the best possible way but also helps you envision the opportunities in that domain.
A research paper is a source of study that brings together a wide range of deep and authentic practices around a topic. From critical thinking about the problem, to the research process, to the evaluation of processes and sources, to organization and composition, all these genuinely executed practices make for a good research paper.
If you’re struggling (like me) to build a habit of reading papers regularly, this post is for you. I’ve broken down the whole process, talked to researchers in the field, read a bunch of papers and blogs from distinguished researchers, and jotted down a bunch of techniques that you can follow.
Let’s start off by understanding what a research paper is and what it is NOT!
Definition of a research paper
A research paper is a highly congested and bland manuscript that compiles a thorough understanding of a problem/topic, a proposed solution/research along with the conditions under which it was deduced/carried out, the efficacy of the solution/research, and potential loopholes in the study.
A paper is written not only to provide an exceptional learning opportunity but also to pave the way for further advancements in the field. It helps plant the seed of a thought that can lead either to a new world of ideas or to an innovative method of solving a longstanding problem.
What it is NOT
There is a common misconception that a research paper is a well-informed summary of a problem or topic compiled from other sources. It is not. Don’t mistake it for a book or an opinionated account of an individual’s interpretation of a particular topic.
Why should one read research papers?
What I find fascinating about reading a good research paper is that one can draw on a profound study of a topic and engage with the community on a new perspective to understand what can be achieved in and around that topic.
I work at the intersection of instructional design and data science. Learning is part of my day-to-day responsibility. If the source of my education is flawed or inefficient, I’d fail at my job in the long term. The same applies to many other jobs in science, especially those focused on research.
There are three main reasons to read a research paper:
- Knowledge — Understanding the problem through the eyes of someone who has probably spent years solving it and has taken care of all the edge cases that you might not even think of at the beginning.
- Exploration — Whether you have a pinpointed agenda or not, there is a very high chance that you will stumble upon an edge case or a shortcoming that is worth following up. With persistent effort over a considerable amount of time, you can learn to turn that knowledge into a living.
- Research and review — One of the main reasons for writing a research paper is to further the development of the field. Researchers read papers to review them for conferences or to do a literature survey of a new field. For example, Yann LeCun’s 1989 paper on integrating domain constraints into backpropagation laid the foundation of modern computer vision, and after decades of research and development work, we have come far enough that we are now refining solutions to problems like object detection and optimizing autonomous vehicles.
Not only that, with the help of the internet, all of these reasons or benefits can be extended to multiple business models. That could mean an innovative state-of-the-art product, an efficient service model, a career as a content creator, or a dream job where you solve problems that matter to you.
Goals for reading a paper — What should you read about?
The first thing to do is to figure out your motivation for reading the paper. Broadly, there are two scenarios in which you’ll want to read a paper:
- Scenario 1 — You have a well-defined agenda/goal and you are deeply invested in a domain. For example, you’re an NLP practitioner and you want to learn how GPT-4 has given us a breakthrough in NLP. This is always a nice scenario to be in as it offers clarity.
- Scenario 2 — You want to keep abreast of the developments in a host of areas, say how a new deep learning architecture has helped us solve a 50-year-old biological problem of understanding protein structures. This is often the case for beginners or for people who consume their daily dose of news from research papers (yes, they exist!).
If you’re an inquisitive beginner with no starting point in mind, start with scenario 2: shortlist a few topics to read about until you find an area that intrigues you, which will eventually lead you to scenario 1.
ML Reproducibility Challenge
In addition to these generic goals, if you need an end goal for your habit-building exercise of reading research papers, you should check out the ML reproducibility challenge.
You’ll find top-class papers from world-class conferences that are worth diving deep into and whose results are worth reproducing.
They conduct this challenge twice a year and they have one coming up in Spring 2021. You should study the past three versions of the challenge. I’ll write a detailed post on what to expect, how to prepare, and so on.
Now you must be wondering how to find the right paper.
Getting Started — How to find the right paper?
To get some ideas around this, I reached out to my friend Anurag Ghosh, who is a researcher at Microsoft. Anurag has been working at the crossover of computer vision, machine learning, and systems engineering.
Here are a few getting-started tips from him:
- Always pick an area of interest.
- Read a few good books or detailed blog posts around that topic and start diving deep by reading the papers referenced in those resources.
- Look for seminal papers around that topic. These are papers that report a major breakthrough in the field and offer a new method or perspective with huge potential for subsequent research in that field. Check out papers featured on The Morning Paper, or winners of the CVF test of time award/Helmholtz Prize (if you’re interested in computer vision).
- Check out books like Computer Vision: Algorithms and Applications by Richard Szeliski and look for the papers referenced there.
- Have/Build a sense of community. Find people who share similar interests, and join groups/subreddits/Discord channels where such activities are promoted.
In addition to these invaluable tips, there are a number of web applications that I’ve shortlisted that help me narrow my search for the right papers to read:
- r/MachineLearning — there are many researchers, practitioners, and engineers who share their work along with the papers they found useful in achieving those results.
- arXiv Sanity Preserver — built by Andrej Karpathy to accelerate research. It is a repository of 142,846 papers from computer science, machine learning, systems, AI, stats, CV, etc. It also offers a bunch of filters, powerful search functionality, and a discussion forum to make for a super useful research platform.
- Google Research — the research teams at Google are working on problems that have an impact on our routine life. They share their publications for individuals and teams to learn, contribute and expedite research. They also have a Google AI blog that you can check out.
Method of reading a paper
Once you have stocked your to-read list, the next step is to actually read these papers. Remember that NOT every paper is useful to read, so we need a mechanism that helps us quickly screen for papers that are worth reading.
To tackle this challenge, we can use the commonly used Three-Pass Approach by S. Keshav. This approach proposes reading the paper in three passes instead of starting from the beginning and diving deep till the end.
Three pass approach
- The first pass — A quick scan to capture a high-level view of the paper. Read the title, abstract, and introduction carefully, then the section and subsection headings, and lastly the conclusion. It should take no more than 5–10 minutes to figure out whether you want to move on to the second pass.
- The second pass — A more focused read that skips details such as proofs. Jot down the crucial notes and key points in the margins. Carefully study the figures, diagrams, and other illustrations, review the graphs, and mark relevant unread references for further reading. This pass helps you understand the background of the paper.
- The third pass — Reaching this pass means you’ve found a paper that you want to understand deeply or review. The key to the third pass is to reproduce the results of the paper. Check every assumption and jot down all the differences between your re-implementation and the original results. Make a note of all the ideas for future analysis. It should take 5–6 hours for beginners and 1–2 hours for experienced readers.
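If you like to keep your reading list in code, the screening logic of the three passes can be sketched as a tiny tracker. This is only a minimal illustration, not part of Keshav's method: the `Pass`, `Paper`, and `next_up` names and the placeholder paper titles are all made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class Pass(IntEnum):
    """Stages of the three-pass approach (0 means not started)."""
    UNREAD = 0
    FIRST = 1   # quick scan: title, abstract, introduction, headings, conclusion
    SECOND = 2  # focused read: figures, notes, references to follow up
    THIRD = 3   # deep read: re-implement and compare results


@dataclass
class Paper:
    title: str                     # placeholder titles are used in the demo below
    stage: Pass = Pass.UNREAD
    dropped: bool = False          # True once the paper is screened out
    notes: list[str] = field(default_factory=list)

    def advance(self, keep_reading: bool, note: str = "") -> None:
        """Record the outcome of the next pass for this paper."""
        if self.dropped or self.stage is Pass.THIRD:
            raise ValueError(f"{self.title!r} has no further pass to record")
        self.stage = Pass(self.stage + 1)
        if note:
            self.notes.append(note)
        if not keep_reading:
            self.dropped = True


def next_up(queue: list[Paper]) -> list[Paper]:
    """Papers still in play, deepest stage first so started papers get finished."""
    active = [p for p in queue if not p.dropped and p.stage is not Pass.THIRD]
    return sorted(active, key=lambda p: p.stage, reverse=True)


if __name__ == "__main__":
    queue = [Paper("Paper A"), Paper("Paper B"), Paper("Paper C")]
    queue[0].advance(keep_reading=True, note="Relevant; figures look promising")
    queue[1].advance(keep_reading=False, note="Out of scope after first pass")
    print([p.title for p in next_up(queue)])  # ['Paper A', 'Paper C']
```

The point of the sketch is that most papers should exit after the first pass, so the list of papers that need hours of your time stays short.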
Different tools/software to keep track of your pipeline of papers
If you’re serious about reading research papers, your list of papers will soon grow into an overwhelming stack that is hard to keep track of. Fortunately, there is software that can help us set up a mechanism to manage our research.
Here are a bunch of them that you can use:
- Mendeley [not free] — you can add papers directly to your library from your browser, import documents, generate references & citations, collaborate with fellow researchers, and access your library from anywhere. This is mostly used by experienced researchers.
- Zotero [free & open source] — Along the same lines as Mendeley but free of cost. You can make use of all the features but with limited storage space.
- Notion — a good fit if you are just starting out and want something lightweight in which you can organize your papers, jot down notes, and manage everything in one workspace. It may not compare with the tools above, but I personally feel comfortable using Notion, and I have created this board to keep track of my progress for now, which you can duplicate:
⚠️ Symptoms of reading a research paper
Reading a research paper can turn out to be frustrating, challenging, and time-consuming, especially when you’re a beginner. You might face the following harmless symptoms:
- Feeling dumb for not understanding a thing a paper says.
- Finding yourself pushing too hard to understand the math behind those proofs.
- Beating your head against the wall to wrap it around the number of acronyms used in the paper. Just kidding, you’ll have to look up those acronyms every now and then.
- Being stuck on one paragraph for more than an hour.
Here’s a complete list of emotions that you might undergo as explained by Adam Ruben in this article.
We should be all set to dive right in. Here’s a quick summary of what we have covered here:
- A research paper is a highly congested and bland manuscript that offers an in-depth explanation of a topic or problem along with the research process, proofs, explained results, and ideas for future work.
- Read research papers to develop a deep understanding of a topic/problem. From there, you can review papers as part of being a researcher, explore the domain and its problems to build a solution or startup around them, or simply keep abreast of the developments in your domain of interest.
- If you’re a beginner, start with exploration to soon find your path to goal-oriented research.
- To find good papers to read, you can use websites like arxiv-sanity and Google Research, and subreddits like r/MachineLearning.
- Reading approach — Use the 3-pass method to read a paper.
- Keep track of your research, notes, developments by using tools like Zotero/Notion.
- This can get overwhelming in no time. Make sure you start off easy and increment your load progressively.
Remember: Art is not a single method or step done over a weekend but a process of accomplishing remarkable results over time.