Seminal AI Research

A curated collection of influential papers that have shaped the field of artificial intelligence.

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

Published on June 12, 2017

Introduced the Transformer, an architecture built on self-attention rather than recurrence or convolution, which underlies most modern large language models.
Transformer, NLP, Architecture

Generative Adversarial Nets

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

Published on June 10, 2014

Proposed the GAN framework, which trains a generator against a discriminator in an adversarial game, leading to breakthroughs in image generation.
GANs, Generative Models, Computer Vision

Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Published on December 10, 2015

Introduced residual networks (ResNets), whose shortcut connections enable the training of much deeper neural networks than was previously practical.
ResNet, Computer Vision, Deep Learning

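
The shortcut idea behind this entry can be sketched in a few lines. This is a minimal illustration, not the paper's network: `residual_block` and `zero_transform` are hypothetical names, plain Python lists stand in for feature maps, and the block assumes the identity-shortcut form, output = F(x) + x.

```python
def residual_block(x, transform):
    """Return transform(x) + x, element by element (identity shortcut)."""
    fx = transform(x)
    if len(fx) != len(x):
        # The identity shortcut only works when shapes match.
        raise ValueError("transform must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(x):
    # Hypothetical stand-in for learned layers whose weights are all zero.
    return [0.0 for _ in x]


# With a zero transform the block reduces to the identity mapping,
# so stacking more such blocks cannot make the output worse than the input.
unchanged = residual_block([1.0, -2.0, 3.0], zero_transform)
```

Because the block only has to learn the difference from its input, a layer that has nothing useful to add can fall back to passing the input through.
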
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

Published on October 11, 2018

Introduced BERT, a language representation model that conditions on the text both before and after each word to capture its full context.
BERT, NLP, Language Model

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al.

Published on May 28, 2020

Introduced GPT-3 and demonstrated that large language models can perform a variety of tasks from a few examples supplied in the prompt, without fine-tuning.
GPT-3, LLM, Few-Shot Learning

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, et al.

Published on January 27, 2016

Detailed the AlphaGo system, which defeated European champion Fan Hui and went on to beat world champion Lee Sedol, a landmark achievement for AI.
AlphaGo, Reinforcement Learning, Game AI

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, Pieter Abbeel

Published on June 16, 2020

A foundational paper on diffusion models, which have become state-of-the-art for high-quality image generation.
Diffusion Models, Generative Models, Image Generation

Learning Transferable Visual Models From Natural Language Supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, et al.

Published on February 26, 2021

Introduced CLIP, a model that learns visual concepts from natural language, enabling powerful zero-shot image classification.
CLIP, Multimodal AI, Computer Vision

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, et al.

Published on October 22, 2020

Applied the Transformer architecture directly to images, challenging the dominance of CNNs in computer vision.
Vision Transformer, ViT, Computer Vision

Human-level control through deep reinforcement learning

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, et al.

Published on February 25, 2015

The Deep Q-Network (DQN) paper that demonstrated an AI learning to play Atari games from raw pixel data.
DQN, Reinforcement Learning, Deep Learning

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V. Le, Denny Zhou

Published on January 28, 2022

Showed that prompting LLMs to generate a series of intermediate reasoning steps improves their performance on complex tasks.
Prompt Engineering, LLM, Reasoning

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Published on December 3, 2012

The AlexNet paper, which helped kick-start the deep learning revolution by winning the 2012 ImageNet (ILSVRC) competition by a wide margin.
AlexNet, CNN, Computer Vision

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, Jimmy Ba

Published on December 22, 2014

Introduced the Adam optimization algorithm, which became one of the most widely used optimizers for training deep neural networks.
Optimization, Training, Algorithm

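
For readers who want to see what an Adam update involves, here is a minimal single-parameter sketch. The function names (`adam_step`, `minimize_quadratic`) and the toy objective are assumptions made for illustration; the default values for `lr`, `beta1`, `beta2`, and `eps` are the commonly quoted ones, and real frameworks add options not shown here.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps=2000, lr=0.01):
    """Minimize the toy objective f(theta) = (theta - 3)^2 starting from 0."""
    theta, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - 3.0)
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta
```

Calling `minimize_quadratic()` should return a value close to 3. Note that the very first step has magnitude close to `lr` regardless of how large the gradient is, which is one reason the method is tolerant of gradient scale.
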
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Alec Radford, Luke Metz, Soumith Chintala

Published on November 19, 2015

Introduced DCGANs, a convolutional GAN architecture with more stable training that was widely adopted for generating realistic images.
DCGAN, GANs, Computer Vision

Sequence to Sequence Learning with Neural Networks

Ilya Sutskever, Oriol Vinyals, Quoc V. Le

Published on September 10, 2014

A foundational paper on seq2seq models using LSTMs, which became a standard for machine translation and other NLP tasks.
Seq2Seq, LSTM, NLP

High-Resolution Image Synthesis with Latent Diffusion Models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

Published on December 20, 2021

The paper behind Stable Diffusion, which applies diffusion models in a latent space for efficient high-resolution image synthesis.
Stable Diffusion, Latent Diffusion, Image Generation

Constitutional AI: Harmlessness from AI Feedback

Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, et al.

Published on December 15, 2022

Proposed a method for training a harmless AI assistant by having it critique and revise its own responses according to a written set of principles (a constitution).
AI Safety, Alignment, LLM

A Neural Algorithm of Artistic Style

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

Published on August 26, 2015

Introduced neural style transfer, an algorithm that can separate and recombine the content and style of images.
Style Transfer, Art, Computer Vision

ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R. Narasimhan, Yuan Cao

Published on October 6, 2022

A paradigm that prompts LLMs to generate both reasoning traces and task-specific actions, enabling them to solve complex tasks.
Agents, Reasoning, LLM

Playing Atari with Deep Reinforcement Learning

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller

Published on December 19, 2013

The original DQN paper, presented at the NIPS 2013 Deep Learning Workshop, a precursor to the 2015 Nature paper.
DQN, Reinforcement Learning, Atari

Training language models to follow instructions with human feedback

Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, et al.

Published on March 4, 2022

The paper detailing InstructGPT, which uses Reinforcement Learning from Human Feedback (RLHF) to align language models.
RLHF, Instruction Following, Alignment

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio

Published on September 1, 2014

Introduced an attention mechanism to seq2seq models, a key step towards the Transformer architecture.
Attention, NLP, Machine Translation

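
The core of soft attention is a weighted average over encoder states, which the sketch below illustrates. It is a simplification under stated assumptions: the paper scores each encoder state with a small learned alignment network, whereas this example substitutes a plain dot product, and `softmax` and `attend` are hypothetical helper names.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step."""
    # Assumption: dot-product scoring stands in for the learned alignment model.
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

The decoder recomputes these weights at every output step, so each target word can draw on a different part of the source sentence instead of a single fixed-length summary.
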
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun

Published on June 4, 2015

A seminal paper in object detection that introduced Region Proposal Networks for fast and accurate object localization.
Object Detection, R-CNN, Computer Vision

Auto-Encoding Variational Bayes

Diederik P. Kingma, Max Welling

Published on December 20, 2013

Introduced the Variational Autoencoder (VAE), a popular type of generative model.
VAE, Generative Models, Autoencoder

GloVe: Global Vectors for Word Representation

Jeffrey Pennington, Richard Socher, Christopher D. Manning

Published on October 17, 2014

An influential word embedding technique that learns vector representations from a word-word co-occurrence matrix.
Word Embeddings, GloVe, NLP

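
To make the word-word co-occurrence matrix in this entry concrete, the sketch below counts how often pairs of words appear within a fixed window of each other. It covers only the counting step, not the embedding training that follows; the function name `cooccurrence_counts` and the `window` parameter are assumptions for illustration, and any distance-based weighting of counts is omitted.

```python
from collections import defaultdict


def cooccurrence_counts(tokens, window=2):
    """Count symmetric co-occurrences of words at most `window` positions apart."""
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        # Look back at the preceding `window` tokens; recording both
        # directions makes the resulting matrix symmetric.
        for j in range(max(0, i - window), i):
            counts[(word, tokens[j])] += 1.0
            counts[(tokens[j], word)] += 1.0
    return dict(counts)


# A sparse matrix stored as {(row_word, column_word): count}.
matrix = cooccurrence_counts("the cat sat on the mat".split(), window=1)
```

With `window=1`, only adjacent words are counted, so `("the", "cat")` appears in `matrix` while `("cat", "mat")` does not.
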
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré

Published on May 27, 2022

A fast and memory-efficient implementation of attention, crucial for training and serving long-context Transformers.
Transformer, Optimization, Performance

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Published on July 26, 2019

Showed that BERT was significantly undertrained and proposed an improved recipe for pretraining models.
RoBERTa, BERT, NLP

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, et al.

Published on December 5, 2017

Introduced AlphaZero, which mastered chess, shogi, and Go from scratch through self-play, given only the rules of each game and no human game data.
AlphaZero, Reinforcement Learning, Self-Play

PaLM: Scaling Language Modeling with Pathways

Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, et al.

Published on April 5, 2022

Detailed Google's 540-billion-parameter PaLM model and showed continued performance gains from scaling.
PaLM, LLM, Scaling

Zero-Shot Text-to-Image Generation

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever

Published on February 26, 2021

The paper that introduced the DALL-E model for generating images from text descriptions.
DALL-E, Text-to-Image, Generative Models

The Unreasonable Effectiveness of Recurrent Neural Networks

Andrej Karpathy

Published on May 21, 2015

A famous blog post that provided an accessible and influential demonstration of the power of RNNs and LSTMs.
RNN, LSTM, Blog Post

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi

Published on June 8, 2015

Introduced YOLO, an extremely fast object detection model that frames detection as a single regression problem.
YOLO, Object Detection, Computer Vision