Why it matters
AI is built on foundational papers — "Attention Is All You Need", the original GPT and BERT papers, ResNet, AlexNet, Stable Diffusion. These papers are how ideas spread and how practitioners learn what works. But paper discovery is fragmented across arXiv, Semantic Scholar, and Twitter threads. Structuring the most important papers as entities in Geo — connected to their authors, labs, and topics — makes the intellectual genealogy of AI visible and navigable.
What to publish
Create entities for the 200 most influential AI papers
For each paper, publish:
Title
Authors — link to Person entities (create if needed)
Abstract or summary
Publication date
Venue (NeurIPS, ICML, ICLR, CVPR, arXiv preprint, etc.)
arXiv URL
Citation count (approximate)
Affiliation of first author — link to Company/Institution entity
Key contribution (one sentence: what did this paper introduce or prove?)
Code repository URL if available
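The per-paper fields above can be sketched as a simple record. This is a minimal sketch: the class and field names are illustrative, not Geo's actual schema, and the citation count is an approximate placeholder.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PaperEntity:
    """One influential-paper entity; field names are illustrative, not a fixed Geo schema."""
    title: str
    authors: list[str]              # names to resolve into Person entities
    summary: str                    # abstract or short summary
    published: str                  # ISO date of first appearance
    venue: str                      # "NeurIPS", "arXiv preprint", ...
    arxiv_url: str
    citation_count: int             # approximate, with a noted source and date
    first_author_affiliation: str   # to resolve into a Company/Institution entity
    key_contribution: str           # one sentence
    code_url: Optional[str] = None  # repository, if available

attention = PaperEntity(
    title="Attention Is All You Need",
    authors=["Ashish Vaswani", "Noam Shazeer", "Niki Parmar", "Jakob Uszkoreit",
             "Llion Jones", "Aidan N. Gomez", "Lukasz Kaiser", "Illia Polosukhin"],
    summary="Proposes the Transformer, a sequence model built entirely on attention.",
    published="2017-06-12",
    venue="NeurIPS",
    arxiv_url="https://arxiv.org/abs/1706.03762",
    citation_count=100_000,  # placeholder order of magnitude, not a live figure
    first_author_affiliation="Google Brain",
    key_contribution="Introduced the Transformer, replacing recurrence with self-attention.",
    code_url="https://github.com/tensorflow/tensor2tensor",
)
```

One record per paper keeps the publishing step mechanical: each field maps to one property on the Geo entity.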
Link each paper to:
Authors as Person entities
Research lab or company — link to Company entity
Relevant Topics (e.g. transformers, attention, diffusion, RLHF, scaling laws)
Models or datasets introduced by the paper
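The links above are typed edges from the paper to other entities. A minimal sketch as (subject, relation, object) triples; the relation names here are hypothetical placeholders, not Geo's actual property identifiers.

```python
from collections import defaultdict

paper = "Attention Is All You Need"

# Relation names ("authored_by", etc.) are illustrative, not a real Geo vocabulary.
triples = [
    (paper, "authored_by", "Ashish Vaswani"),    # Person entity
    (paper, "from_lab", "Google Brain"),         # Company entity
    (paper, "about_topic", "transformers"),      # Topic entity
    (paper, "about_topic", "attention"),         # Topic entity
    (paper, "introduced", "Transformer"),        # Model entity
]

# Group outgoing edges by relation, e.g. to publish one property at a time
by_relation: dict[str, list[str]] = defaultdict(list)
for subj, rel, obj in triples:
    by_relation[rel].append(obj)
```

Keeping links as explicit triples (rather than strings buried in the summary) is what makes the genealogy queryable in both directions, e.g. "all papers from Google Brain about attention".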
Cover all major eras and areas:
Foundational (backpropagation, CNNs, LSTMs, word2vec)
Deep learning revolution (AlexNet, ResNet, batch normalization, dropout)
Transformers and LLMs (Attention Is All You Need, BERT, GPT series, scaling laws)
Generative models (GANs, VAEs, diffusion models, flow matching)
Alignment and safety (RLHF, Constitutional AI, InstructGPT)
Multimodal (CLIP, DALL-E, Flamingo, GPT-4V)
Scope
200 papers. Prioritize by citation count, practical impact, and historical significance. Include both classic foundational papers and recent high-impact work.
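One way to make "prioritize by citation count, practical impact, and historical significance" concrete is a weighted score. A sketch under stated assumptions: the weights, the log scaling, and the 0-5 editor ratings are all arbitrary choices, not part of the brief.

```python
import math

def priority_score(citations: int, practical_impact: int, historical: int,
                   w_cite: float = 0.5, w_impact: float = 0.3,
                   w_hist: float = 0.2) -> float:
    """Blend log-scaled citations with 0-5 editor ratings; weights are arbitrary."""
    cite_term = math.log10(max(citations, 1)) / 6  # ~1.0 at a million citations
    return (w_cite * cite_term
            + w_impact * practical_impact / 5
            + w_hist * historical / 5)

# Hypothetical candidates with made-up ratings, for illustration only
candidates = {
    "Attention Is All You Need": priority_score(100_000, 5, 5),
    "A niche workshop paper": priority_score(40, 1, 1),
}
shortlist = sorted(candidates, key=candidates.get, reverse=True)[:200]
```

The log scaling keeps a handful of mega-cited papers from drowning out historically significant work with fewer citations, which matches the brief's intent of mixing classics and recent high-impact papers.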
Potential sources
Semantic Scholar, Google Scholar, arXiv, Papers With Code, influential paper lists and surveys, conference best paper awards.
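Citation counts in particular can be pulled programmatically from the Semantic Scholar Graph API, which supports lookup by arXiv ID. A sketch of the request URL and the response shape; the JSON below is an illustrative placeholder, not live data.

```python
import json
from urllib.parse import quote

def citation_lookup_url(arxiv_id: str) -> str:
    """Semantic Scholar Graph API paper lookup by arXiv ID."""
    fields = "title,citationCount,venue"
    return (f"https://api.semanticscholar.org/graph/v1/paper/"
            f"arXiv:{quote(arxiv_id)}?fields={fields}")

url = citation_lookup_url("1706.03762")

# Illustrative response shape; citationCount here is a placeholder, not a live figure.
sample = json.loads(
    '{"paperId": "abc123", "title": "Attention Is All You Need", '
    '"citationCount": 100000, "venue": "NeurIPS"}'
)
approx_citations = sample["citationCount"]
```

Fetching the URL (with basic rate limiting) and recording the count plus a retrieval date would satisfy the "Citation count (approximate)" field without manual lookups.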