Why it matters

AI is built on foundational papers — "Attention Is All You Need", the original GPT and BERT papers, ResNet, AlexNet, Stable Diffusion. Papers are how ideas spread and how practitioners learn what works, yet discovery is fragmented across arXiv, Semantic Scholar, and Twitter threads. Structuring the most influential papers as entities in Geo — connected to their authors, labs, and topics — makes the intellectual genealogy of AI visible and navigable.

What to publish

  • Create entities for the 200 most influential AI papers

  • For each paper, publish:

    • Title

    • Authors — link to Person entities (create if needed)

    • Abstract or summary

    • Publication date

    • Venue (NeurIPS, ICML, ICLR, CVPR, arXiv preprint, etc.)

    • arXiv URL

    • Citation count (approximate)

    • Affiliation of first author — link to Company/Institution entity

    • Key contribution (one sentence: what did this paper introduce or prove?)

    • Code repository URL if available

  • Link each paper to:

    • Authors as Person entities

    • Research lab or company — link to Company entity

    • Relevant Topics (e.g. transformers, attention, diffusion, RLHF, scaling laws)

    • Models or datasets introduced by the paper

  • Cover all major eras and areas:

    • Foundational (backpropagation, CNNs, LSTMs, word2vec)

    • Deep learning revolution (AlexNet, ResNet, batch normalization, dropout)

    • Transformers and LLMs (Attention Is All You Need, BERT, GPT series, scaling laws)

    • Generative models (GANs, VAEs, diffusion models, flow matching)

    • Alignment and safety (RLHF, Constitutional AI, InstructGPT)

    • Multimodal (CLIP, DALL-E, Flamingo, GPT-4V)

Scope

200 papers. Prioritize by citation count, practical impact, and historical significance. Include both classic foundational papers and recent high-impact work.
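The three prioritization criteria can be blended into one sortable number so classics and recent high-impact work compete on the same scale. A hypothetical weighting, purely illustrative; the weights, the 0-1 judgment scales, and the log scaling are assumptions, not part of this spec:

```python
import math

def priority_score(citations: int, practical_impact: float,
                   historical_significance: float) -> float:
    """Combine the three selection criteria into one score.

    practical_impact and historical_significance are editorial judgments on a
    0-1 scale; citations are log-scaled so mega-cited papers don't drown out
    everything else.
    """
    citation_term = math.log10(max(citations, 1)) / 6.0  # ~1.0 near a million citations
    return 0.4 * citation_term + 0.3 * practical_impact + 0.3 * historical_significance

# Rank candidates by descending score and keep the top 200:
candidates = [
    ("Attention Is All You Need", priority_score(100_000, 1.0, 1.0)),
    ("An Obscure Workshop Paper", priority_score(40, 0.2, 0.1)),
]
top = sorted(candidates, key=lambda p: p[1], reverse=True)[:200]
```

The exact weights matter less than applying them consistently across all candidates before cutting to 200.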

Potential sources

Semantic Scholar, Google Scholar, arXiv, Papers With Code, influential paper lists and surveys, conference best paper awards.
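Semantic Scholar's public Graph API can seed most of the per-paper properties from an arXiv ID. A minimal sketch; the field list and response shape below are assumptions modeled on that API, and the sample payload is hand-written rather than fetched, so verify against the live API before relying on it:

```python
import json
from urllib.parse import quote
# from urllib.request import urlopen  # uncomment to fetch for real

FIELDS = "title,venue,publicationDate,citationCount,externalIds"

def semantic_scholar_url(arxiv_id: str) -> str:
    """Build a Graph API lookup URL for a paper identified by its arXiv ID."""
    return (f"https://api.semanticscholar.org/graph/v1/paper/"
            f"arXiv:{quote(arxiv_id)}?fields={FIELDS}")

def extract_properties(response: dict) -> dict:
    """Pull the checklist properties out of an API response payload."""
    arxiv_id = response.get("externalIds", {}).get("ArXiv")
    return {
        "title": response.get("title"),
        "venue": response.get("venue") or "arXiv preprint",
        "publication_date": response.get("publicationDate"),
        "citation_count": response.get("citationCount"),
        "arxiv_url": f"https://arxiv.org/abs/{arxiv_id}" if arxiv_id else None,
    }

# Offline example with a hand-written payload shaped like the API's JSON:
sample = json.loads('''{
    "title": "Attention Is All You Need",
    "venue": "Neural Information Processing Systems",
    "publicationDate": "2017-06-12",
    "citationCount": 100000,
    "externalIds": {"ArXiv": "1706.03762"}
}''')
props = extract_properties(sample)
```

Key contribution, first-author affiliation, and topic links will still need manual curation; the API covers the mechanical metadata, not the editorial judgment.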