Feedforward
Personal blogs from the AI community
- Simon Willison's Weblog (simonwillison.net)
Quoting Jannis Leidel
- One Useful Thing (oneusefulthing.org)
The Shape of the Thing
- Lossfunk Letters (letters.lossfunk.com)
Making Large Language Models Speak Tulu: Structured Prompting for an Extremely Low-Resource Language
- Token for Token (blog.jxmo.io)
How to train the best embedding model in the world
- Nicholas Carlini (nicholas.carlini.com)
How to win a best paper award
- Interconnects (interconnects.ai)
Dean Ball on open models and government control
- Ponder on Franz Louis Cesista (leloykun.github.io)
Frequency Domain Muon for Convolutional Neural Networks: Simplified
- Understanding AI (understandingai.org)
The Pentagon’s bombshell deal with OpenAI, explained
- Adjacent Possible (adjacentpossible.substack.com)
Back To The Garden
- Posts on Max Woolf's Blog (minimaxir.com)
An AI agent coding skeptic tries AI agent coding, in excessive detail
- Sebastian Raschka, PhD (sebastianraschka.com)
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026
- flurries of latent creativity (blog.singleton.io)
Dreamer – why we built it!
- Exploring Language Models (newsletter.maartengrootendorst.com)
The Story Behind the "RAG Pack"
- Andrej Karpathy blog (karpathy.github.io)
microgpt
- AI: A Guide for Thinking Humans (aiguide.substack.com)
On Brian Cantwell Smith and the Promise of AI
- antifragile systems (yongzx.substack.com)
CoT monitorability: why g-means and not F1?
- zhengdongwang.com (zhengdongwang.com)
A Straussian reading of The Adolescence of Technology
- Victoria Krakovna (vkrakovna.wordpress.com)
2025-26 New Year review
- Yi Tay (yitay.net)
2025 introspections: my year back at Google
- karpathy (karpathy.bearblog.dev)
2025 LLM Year in Review
- Shreya Shankar (sh-reya.com)
On the Consumption of AI-Generated Content at Scale
- Sam Altman (blog.samaltman.com)
Sora update #1
- Nick’s Substack (nickjiang.substack.com)
Teaching Algorithms in Ethiopia
- Neel Nanda (neelnanda.io)
MATS Applications Open (Due Aug 29)
- Chris McCormick (mccormickml.com)
Output Latent Spaces in Multihead Attention
- Blog - Jason Wei (jasonwei.net)
Life lessons from reinforcement learning
- atharva's blog (ksagar.bearblog.dev)
how we accidentally solved robotics by watching 1 million hours of YouTube
- Trenton Bricken (trentonbricken.github.io)
My Weird PhD Journey
- Lil'Log (lilianweng.github.io)
Why We Think
- Sander Dieleman (sander.ai)
Generative modelling in latent space
- jeremybernste.in (jeremybernste.in)
Deriving Muon
- Thonk From First Principles (thonking.ai)
Why PyTorch is an amazing place to work... and Why I'm Joining Thinking Machines
- Jeremy Jordan (jeremyjordan.me)
Training extremely large neural networks across thousands of GPUs.
- Maharshi's blog (maharshi.bearblog.dev)
Learning CUDA by optimizing matrix-vector multiplication (SGEMV) for cuBLAS-like performance - A worklog
- Chip Huyen (huyenchip.com)
Common pitfalls when building generative AI applications
- FOR OUR POSTERITY (forourposterity.com)
SITUATIONAL AWARENESS: The Decade Ahead
- ruder.io (ruder.io)
The Evolving Landscape of LLM Evaluation
- Jascha’s blog (sohl-dickstein.github.io)
Neural network training makes beautiful fractals
- Kemal Erdem Blog RSS Feed (erdem.pl)
Step by Step visual introduction to Diffusion Models.
- Brian Kitano (blog.briankitano.com)
Llama from scratch (or how to implement a paper without crying)
- siboehm (siboehm.com)
Can Function Inlining Affect Floating Point Outputs? Exploring FMA and Other Consistency Issues
- peterbloem.nl (peterbloem.nl)
What design can teach us
- Greg Brockman (blog.gregbrockman.com)
It's time to become an ML engineer