| Title | : | 2025 in LLMs so far, illustrated by Pelicans on Bicycles — Simon Willison |
| Duration | : | 18:30 |
| Viewed | : | 155,247 |
| Published | : | 09-07-2025 |
| Source | : | YouTube |
What's changed in the world of LLMs since the AIE World's Fair last year? A lot! I'll be taking full advantage of my role as a fiercely independent researcher to review the past 12 months of advances in the field and catch everyone up on the latest models, free from any influence of vendors or employers.

About Simon Willison

Simon Willison is the creator of Datasette, an open source tool for exploring and publishing data. He currently works full-time building open source tools for data journalism, built around Datasette and SQLite. Prior to becoming an independent open source developer, Simon was an engineering director at Eventbrite. Simon joined Eventbrite through their acquisition of Lanyrd, a Y Combinator funded company he co-founded in 2010. He is a co-creator of the Django Web Framework, and has been blogging about web development and programming since 2002 at simonwillison.net.

Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter

Timestamps:

00:00 A review of the last six months in LLMs

01:08 The "Pelican Riding a Bicycle" Benchmark

02:10 AWS Nova and Llama 3.3 70B

03:30 DeepSeek and its impact

05:42 Mistral Small 3 and the rise of local models

06:45 Claude 3.7 Sonnet and GPT 4.5

08:44 Gemini 2.5 Pro, GPT-4o, and Llama 4

11:21 GPT 4.1, O3, and O4 Mini

12:05 Claude 4 and other recent releases

14:11 Amusing and concerning LLM bugs

16:58 The power of tools and reasoning in AI

17:41 Prompt injection and the "Lethal Trifecta"

18:11 The future of the pelican benchmark