Posts by Year

2026

What does Opus 4.7 verify against?

2 minute read

Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...

Build to learn 🔗

less than 1 minute read

Marty Cagan on the distinction between product discovery (“build to learn”) and product delivery (“build to earn”), and why AI makes the former more importan...

Math and code got there first

1 minute read

Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

MCP vs. Skills 🔗

less than 1 minute read

A good breakdown of the MCP vs. Skills tradeoffs from David Mohl:

The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...

The inbox apocalypse 🔗

less than 1 minute read

I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by h...

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

Agents, bugs, and the statistical editor

1 minute read

Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...


2025

MCP lets you ship faster 🔗

less than 1 minute read

I’ve been thinking a lot about this quote from Steve Krouse (via Simon Willison):

Python: The Documentary

less than 1 minute read

Learning Python for data science seven years ago changed the trajectory of my career. This documentary is a great behind-the-scenes view of the people who br...

Are content credentials going mainstream? 🔗

less than 1 minute read

The Content Authenticity Initiative is a collaborative effort to bring transparency to digital media. By using cryptographic signatures and standardized meta...

Piloting Claude for Chrome 🔗

less than 1 minute read

I’m not sure if we’re ready for agentic browser control. Yes, you can click each time to accept the risk, but how many of us read the T&Cs before we clic...

Markitdown 🔗

less than 1 minute read

This looks like a handy package for converting documents (PDF, .docx, .pptx, and more) to .md. There’s also an MCP server so you can use it with your LLM.

The bad boy of bar charts: William Playfair 🔗

less than 1 minute read

Via Clarke & Esposito, an entertaining sketch by Mike Woodward of William Playfair, who invented bar charts and pie charts in between misadventur...

Stop Building AI Agents 🔗

less than 1 minute read

Hugo Bowne-Anderson argues that agentic workflows shouldn’t be your first choice because of their increased complexity and instability. Remember that GenAI ...

Finding secrets in ‘Oops’ commits 🔗

less than 1 minute read

As someone who gets confused beyond simple commits and pushes, this approach of spelunking for thought-to-be-deleted secrets in “oops” commits is a little sc...
