AI model behavior, versioned
Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:
Anthropic is the sole major AI laboratory publishing system prompts for consumer-facing chat interfaces, with archives extending back to Claude 3.
Worth noting that if you’re using the API you’re likely writing your own system prompt anyway, so this mostly matters for claude.ai users.

The harder problem is underlying model behavior: when a researcher publishes results generated with Claude Opus 4.6, what would it take to reproduce them in two years? Code Ocean does something like this for computational environments: pin the entire runtime alongside the paper, executable on demand. Nature has integrated it into peer review. Nobody is doing the equivalent for AI model versions in research workflows yet.
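Doing for model versions what Code Ocean does for runtimes would mean recording, at minimum, the exact model identifier and the system prompt in force when the results were generated. A minimal sketch of such a provenance record; the model ID and prompt here are illustrative placeholders, not real Anthropic API identifiers:

```python
import hashlib
import json
import platform

# Placeholder values -- NOT guaranteed to match any real Anthropic
# API model identifier or deployed system prompt.
MODEL_ID = "claude-opus-4-6"
SYSTEM_PROMPT = "You are a careful research assistant."

def provenance_record(model_id: str, system_prompt: str) -> dict:
    """Capture the model pin and a hash of the system prompt so a
    published result can later be checked against the exact
    configuration that produced it."""
    return {
        "model_id": model_id,
        # Hash rather than store the prompt: enough to verify a match
        # without republishing potentially long or private text.
        "system_prompt_sha256": hashlib.sha256(
            system_prompt.encode("utf-8")
        ).hexdigest(),
        # Environment details, in the spirit of runtime pinning.
        "python_version": platform.python_version(),
    }

record = provenance_record(MODEL_ID, SYSTEM_PROMPT)
print(json.dumps(record, indent=2))
```

A record like this, archived next to the paper's data, is the cheap first step; true reproducibility would additionally require the provider to keep the pinned model version callable, which is the part nobody offers yet.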