AI model behavior, versioned
Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:
Anthropic is the sole major AI laboratory publishing system prompts for consumer-facing chat interfaces, with archives extending back to Claude 3.
Worth noting that if you’re using the API you’re likely writing your own system prompt anyway, so this mostly matters for claude.ai users.

The harder problem is underlying model behavior: when a researcher publishes results generated with Claude Opus 4.6, what would it take to reproduce them in two years? Code Ocean does something like this for computational environments: pin the entire runtime alongside the paper, executable on demand. Nature has integrated it into peer review. Nobody is doing the equivalent for AI model versions in research workflows yet.
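Doing for model versions what Code Ocean does for runtimes would mean recording, at minimum, the exact model identifier and the system prompt in force when the results were generated. A minimal sketch of such a provenance record; the model ID and prompt here are illustrative placeholders, not real Anthropic API identifiers:

```python
import hashlib
import json
import platform

# Placeholder values -- NOT guaranteed to match any real Anthropic
# API model identifier or deployed system prompt.
MODEL_ID = "claude-opus-4-6"
SYSTEM_PROMPT = "You are a careful research assistant."

def provenance_record(model_id: str, system_prompt: str) -> dict:
    """Capture the model pin and a hash of the system prompt so a
    published result can later be checked against the exact
    configuration that produced it."""
    return {
        "model_id": model_id,
        # Hash rather than store the prompt: enough to verify a match
        # without republishing potentially long or private text.
        "system_prompt_sha256": hashlib.sha256(
            system_prompt.encode("utf-8")
        ).hexdigest(),
        # Environment details, in the spirit of runtime pinning.
        "python_version": platform.python_version(),
    }

record = provenance_record(MODEL_ID, SYSTEM_PROMPT)
print(json.dumps(record, indent=2))
```

A record like this, archived next to the paper's data, is the cheap first step; true reproducibility would additionally require the provider to keep the pinned model version callable, which is the part nobody offers yet.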