The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-designed study and the result is credible — for social media fact-checking, which is what it’s testing. Whether something similar could work for scientific literature is a different question, and the answer depends entirely on what you build underneath it.

Social media claims are mostly atomic: a politician said something, a statistic is cited correctly or not, an event happened or didn’t. You can check those against a corpus. Scientific claims are relational — they assert relationships between entities distributed across thousands of papers, and the “truth” of the claim is a property of the network, not any individual document. Asking an LLM to fact-check “compound X inhibits pathway Y at therapeutic doses” requires knowing what the literature establishes about X’s mechanism, Y’s context-dependence, and whether the relevant concentrations have ever appeared in the same study. A retrieval system can find text that mentions both; it can’t tell you whether the relationship holds.

This is precisely what knowledge graphs were built for. Don Swanson demonstrated it in 1986: he found that fish oil and Raynaud’s syndrome research had never cited each other, yet traversing the relationships — fish oil inhibits platelet aggregation, platelet aggregation implicated in Raynaud’s — produced a testable hypothesis. No document stated it. The connection existed only in the graph. A clinical trial three years later confirmed it.
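Swanson's move is mechanically simple: a two-hop search over extracted relationships, surfacing A-to-C pairs that are connected only through an intermediate B. A toy sketch, with hypothetical edge tuples standing in for literature-mined facts:

```python
# Toy version of Swanson's ABC model. Edges are (subject, relation,
# object) triples extracted from papers; the edge set here is
# illustrative, not real literature mining.
edges = {
    ("fish oil", "inhibits", "platelet aggregation"),
    ("platelet aggregation", "implicated_in", "raynauds syndrome"),
    ("aspirin", "inhibits", "platelet aggregation"),
}

def abc_candidates(edges):
    """Yield (a, b, c) where a->b and b->c exist but a->c does not."""
    direct = {(a, c) for a, _, c in edges}
    by_source = {}
    for a, _, c in edges:
        by_source.setdefault(a, set()).add(c)
    for a, _, b in edges:
        for c in by_source.get(b, ()):
            if (a, c) not in direct and a != c:
                yield (a, b, c)

for a, b, c in abc_candidates(edges):
    print(f"hypothesis: {a} -> {c} (via {b})")
```

The fish-oil/Raynaud's pair falls out of the traversal even though no single edge (no single paper) states it, which is the whole point.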

Thirty years on, Himmelstein et al. built Hetionet: 47,000 nodes, 2.25 million relationships, 29 biomedical databases integrated into a single graph. They used it to generate drug repurposing predictions across 209,000 compound-disease pairs. Most of those candidates couldn’t be found by searching the literature because no paper had connected them — that’s what made them candidates worth testing.

The reason I keep coming back to this is that “fact-checking” is actually the least interesting thing a knowledge graph enables. Verification looks backward: does this claim hold given what we know? Discovery looks forward: what does the structure of existing knowledge imply that nobody has tested yet? Swanson and Himmelstein were doing the second thing. An AI system built on structured biomedical knowledge could do both simultaneously — flagging claims that contradict established relationships while surfacing hypotheses that the graph supports but the literature hasn’t yet stated.

The infrastructure question is the hard one, and also the interesting one. Building a knowledge graph like Hetionet is, in a real sense, constructing a digital twin of the scientific record — a computable representation of what the literature actually establishes about how the world works. Ground truth in science is still reality, just harder to access than a test suite. A well-constructed knowledge graph is the closest thing we have to making it queryable. Agents can already find errors faster than humans can triage them — the bottleneck isn’t computation, it’s the structured representation of what science actually knows. That’s a much larger project than building a better Community Notes, and a much more valuable one.

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engineering the resolution-dependent carrier frequencies and building a spectral codebook for direct subtraction. You can already upload an image and strip the watermark through consumer web tools. The technical sophistication has just moved into the open.

This is the pattern with individual detectors. AI writing detectors misclassify writing by non-native English speakers as AI-generated more than 61% of the time; multiple universities have stopped using them in misconduct cases as a result. Watermarks added at generation time get removed once the algorithm can be modeled. Any single signal is exploitable given enough motivation.

A camera that supports C2PA points at something better. It embeds a signed manifest in every photo (device, lens, timestamp, location, edit history), cryptographically signed with a private key stored in a hardware secure element and certified at manufacture. You can strip the manifest, but you can't forge it. Provenance doesn't ask whether the image has statistical properties consistent with AI generation; it asks whether we can verify where the image came from. There's also a more basic asymmetry: it's easier to prove something is there than to prove something isn't. A valid C2PA credential is verifiable. "No watermark detected" is an absence, and that absence means less and less as removal tools proliferate. Criminal evidence works this way: a documented chain of custody from the moment evidence is collected to the moment it's presented in court. But science doesn't work that way, and building C2PA-equivalent provenance into the full research software stack (every Python environment, every lab instrument, every figure-rendering tool) is a long way from practical.
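The strip-versus-forge asymmetry is worth making concrete. Real C2PA uses X.509 certificates and COSE signatures anchored in hardware; the sketch below uses a stdlib HMAC purely as a stand-in for "a key the attacker doesn't hold":

```python
import hashlib
import hmac
import json

# Stand-in sketch only: real C2PA signing is asymmetric (X.509/COSE,
# key in a hardware secure element). HMAC here just illustrates why
# a manifest can be stripped but not forged.
DEVICE_KEY = b"secret-held-in-secure-element"  # hypothetical key

def sign_manifest(manifest: dict) -> bytes:
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(DEVICE_KEY, payload, hashlib.sha256).digest()

def verify(manifest: dict, signature: bytes) -> bool:
    return hmac.compare_digest(sign_manifest(manifest), signature)

manifest = {"device": "CameraX", "timestamp": "2026-01-01T12:00:00Z"}
sig = sign_manifest(manifest)

print(verify(manifest, sig))                    # presence is provable
print(verify({**manifest, "device": "Forged"}, sig))  # forgery fails
```

An attacker can delete the (manifest, signature) pair entirely, but without the key there is no way to produce a valid signature for altered metadata, which is exactly the property "no watermark detected" lacks.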

What science has instead of chain of custody is trust, or more precisely, the question of whether trust is warranted. The right frame for research integrity tooling isn't "detect the artifact"; it's "figure out whether we can trust this author."

I spent time building AI/ML tools for this problem. What seems to work better than any single detector is a complete profile: how many signals does this paper trigger across the detection stack, and is this paper anomalous for this author? Are they writing outside their normal domain, with patterns that deviate from their prior work? When enough signals deviate, a human takes a second look.
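One way to operationalize "enough signals deviate" is to score each paper against the author's own baseline. A minimal sketch, assuming per-paper signal scores already exist; the signal names, values, and threshold are all hypothetical:

```python
import statistics

# Hypothetical profile check: compare this paper's detector signals
# against the author's prior papers. Signal names and the threshold
# are illustrative, not a real detection stack.
def anomaly_score(paper_signals: dict, author_history: dict) -> float:
    """Mean absolute z-score of this paper against the author's priors."""
    zs = []
    for name, value in paper_signals.items():
        prior = author_history.get(name)
        if prior and len(prior) >= 2:
            mu, sd = statistics.mean(prior), statistics.stdev(prior)
            if sd > 0:
                zs.append(abs(value - mu) / sd)
    return statistics.mean(zs) if zs else 0.0

history = {"image_reuse": [0.0, 0.1, 0.0],
           "topic_distance": [0.20, 0.25, 0.30]}
paper = {"image_reuse": 0.9, "topic_distance": 0.8}

if anomaly_score(paper, history) > 3.0:  # illustrative threshold
    print("route to human review")
```

The point isn't the arithmetic; it's that the decision boundary is per-author rather than per-artifact, so gaming one signal doesn't defeat the whole profile.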

The same lesson is sitting in the AI writing detector story. They became a fairness problem as soon as they were deployed visibly at scale, and their failure modes were immediately gamed. A profile-based approach — one that builds up a picture of what this author’s work looks like and flags when something doesn’t fit — is slower to build and harder to explain to an editorial board. But it’s the one that’s actually asking the right question.

The inbox apocalypse

less than 1 minute read

I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by humans is just going to be overwhelmed and flooded with AI submissions.

— Kevin Roose, Hard Fork, Apr 3, 2026

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because it was producing too many real findings for the security community to triage. Among them: a 27-year-old bug in OpenBSD and a 16-year-old bug in FFmpeg.

I wrote last week that the real design challenge isn’t building the checking agent — it’s building the triage layer around what it finds. Mythos is the confirmation at scale.

The question is whether this translates to scientific literature — and the honest answer is: messily, but yes.

Code has relatively clean ground truth. Apply a patch, run the tests, the bug is fixed or it isn’t. Scientific research doesn’t have that. A protocol that works in one lab might not work in the next; full reproducibility requires someone to actually run the experiment again. That’s expensive in ways that merging a patch is not.

But ground truth in science is still reality — it’s just harder to access than a test suite. And there’s an intermediate target that doesn’t require re-running anything. Statistical analysis is internal to the paper. Does the reported p-value follow from the sample size and test described? Do the confidence intervals match the means and standard deviations in the table? These checks don’t require experimental replication. And decade-old errors of exactly this kind are almost certainly sitting in the literature, undetected for the same reason they sat in OpenBSD — nobody was systematically looking.
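A check like this is pure arithmetic over reported numbers. A minimal sketch, assuming the paper's 95% CI was computed as mean ± 1.96 · SD/√n (a normal approximation; a real checker would use the t-distribution and whatever test the methods section actually describes):

```python
import math

# Internal-consistency check: does a reported 95% CI follow from the
# reported mean, SD, and n? Assumes the normal approximation
# CI = mean +/- 1.96 * SD / sqrt(n); values below are made up.
def ci_consistent(mean, sd, n, reported_ci, tol=0.01):
    half_width = 1.96 * sd / math.sqrt(n)
    lo, hi = mean - half_width, mean + half_width
    return (abs(lo - reported_ci[0]) <= tol
            and abs(hi - reported_ci[1]) <= tol)

# A table reports mean 10.0, SD 2.0, n = 25, 95% CI (9.22, 10.78)
print(ci_consistent(10.0, 2.0, 25, (9.22, 10.78)))  # True: consistent
print(ci_consistent(10.0, 2.0, 25, (8.50, 11.50)))  # False: flag it
```

Run over a table, a mismatch doesn't prove misconduct; it's exactly the kind of cheap, high-recall signal that feeds the triage layer described above.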

The triage problem will be worse, though. Security vulnerabilities triage against a clear standard: exploitable or not, patched or not. A flagged p-value inconsistency might be a transcription error, a typo in supplemental data, or a methodological choice the agent doesn’t have context for. Each one takes longer to adjudicate than a CVE. The validation bottleneck is already real in security; in publishing it would be larger and slower and harder to staff.

Statistics seems like the place to start. The 27-year-old OpenBSD bug had a patch process waiting once someone found it. Publishing has a correction process too — it’s just not built yet for the volume an agent would generate.

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD research paper. Alice did it the hard way — reading carefully, debugging, getting confused, building real understanding. Bob used AI agents to skip all of that and produced an equivalent-looking output. By the metrics the institution has, they’re interchangeable. In practice, one of them knows something.

The failures are the curriculum… Every hour you spend confused is an hour you spend building the infrastructure inside your own head.

I’ve been arguing for a while that academic publishing is trying to do two jobs at once: expand the frontiers of human knowledge, and certify individuals for hiring, tenure, and promotion. Those jobs have always been in some tension — a paper that gets someone promoted isn’t necessarily a paper that advances a field — but they were compatible enough that nobody had to choose explicitly.

The Alice-and-Bob framing makes that tension concrete. If Bob’s paper passes peer review, he gets the credential. But the knowledge the field gains from his paper is built on a foundation that doesn’t include Bob actually understanding it. That matters when someone tries to build on his work and Bob can’t help them.

I’ve noticed a smaller version of this myself. A presentation built with AI help but without genuine thinking behind it holds together until someone asks a question. Then you find out quickly what you actually know. For a paper, the equivalent moment is the job interview. The science might be perfectly sound — but if Bob can’t talk about why he made the methodological choices he made, the credential stops working.

The urgency is that this isn’t a gradual shift. The volume of AI-assisted submissions is going to overwhelm human-powered editorial systems within the next year or two, and publishers will reach for AI to manage the load. At that point the two-jobs tension becomes unavoidable: the systems processing the inbox won’t be able to tell Alice from Bob either.

Maybe the answer is separate venues: one optimized for credentialing, one for genuine knowledge expansion with review processes designed to assess whether the author actually understands what they found. That’s probably not happening soon. But every publisher is already implicitly choosing which function to optimize for, every time they decide what AI assistance in manuscripts or reviews is acceptable.