Posts by Tag

GenAI

The inbox apocalypse 🔗

less than 1 minute read

I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by h...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

Agents, bugs, and the statistical editor

1 minute read

Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...

Stop Building AI Agents 🔗

less than 1 minute read

Hugo Bowne-Anderson argues that agentic workflows shouldn’t be your first choice because of their increased complexity and instability. Remember that GenAI ...

research integrity

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

Agents, bugs, and the statistical editor

1 minute read

Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...

Are content credentials going mainstream? 🔗

less than 1 minute read

The Content Authenticity Initiative is a collaborative effort to bring transparency to digital media. By using cryptographic signatures and standardized meta...

AI

AI model behavior, versioned 🔗

less than 1 minute read

Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:

The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...

llm

What does Opus 4.7 verify against?

2 minute read

Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

Agents, bugs, and the statistical editor

1 minute read

Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...

MCP

MCP vs. Skills 🔗

less than 1 minute read

A good breakdown of the MCP vs. Skills tradeoffs from David Mohl:

MCP lets you ship faster 🔗

less than 1 minute read

I’ve been thinking a lot about this quote from Steve Krouse (via Simon Willison):

Markitdown 🔗

less than 1 minute read

This looks like a handy package for converting documents (PDF, .docx, .pptx, and more) to .md. There’s also an MCP server so you can use it with your LLM.

ai

The methods section is not a recipe

2 minute read

Via Ethan Mollick, a new paper on agentic reproduction of social-science results asks whether AI agents can reproduce published results from the paper and da...

What does Opus 4.7 verify against?

2 minute read

Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...

Build to learn 🔗

less than 1 minute read

Marty Cagan on the distinction between product discovery (“build to learn”) and product delivery (“build to earn”), and why AI makes the former more importan...

Math and code got there first

1 minute read

Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...

agents

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...

Stop Building AI Agents 🔗

less than 1 minute read

Hugo Bowne-Anderson argues that agentic workflows shouldn’t be your first choice because of their increased complexity and instability. Remember that GenAI ...

peer review

The inbox apocalypse 🔗

less than 1 minute read

I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by h...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

Agents, bugs, and the statistical editor

1 minute read

Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...

funny

China

python

Python: The Documentary

less than 1 minute read

Learning Python for data science seven years ago changed the trajectory of my career. This documentary is a great behind-the-scenes view of the people who br...

generative AI

academic publishing

The inbox apocalypse 🔗

less than 1 minute read

I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by h...

Publishing’s two jobs

2 minute read

There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...

scholarly publishing

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...

Ground truth is reality

1 minute read

Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...

knowledge-graphs

What does Opus 4.7 verify against?

2 minute read

Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

reproducibility

The methods section is not a recipe

2 minute read

Via Ethan Mollick, a new paper on agentic reproduction of social-science results asks whether AI agents can reproduce published results from the paper and da...

AI model behavior, versioned 🔗

less than 1 minute read

Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:

TIL

Python

GitHub Actions

PyPi

jekyll

GitHub Copilot

A2A

AWS

Grok

preprint

LLM

OpenAI

markdown

Markitdown 🔗

less than 1 minute read

This looks like a handy package for converting documents (PDF, .docx, .pptx, and more) to .md. There’s also an MCP server so you can use it with your LLM.

FDA

hallucination

DeepSeek

education

prompt injection

Piloting Claude for Chrome 🔗

less than 1 minute read

I’m not sure if we’re ready for agentic browser control. Yes, you can click each time to accept the risk, but how many of us read the T&Cs before we clic...

Claude

Piloting Claude for Chrome 🔗

less than 1 minute read

I’m not sure if we’re ready for agentic browser control. Yes, you can click each time to accept the risk, but how many of us read the T&Cs before we clic...

Microsoft

GPT-5

image integrity

Are content credentials going mainstream? 🔗

less than 1 minute read

The Content Authenticity Initiative is a collaborative effort to bring transparency to digital media. By using cryptographic signatures and standardized meta...

Switzerland

open weights

vibe coding

workslop

uv

pip

APIs

MCP lets you ship faster 🔗

less than 1 minute read

I’ve been thinking a lot about this quote from Steve Krouse (via Simon Willison):

AI detection

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...

content provenance

Provenance, not detection

2 minute read

Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...

knowledge graphs

The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...

research intelligence

The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...

literature-based discovery

The knowledge graph as digital twin

2 minute read

A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...

NLP

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

relation-extraction

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

pharma

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

arxiv

Where graphs supplement LLMs 🔗

less than 1 minute read

Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...

drug-discovery

Math and code got there first

1 minute read

Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...

materials-science

Math and code got there first

1 minute read

Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...

self-driving-labs

Math and code got there first

1 minute read

Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...

product-management

Build to learn 🔗

less than 1 minute read

Marty Cagan on the distinction between product discovery (“build to learn”) and product delivery (“build to earn”), and why AI makes the former more importan...

research-intelligence

What does Opus 4.7 verify against?

2 minute read

Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...

publishing

AI model behavior, versioned 🔗

less than 1 minute read

Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:

data quality

research reproducibility

open science

research

scientific-discovery

chemistry

The methods section is not a recipe

2 minute read

Via Ethan Mollick, a new paper on agentic reproduction of social-science results asks whether AI agents can reproduce published results from the paper and da...

methods

The methods section is not a recipe

2 minute read

Via Ethan Mollick, a new paper on agentic reproduction of social-science results asks whether AI agents can reproduce published results from the paper and da...
