The Median Is Not a Discovery 🔗
Via Ethan Mollick:
Via Ethan Mollick:
Markus Englund scanned 600 datasets on Dryad and found serious copy-paste errors in 18 of them — projecting around 700 cases across the full repository of ~2...
Via Simon Willison, who has turned Anthropic’s published system prompts into a git archive that diffs changes across model releases:
Marty Cagan on the distinction between product discovery (“build to learn”) and product delivery (“build to earn”), and why AI makes the former more importan...
Graph-based parsers appear to outperform LLMs on relation extraction — and the gap widens as relational complexity grows. A preprint out today from Gajo et a...
A good breakdown of the MCP vs. Skills tradeoffs from David Mohl:
I have started talking about the inbox apocalypse that is going to hit this year, where everything that is normally sort of reviewed and bottlenecked by h...
I’ve been thinking a lot about this quote from Steve Krouse (via Simon Willison):
I agree with this post 100%. I switched to uv about six months ago, and it has made package management in Python much easier.
Let’s be considerate about how we use GenAI to write emails, articles, or blog posts. When I first started, it was fun: Wow, I can crank out a 750-word essay...
Via Simon Willison:
Is it satire? Is it an art project?
Via Simon Willison:
Switzerland released their own Llama-3-class model, trained exclusively on public sources while respecting crawler opt-out requests.
The Content Authenticity Initiative is a collaborative effort to bring transparency to digital media. By using cryptographic signatures and standardized meta...
China released their new “AI Plus” strategy document last week when I was in Beijing. Here is some context and a translation of the policy document (via Bene...
I’m not sure if we’re ready for agentic browser control. Yes, you can click each time to accept the risk, but how many of us read the T&Cs before we clic...
While many educators in the West see AI as a threat they have to manage, more Chinese classrooms are treating it as a skill to be mastered. In fact, as th...
The FDA’s head of AI, Jeremy Walsh, admitted that Elsa can hallucinate nonexistent studies. “Elsa is no different from lots of [large language models] ...
This looks like a handy package for converting documents (PDF, .docx, .pptx, and more) to .md. There’s also an MCP server so you can use it with your LLM.
A fascinating look into OpenAI the company:
The AP article quotes Simon Willison:
I haven’t had to figure out AWS IAM or review the Cost Explorer in a hot minute.
Via @ErikJonker@mastodon.social:
Paul’s post got me about 80% of the way there, but I was still having issues with
Via Clarke & Esposito, an entertaining sketch written by Mike Woodward of William Playfair, who invented bar charts and pie charts in between misadventur...
Hugo Bowne-Anderson argues that agentic workflows shouldn’t be your first choice because of their increased complexity and instability. Remember that GenAI ...
This looks very impressive, using LLMs to not only survey the literature but also synthesize the results and generate new statistically significant findings.
This is more of a survey than a critical review, and the equations on pages 8–10 seem unnecessary, but a potentially useful compilation map of what’s new as ...
As someone who gets confused beyond simple commits and pushes, this approach of spelunking for thought-to-be-deleted secrets in “oops” commits is a little sc...
Via Jeff Triplett:
There’s a piece on the ergosphere blog worth reading this week about what the author calls the Alice-and-Bob problem. Alice and Bob both produce a PhD resear...
Nicholas Carlini, a research scientist at Anthropic, ran a simple bash script that looped over every file in the Linux kernel and asked Claude Code to look f...
Have you tried M365 Copilot lately? It has gotten seriously good.
As AI writing assistants become more prevalent in academic and professional settings, we face a growing challenge: how do we maintain the integrity of the sc...
Researchers recently published a method for removing Google’s SynthID watermarks from AI-generated images with near-invisible quality loss, by reverse-engine...
Anthropic restricted Claude Mythos to vetted security researchers this week via Project Glasswing — not because it was producing false positives, but because...
A new paper from Wharton finds that LLM-generated Community Notes on X are rated more helpful than human-written ones across 108,000+ ratings. It’s a well-de...
Claude Opus 4.7 looks like a genuine step forward, and one line in the announcement caught my attention: the model “devises ways to verify its own outputs.” ...
Via Ethan Mollick, a new paper on agentic reproduction of social-science results asks whether AI agents can reproduce published results from the paper and da...
Quanta Magazine has a piece this week on how AI has changed mathematical research — AlphaEvolve, LLMs as collaborative partners, problems that used to take m...
Learning Python for data science seven years ago changed the trajectory of my career. This documentary is a great behind-the-scenes view of the people who br...
Today I learned how to set up a complete CI/CD pipeline for Python packages using modern tooling. As a first-time package publisher, I wanted to make sure I ...