I just saw these numbers in Simon Willison’s write-up:

Tool Search term Total PRs Merged PRs % Merged
Claude Code is:pr in:body “Generated with Claude Code” 146,000 123,000 84.2%
GitHub Copilot is:pr author:copilot-swe-agent[bot] 247,000 152,000 61.5%
Codex Cloud is:pr in:body “chatgpt.com” label:codex 1,900,000 1,600,000 84.2%

I switched from Claude Code to Codex two weeks ago. I do prefer Codex, but seeing it claim 13x more merged PRs than Claude still made me pause.

Before reading that as a productivity verdict, it’s worth sanity-checking the size of each user base. ChatGPT reportedly has 700M weekly users, while Claude draws about 12M (ZDNet). Coding is the top use case for Claude at 36%, whereas ChatGPT sits near 4% (decision.substack.com).

If we use those ratios as a rough proxy, Claude ends up with roughly 3.4M weekly coding users (12M * 36%) and Codex lands closer to 28M (700M * 4%). That would mean Codex has about eight times as many developers creating PRs using it. Even after accounting for that, Codex users would still appear to produce about 60% more Merged PRs than Claude users (13x vs. 8x).

This is the first cross-service metric I’ve seen that hints at Codex users shipping more merged PRs than Claude. Whether that’s down to product quality, distribution, or different usage patterns is still an open question. But it’s a striking data point either way.