Claude Code Reviews with Fable

Fable came out and then disappeared a couple of days later. In that short time, I managed to run a lot of repository reviews with it, and was really happy with the results. I’ve found and fixed around 400 bugs and 400+ other issues (performance, simplifications, modernizations) across dozens of my projects, making several hundred PRs with a nearly 100% merge rate on decided PRs.

Repo code review with Claude Fable

Disclaimer

I am a core maintainer on most of these projects, and for the few where I’m not, I got permission before starting. Don’t do this to someone without first making sure they are interested!

This is a followup to my earlier post on starting with agentic AI. I’ve been accepted into the Claude Code OSS program, so I have access to Claude Max for six months. For OSS work, that’s a ton of compute; even with the 2x multiplier on Fable, I can’t reasonably hit the limit. So I decided to try a simple prompt with Fable:

Review this project for bugs, performance, simplifications, and modernizations

I was shocked at the results. It found dozens of bugs, lots of performance fixes, and other nice cleanups on every project I tried it on. On scikit-build-core, which I wrote from scratch, it found 4 serious bugs (mostly unreleased), 11 smaller ones, a large batch of tiny ones, and 8 simplification opportunities.

For most of these, I followed it up with a prompt like this:

Put this into an issue, then open up draft PRs for these, use Sonnet or Opus
based on the task complexity. Group several into one PR when it makes sense. The
PRs should reference the issue.

(Actually, I usually /copy the response into a new issue myself, then tell it to reference that issue when making PRs. I always add my “AI text below” disclaimer.)

This makes a batch of PRs. Note that I didn’t have to ask it to use subagents or worktrees; it figured that out (originally I was adding instructions like that). The grouping is a matter of taste; on some repos I didn’t allow grouping, and sometimes I guided it. I very rarely had to skip a suggestion, basically only when it recommended bumping the Python floor.

I have ~/.claude/CLAUDE.md that looks like this:

If you make a commit, follow conventional commits and add a trailer:
`Assisted-by: <harness>:<model>`, where `<harness>` is the current agent harness
(like ClaudeCode), and `<model>` is the AI model (Like claude-opus-4.8). You
don't need to add a coauthored-by Claude when you have this.

Prefix PR descriptions and comments on PRs with the line ":robot: _AI text
below_ :robot:" to indicate you are an agent speaking on a user's behalf.

That’s critical to ensure proper commit trailers and keep Claude from pretending it’s me. I have similar things for OpenCode, Pi, etc.
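The two commit rules in that file (a conventional-commit header plus an `Assisted-by` trailer) are easy to check mechanically. Here is a minimal sketch of such a check; the function name `check_commit_message` and the list of accepted commit types are assumptions for illustration, not part of Claude Code or any other tool.

```python
import re

# Conventional-commit header: type, optional (scope), optional "!", then ": description".
# The list of types is an assumption (the commonly used set); adjust for your project.
HEADER_RE = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\([^)]+\))?!?: \S.*$"
)

# Trailer in the form requested above: Assisted-by: <harness>:<model>
TRAILER_RE = re.compile(r"^Assisted-by: (?P<harness>[^:\s]+):(?P<model>\S+)$")


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems; an empty list means the message passes.

    Simplification: any line after the header may carry the trailer, while
    git itself only treats the last paragraph as trailers.
    """
    lines = message.strip().splitlines()
    problems = []
    if not lines or not HEADER_RE.match(lines[0]):
        problems.append("first line is not a conventional commit header")
    if not any(TRAILER_RE.match(line.strip()) for line in lines[1:]):
        problems.append("missing 'Assisted-by: <harness>:<model>' trailer")
    return problems
```

Something like this could run in a commit-msg hook or in CI, so a missing trailer gets caught before review.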

I sometimes needed Claude to babysit the PRs; that’s as simple as going back into the conversation and asking it to check CI on all the PRs. It will keep fixing until CI goes green. I’m used to thinking one PR at a time, but you can just as easily ask it to “rebase all my PRs” in a repo.
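As a rough sketch of what “check CI on all the PRs” amounts to, the snippet below lists my open PRs with the GitHub CLI and reports the failing checks on each. It is an illustration, not what Claude actually runs. The helper names are hypothetical, and the exact shape of each `statusCheckRollup` entry (a `conclusion` or `state` key, a `name` or `context` key) is an assumption about gh’s JSON output that you should verify against your gh version.

```python
import json
import subprocess

# Results treated as "red"; this set is an assumption, extend it as needed.
BAD_RESULTS = {"FAILURE", "ERROR", "TIMED_OUT", "CANCELLED"}


def failing_checks(pr: dict) -> list[str]:
    """Names of the checks on one PR whose result counts as a failure."""
    names = []
    for check in pr.get("statusCheckRollup") or []:
        # Check runs are assumed to report "conclusion", commit statuses "state".
        result = check.get("conclusion") or check.get("state") or ""
        if result in BAD_RESULTS:
            names.append(check.get("name") or check.get("context") or "<unnamed>")
    return names


def my_open_prs() -> list[dict]:
    """Ask gh for my open PRs in the current repo, including the check roll-up."""
    result = subprocess.run(
        ["gh", "pr", "list", "--author", "@me", "--state", "open",
         "--json", "number,title,statusCheckRollup"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def report() -> None:
    """Print one line per open PR: green, or the list of failing checks."""
    for pr in my_open_prs():
        failed = failing_checks(pr)
        status = "green" if not failed else "failing: " + ", ".join(failed)
        print(f"#{pr['number']} {pr['title']}: {status}")
```

Calling `report()` inside a repository checkout needs an authenticated gh. The agent does the same job conversationally and then goes on to fix what it finds.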

The merge rate on the PRs it opened has been nearly 100%. Because of the grouping, a few PRs had very minor pieces removed before merging, so counted per-feature, I’d guess the success rate was around 95%, or maybe even higher.

After Fable was taken down, I did a few more with Opus; it’s not as impressive, but it can still find some easy issues and is still very careful to avoid false positives. (Opus 4.8 is supposed to be 4x less likely to introduce bugs than 4.7; I think that’s mostly due to a system prompt change making it paranoid about testing.)

For small repos, Opus occasionally just starts applying fixes, which is annoying.

The runs

Here are most of the ones I’ve done, updated 2026-07-06 with the runs since publishing. I have temporary Fable access again, so the newest unlabeled rows are Fable too. A few of the newest runs are issue-only by request:

Issue	Repo	PRs	Total
#17	aoc2023 (Opus 4.8)	0/2/0	2
#23	aoc2024 (Opus 4.8)	0/6/0	6
#4085	awkward	9/15/0	24
#759	beautifulhugo	3/15/0	18
#1143	boost-histogram	0/8/0	8
#1116	build	1/7/1	9
#1097	build (Opus 4.8)	0/9/2	11
#163	check-sdist	0/9/0	9
#2908	cibuildwheel	0/2/0	2
#2885	cibuildwheel (4.0 pre-release Opus)	0/2/0	2
#2854	cibuildwheel (Kimi-K2.6)	0/9/0	9
#1357	CLI11	2/10/0	12
#1373	CLI11 (docs, Opus)	0/0/0	0
#1578	coffea	0/0/0	0
#77	cython-cmake (Opus 4.8)	0/4/0	4
#581	decaylanguage	0/9/0	9
#48	f2py-cmake (Opus 4.8)	0/3/0	3
#86	flake8-errmsg (Opus 4.8)	0/4/0	4
#81	formulate	0/0/0	0
#381	GooFit (Opus 4.8)	3/9/0	12
#690	hist	0/5/0	5
#159	histoprint	3/0/0	3
#4	histserv (Opus)	1/5/0	6
#1132	iminuit	8/1/0	9
#23	jekyll-indico (Opus 4.8)	0/6/0	6
#861	meson-python	0/0/0	0
#731	mplhep	0/7/0	7
#1102	nox	0/11/0	11
#1239	packaging	6/8/1	15
#772	particle	0/7/0	7
#1856	pipx	0/0/0	0
#820	plumbum	3/2/0	5
#805	plumbum (Opus 4.8)	0/1/0	1
#6084	pybind11	2/3/0	5
#35	pyBumpHunter (Opus 4.8)	3/1/0	4
#2706	pyhf	10/0/0	10
#230	pyhs3	1/6/0	7
#378	pylhe	0/5/1	6
#6288	pyodide (generic)	0/6/1	7
#6278	pyodide (JS FFI only)	0/5/0	5
#376	pyodide-build	0/8/2	10
#307	pyproject-metadata	0/9/0	9
#144	ragged	6/0/0	6
#398	repo-review	0/4/0	4
#1189	scikit-build (classic, Opus 4.8)	0/3/0	3
#1317	scikit-build-core	0/6/0	6
#1363	scikit-build-core (docs, Opus 4.8)	0/4/0	4
#1401	scikit-build-core (docs pre-1.0, Opus 4.8)	0/1/0	1
#1417	scikit-build-core (pre-1.0)	0/18/0	18
#549	scikit-hep (Opus)	0/3/0	3
#228	scikit-hep-testdata (Opus)	1/4/0	5
#241	uhi	0/4/0	4
#233	uproot-browser	0/8/0	8
#1646	uproot5	13/2/0	15
#317	validate-pyproject	0/4/0	4
#711	vector	1/9/0	10

PRs column format: Open/Merged/Closed
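To make “merge rate on decided PRs” concrete, here is a small sketch that parses the Open/Merged/Closed cells and computes merged / (merged + closed), ignoring PRs that are still open. The function names are hypothetical, and the four sample cells are just a handful of rows copied from the table above (awkward, build, packaging, pyhf), not the full data set.

```python
from typing import Optional


def parse_prs(cell: str) -> tuple[int, int, int]:
    """Split an 'Open/Merged/Closed' cell such as '9/15/0' into three ints."""
    open_, merged, closed = (int(part) for part in cell.split("/"))
    return open_, merged, closed


def merge_rate(cells: list[str]) -> Optional[float]:
    """Merged share of decided PRs; open PRs are not counted either way."""
    merged_total = 0
    closed_total = 0
    for cell in cells:
        _, merged, closed = parse_prs(cell)
        merged_total += merged
        closed_total += closed
    decided = merged_total + closed_total
    if decided == 0:
        return None  # nothing merged or closed yet, so there is no rate
    return merged_total / decided


sample = ["9/15/0", "1/7/1", "6/8/1", "10/0/0"]
print(f"{merge_rate(sample):.1%} of decided PRs merged in this sample")
```

Leaving open PRs out of the denominator matters here: a repo like pyhf, with 10 open and none decided, says nothing about the rate yet.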

If you have access to AI and a repository you maintain, I highly, highly recommend trying this. With Opus 4.8+ or Fable, a simple prompt is all you need. I’ve done similar things with OpenCode and Kimi K2.6, but the “someone liked the finding enough to work on it” rate was much lower, around 70%, which wasn’t high enough to make me want to auto-generate PRs. With open models and OpenCode or Pi, you should probably add more instructions about verifying all findings. I have not tried this with models I pay per token for (GPT and Gemini), since these searches are a bit pricey: a Fable run would have cost around $20-$60 at per-token rates. Generating the fixes isn’t that bad, especially if you can use the simpler models; the runner model will check the subagents’ work.

Specific Examples (Bonus)

Made model building go from 118 seconds to 73 seconds - and this is in a project I have never worked on, a friend requested a review!
nox: a fully non-ASCII session name would wipe the whole .nox dir, deleting every other session!
CLI11: ignore_case() and ignore_underscore() each worked alone, but together didn’t! (100% coverage didn’t catch that)
Lots of cruft from platforms that are no longer supported, like a Python 2 header injected by pybind11, which also caused error line numbers to be off by one!
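The nox finding is a good example of a bug class that tests rarely exercise. The sketch below is a hypothetical reconstruction of how such a bug can arise, not nox’s actual code: if a session name is reduced to ASCII before being joined onto the environment directory, a fully non-ASCII name collapses to an empty string and the “session directory” becomes the whole `.nox` directory. Both function names and the hash-based fallback are assumptions for illustration.

```python
import hashlib
import unicodedata
from pathlib import Path


def to_ascii(name: str) -> str:
    """Drop every character that has no ASCII form after NFKD normalization."""
    return unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")


def naive_env_dir(envdir: Path, session_name: str) -> Path:
    """Buggy variant: an empty ASCII name makes this return envdir itself."""
    return envdir / to_ascii(session_name)


def safer_env_dir(envdir: Path, session_name: str) -> Path:
    """Guarded variant: fall back to a short hash when nothing usable is left."""
    ascii_name = to_ascii(session_name).strip(". ")
    if not ascii_name:
        digest = hashlib.sha256(session_name.encode("utf-8")).hexdigest()[:8]
        ascii_name = f"session-{digest}"
    return envdir / ascii_name


envdir = Path(".nox")
print(naive_env_dir(envdir, "测试"))  # same path as envdir: recreating it wipes everything
print(safer_env_dir(envdir, "测试"))  # a distinct subdirectory of envdir
```

Anything that later deletes and recreates the returned directory is only safe with the guarded variant.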

Almost every review (see above) had great findings in it; feel free to browse. Those are just a few quick ones. I hope this AI age means we’ll have rock-solid, stable software: focus on code quality with AI instead of pumping out new features (unless you are building an AI harness; the dev speed on those is scary, and I guess you have to keep up).

A few other Fable uses

I didn’t just use Fable for reviews; I did a few other things:

Finally managed to rewrite repo-review’s webapp to run in a web worker (also using Claude Desktop so it could view the errors directly, which helped a lot!)
Replaced scikit-build’s backend with scikit-build-core - and found 8 bugs in the setuptools plugin for scikit-build-core along the way! It was like “hey, I fixed these, do you want a PR for that too?”
Reworked scikit-build-core’s test suite to save 40% wall clock time while still keeping the same coverage by reducing duplication (and more).
Tried to do a major refactor of cibuildwheel; Fable outperformed Opus here, but still wasn’t merge-ready directly from the AI.

I have other things I want to try; I hope it comes back (and is available via subscription)!

My experience with Claude Code (Bonus)

I didn’t like the look of Claude Code at first, but I found that using ccstatusline really helped. You can run npx ccstatusline, select the parts you want, adjust the options (I went full-width), and then install it into Claude Code. Here’s mine (the config doesn’t seem to be shareable):

⎇ modernize | (+0,-0)    Context: [██░░░░░░░░░░░░░░] 130k/1000k (13%)       Model: Opus 4.8 | Thinking: high
cwd: /tmp/example                                               Cost: $8.21 | Session: 13.0% | Weekly: 17.0%

This adds a lot of context I need while working. I can see what folder I’m in, branch, changes, current context (admittedly less important on a 1M model than most of the open source models), and info about usage.
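The context segment is simple arithmetic: the used fraction of the window, drawn as filled cells in a fixed-width bar. Here is a minimal sketch of that calculation; the function name and the rounding choice are assumptions, and this is not how ccstatusline itself is implemented.

```python
def context_bar(used_k: int, total_k: int, width: int = 16) -> str:
    """Render context usage (in thousands of tokens) as a fixed-width bar."""
    fraction = used_k / total_k
    # Round to the nearest whole cell, and never draw outside the bar.
    filled = min(width, max(0, round(fraction * width)))
    bar = "█" * filled + "░" * (width - filled)
    return f"[{bar}] {used_k}k/{total_k}k ({fraction:.0%})"


# 130k of a 1000k window is 13%, which fills 2 of the 16 cells.
print("Context:", context_bar(130, 1000))
```

At this width each cell stands for 6.25% of the window, so the bar only gives a coarse glance; the percentage next to it carries the precise number.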

Things I like:

Model performance is good for Opus 4.8 and Fable; the models are really paranoid and test everything
Starts up fast - maybe the fastest startup time for a harness I’ve used (pi without plugins doesn’t count - Copilot CLI startup time is awful)
Subagents work really well (also workflows)
I just discovered /copy N, which copies the Nth response - yes! Would have been a major gripe if missing.
ccstatusline is great (asking Claude to write its own status line is not)
/review is good (but so are other harnesses’ versions)
Generally handles gh well
Lots of nice isolation options like /sandbox
/remote-control works well until my computer sleeps, and it isn’t tied to a GitHub repo like Copilot CLI’s version

Things I don’t like:

No standards: have to symlink CLAUDE.md to AGENTS.md, make symlinks to skills
Can’t press ctrl-p, like in OpenCode, to open a command palette. Sometimes I want to switch models while typing a prompt.
Worst /diff implementation (OpenCode’s is great, Copilot’s is fine)
/branch then /rewind feels a lot worse than other implementations, bad session naming when you do it, too
/memory gets triggered a lot; I pretty much never want anything that’s not in AGENTS.md.
If you upgrade from Pro to Max, you have to log out and then log back in (this happened to another RSE and wasted about half a month of Max!)
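For the first item in the list above, the workaround is a one-time symlink per repository. Here is a minimal sketch, assuming AGENTS.md is the real file and CLAUDE.md should point at it; the helper name `link_claude_md` is hypothetical, and symlink creation may need extra privileges on Windows.

```python
from pathlib import Path


def link_claude_md(repo: Path) -> Path:
    """Create CLAUDE.md in repo as a relative symlink to AGENTS.md."""
    target = repo / "AGENTS.md"
    link = repo / "CLAUDE.md"
    if not target.is_file():
        raise FileNotFoundError(f"{target} does not exist")
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; not overwriting it")
    # A relative target keeps the link valid if the repo is moved or cloned elsewhere.
    link.symlink_to("AGENTS.md")
    return link
```

Refusing to overwrite an existing CLAUDE.md is deliberate: if both files already exist with different content, merging them is a decision for a human.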

Things I’m neutral on or haven’t tried:

I thought I’d love hooks, but they can get Claude stuck (autofix something Claude didn’t touch, Claude undoes it, repeats).
/goal and /loop sound great, but I haven’t had to use them; generally I’m doing several things at once and don’t mind a bit of hands-on work.
/voice sounds interesting, I just don’t have interest in using it.
/radio - ummm, is this why CC is 200MB+?

I’ve been enjoying being able to run 3-4 sessions at the same time, often with several subagents each, and have been fixing bugs up to 6 years old that I hadn’t had the time or patience for. Having it monitor CI until it turns green, rebase, etc. is fantastic!

AI usage disclaimer: All text was written by me. AI was used to help make the table (GLM-5.1 and Opus 4.8), review the post, and set up formatting in a couple of spots.

Categories: Python

Tags: programming python ai