Why does Claude keep rewriting the wrong paragraph when I ask it to fix something?

Because prose descriptions force the model to infer which passage you mean, and on a long answer it usually infers wrong. The fix is to quote the exact passage alongside your note, so the mapping is unambiguous. This turns feedback from prose-the-model-interprets into paired-data-the-model-applies.
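
A verbatim quote removes the guesswork because it either occurs in the answer exactly once or it does not, and that can be checked mechanically. Here is a minimal Python sketch of that check. The function name `locate_quote` and the sample text are illustrative assumptions, not part of any tool described here.

```python
def locate_quote(answer: str, quote: str) -> int:
    """Return the character offset of `quote` inside `answer`.

    Raises ValueError when the quote is missing or occurs more than once,
    because in either case the target passage is still ambiguous.
    """
    count = answer.count(quote)  # non-overlapping occurrences
    if count == 0:
        raise ValueError("quote not found in answer")
    if count > 1:
        raise ValueError(f"quote appears {count} times; quote a longer span")
    return answer.find(quote)


# Hypothetical sample answer, used only for illustration.
answer = "Users sign in once. Data is kept for 30 days. Users can export data."
print(locate_quote(answer, "Data is kept for 30 days."))
```

A quote that is too short to be unique (here, the single word "Users") raises an error, which is the signal to quote a longer span.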

What is the best way to give an LLM feedback on multiple parts of its answer?

Quote each passage verbatim, write your note directly beneath the quote, separate each block with three dashes, and send the whole set in one message prefixed with 'apply these edits; leave everything else unchanged.' Prose descriptions, numbered lists, and multiple back-and-forth turns all fail at five or more comments. Paired passages with notes do not.
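
The format described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function name `build_feedback`, the use of Markdown blockquote markers (`> `) for the quoted passage, and the blank line between quote and note are all choices made for this example.

```python
from typing import List, Tuple

PREFIX = "Apply these edits; leave everything else unchanged."


def build_feedback(pairs: List[Tuple[str, str]]) -> str:
    """Join (passage, note) pairs into one message, blocks separated by ---."""
    blocks = []
    for passage, note in pairs:
        # Assumption: quote the passage as a Markdown blockquote, line by line.
        quoted = "\n".join("> " + line for line in passage.splitlines())
        # The blank line keeps the note outside the blockquote when rendered.
        blocks.append(f"{quoted}\n\n{note}")
    return PREFIX + "\n\n" + "\n\n---\n\n".join(blocks)


# Hypothetical passages and notes, used only for illustration.
message = build_feedback([
    ("Users sign in once.", "Change 'Users' to 'Guests'."),
    ("Data is kept for 30 days.", "State the retention period more directly."),
])
print(message)
```

Each block carries its own target, so the model does not have to infer which passage a note refers to.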

Why doesn't the chat box just have a highlight tool?

It probably will, eventually — every frontier chat interface will ship some version of multi-edit selection within the next year or two. Until then, the workaround is a separate browser-only tool that emits feedback in the format the model can parse. The primitive is missing from the chat; the primitive is not hard to provide elsewhere.

Is this safe for confidential documents?

Yes. There is no backend. Your document never leaves your browser. There is no database to store it in, no account to log into. (The site itself loads anonymous page-view analytics from Vercel; your document is never part of those pings.)

PassbackAI is a browser tool for giving a language model — Claude, ChatGPT, Cursor, Gemini — structured multi-point feedback on its last Markdown answer. You paste the answer, highlight every passage that needs work, attach a note per passage, and copy the whole set back out as one clean Markdown block. The model maps each comment to its exact passage and applies all the edits in one round-trip.

The PassbackAI Manifesto · April 2026

Five Things Wrong With Your LLM's Answer

By Elad Diamant, who got tired of typing "not that one, the other one" into Claude.

For one correction, you can type it into the chat. For five, you can't — and the moment around the fourth correction, when you decide points four and five aren't worth writing, is the moment this whole product exists for.

You asked Claude for a spec. It returned 1,800 words. It's 82% fine. The 18% that isn't is spread across six specific passages: the tone in the FAQ, a bullet under Architecture, a paragraph on retention, and three other small crimes. You want the model to fix those six things. Precisely. Without rewriting the other eighty-two percent you liked.

And yet. The field in front of you is a single text area. The same text area you use to ask the model to write the spec is the text area you're supposed to use to tell it which six passages to fix. It has no highlight tool. It has no per-passage note. It has no notion that you have a list of edits, each one anchored to a specific span of text it produced ninety seconds ago.

(More on PassbackAI — what we built, in an understandable fit of frustration — further down. First, the problem.)

The writing tax, and the moment you give up

Here's what actually happens when five things need work and the field in front of you is a text area. Watch yourself do it. You have done all four:

Point 1, in prose

You type the first one carefully. "In the bullet under Architecture — not the second one, the third — change 'users' to 'guests.'" Forty-five seconds. You re-read it. You're proud. You have four more to go.

Point 2, with the original scrolled away

You scroll up to re-find the FAQ paragraph because you can't remember the exact phrasing. The chat box loses focus. You scroll back down. The thing you were going to say is gone. You write a worse version of it. Two minutes in, two of five.

Point 3, hedged

By the third one you stop describing the location and just paraphrase the fix — "and somewhere there's a paragraph on retention that needs to say it more directly." You know, as you type it, that the model is going to pick the wrong paragraph. You send it anyway. Some shipped feedback beats none.

Points 4 and 5, abandoned

There were two more. A tone problem in the third bullet. A step that's out of order. You think, audibly, in the voice of a person who has given up: the model will probably catch those on its own. It will not. You hit send. OMG.

Notice what just happened. Nothing was wrong with the chat box's plumbing — your message went through, the model read it, the model replied. What was wrong was upstream of the chat box: the act of writing each correction, in prose, with the original answer scrolled out of view, was itself hard enough work that two of the five never got written. Three of five made it into the message. The next answer comes back wrong on the other two. You start another round. Somewhere, a senior PM is doing this twelve times a day and calling it her job.

The gap isn't in the chat box. It's in you, by the fourth correction, deciding that points four and five aren't worth the writing. For one fix, you can type it. For five, you can't — not because the chat box is broken, but because writing each one in prose, with the original scrolled away, is hard enough work that the last two get silently downgraded to the model will probably catch those. It will not. That is the gap PassbackAI closes: the moment of capitulation around point four.

What changes when the writing stops being the work

Drop the chat box, open a workspace built for the job, and four things change.

Point four makes it into the message. The reason it didn't before was that typing the directions to it ("in the bullet under Architecture, not the second one, the third") cost more than the correction was worth. When the cost of raising a point drops to zero, the points you used to abandon now ship in the same paste as the rest.

You stop hedging the location. By the third correction in a chat box you were paraphrasing where the fix went, because the original had scrolled away and your working memory was full. Now the location is carried for you, verbatim. You stop writing "somewhere there's a paragraph on retention" and start writing the fix.

The model stops picking the wrong target. Not because the model got smarter, but because the message it's reading no longer leaves the target ambiguous. Five corrections, five anchored passages, no "not that one, the other one" on the next turn.

One round-trip instead of four. The five corrections that used to need four turns and a fight about which bullet you meant now land in one paste. The next answer is wrong on zero of five, not three of five. The round you would have spent re-explaining is the round you don't have to spend at all.

None of this requires a smarter model. The model that wrote your 1,800-word answer is already the model that, given five clearly-anchored corrections in one paste, applies five clearly-anchored corrections in one paste. The bottleneck was never the model — it was the four minutes between you reading the answer and you giving up on points four and five. A hundred billion dollars of AI-lab valuation poured into longer context, deeper reasoning, more tool use, more agents — and the thing that decides whether your draft converges in one round or four is whether you can get the corrections out of your head without the writing tax breaking you. OMG.

We built the thing that lowers the tax. That's the whole product.

The bill comes due in 2026

Frontier models can draft a 2,000-word spec in forty seconds. They can do deep research that would have taken a $200-an-hour consultant two weeks. They will happily produce a PRD, a README, a landing-page draft, a legal memo, a sales outline, all in Markdown, all in under a minute.

Getting any of those outputs to "done" still takes humans. Specifically, it takes humans typing feedback. And the typing-feedback part is exactly where the 2026 workflow has not caught up with the 2026 generation step. The model writes 1,800 words in forty seconds; the human spends fifteen minutes typing "not that one, the other one" to try to fix six passages in them.

A fast, structured feedback loop on an LLM answer is not a nice-to-have anymore. It's the difference between people who iterate with AI in minutes and people who iterate in hours. Multiply by every Markdown draft every person writes in a week. The total time cost of the missing primitive is, at scale, ruinous.

What we built

Nobody plans to become the person who builds the comment layer for LLM answers. You just find yourself forty-seven minutes into arguing with a language model about whether the word "moreover" belongs in the third bullet, typing "not that one, the other one" for the fourth time, and you notice nobody is coming.

PassbackAI is the feedback layer the chat box doesn't have.

The primary loop is LLM-shaped. Your model drafted the file. You want it to fix five specific things. Paste the model's answer into a browser tab. Highlight every passage that needs work — five, eight, fifteen. Leave a note per highlight (free text or a canned label like Delete, Too long, Off tone). Click Copy. The tool emits the whole set as paired passages with notes, separated by ---, in document order. Paste that back into the chat as one message. The model applies every edit in a single round-trip. No more "not that one, the other one."
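
The export step in that loop (highlights in, one ordered block out) can be sketched as follows. This is not PassbackAI's actual code: the `Highlight` type, the character-offset representation, and the `export_highlights` function are assumptions made for illustration. The sketch only shows how highlights captured in any order can be emitted in document order with `---` separators.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Highlight:
    start: int  # character offset where the highlighted span begins
    end: int    # exclusive end offset of the span
    note: str   # free text or a canned label such as "Delete"


def export_highlights(answer: str, highlights: List[Highlight]) -> str:
    """Emit each highlighted passage with its note, sorted by position."""
    ordered = sorted(highlights, key=lambda h: h.start)  # document order
    blocks = []
    for h in ordered:
        passage = answer[h.start:h.end]
        quoted = "\n".join("> " + line for line in passage.splitlines())
        blocks.append(f"{quoted}\n\n{h.note}")
    return "\n\n---\n\n".join(blocks)


# Hypothetical answer and highlights, deliberately listed out of order.
answer = "Users sign in once.\nData is kept for 30 days.\n"
early, late = "Users sign in once.", "Data is kept for 30 days."
highlights = [
    Highlight(answer.index(late), answer.index(late) + len(late), "Too long"),
    Highlight(answer.index(early), answer.index(early) + len(early),
              "Change 'Users' to 'Guests'."),
]
exported = export_highlights(answer, highlights)
print(exported)
```

Sorting by start offset means the reader, human or model, meets the notes in the same order as the passages appear in the original answer, regardless of the order in which they were highlighted.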

Humans are the strong second case. Your PM, your editor, your marketing lead, your legal reviewer — same workflow, same export. The tool does not care whether the next reader has a pulse.

No backend. No database. No account. The document never leaves your browser. We picked those constraints on purpose — because the thing that kills tool adoption in an LLM pipeline isn't the feature set, it's the security review. So we removed the part that needs reviewing.

The bet

Inputs get layers added to them, not replaced. The phone didn't kill email; it added a layer. Git didn't replace the compiler; it added a layer. The chat box won't be killed by whatever comes next — the chat box will get a feedback layer, and the question is only whether it ships inside the frontier chat UIs or outside them.

Somewhere at Anthropic or OpenAI, a product manager is, as you read this in April 2026, building a prototype of multi-passage selection in their chat UI. When it lands, a tool like PassbackAI will look like a quaint little footnote — a cautionary tale about an indie developer who saw a missing primitive and built a whole browser tab around it.

And yet. Here we are, today, in April 2026. The primitive is still missing. The text area still doesn't highlight. The feedback pattern that works — paired passages, separators, one round-trip — is still something every LLM user has to either type by hand for ten minutes or not type at all. The indie tool is what exists in the meantime. It will still exist the day after the feature lands in Claude, because the feature will take a year to catch up to the workflow people have already built around the pattern.

The chat box is a text area. Multi-point structured feedback on a 1,800-word answer is not prose input. Every LLM user will figure that out eventually — the only question is whether you figure it out before or after you spend another forty-seven minutes arguing about the word "moreover."