
Field guide · AI systems · June 2026

Letting an AI write your articles without shipping its mistakes.

An AI can draft a company's articles now. The failure mode is not bad prose. It is a confident, specific, wrong claim that ships before anyone checks it. A small publishing pipeline with three mechanical gates and one human decision keeps the speed of AI drafting and removes that risk. Here is the shape, why each gate exists, and what an operator can build this week.

Posted June 16, 2026


The failure mode is not bad writing.

The cautionary case is already on the record. Starting in late 2022, the technology publication CNET quietly ran roughly 77 finance explainers produced with an AI tool, bylined only as "CNET Money Staff." When an outside writer noticed and the site Futurism reported it, editors reviewed the batch and issued corrections on 41 of the 77 articles, more than half. One explainer on compound interest told readers a $10,000 deposit at 3 percent would earn $10,300 in the first year, when the interest earned is $300. CNET paused the program and published a formal AI policy in June 2023 requiring human oversight. The prose in those articles was fine. The arithmetic was not.

That is the whole problem in one example. Modern models write clean, fluent, confident sentences. The risk lives in the specifics inside those sentences: figures, dates, prices, quotes, and citations. And the risk is not uniform. It depends almost entirely on the task. On document-summarization benchmarks, where the model only has to stay faithful to a source you handed it, the best models on Vectara's public hallucination leaderboard now show hallucination rates of around 1 to 3 percent. On open-ended factual recall, where the model has to produce a specific fact from memory, the same models fail far more often. A 2024 Stanford study found that leading general-purpose models hallucinated on a majority of specific legal questions. The pattern is consistent across the research: anything specific and verifiable is the high-risk part, and the more authoritative the framing ("a 2024 study found"), the more carefully it has to be checked.

So the editorial job for AI-assisted content is not to improve the writing. It is to verify the specifics and to keep anything private off the public page. Those are bounded, repeatable checks. They are exactly the kind of thing a small pipeline can enforce mechanically, which is what frees a human to spend their attention on the one judgment a machine cannot make.

The shape: draft fast, gate hard, approve once.

The pipeline has four moving parts and one rule that governs all of them.

The governing rule is simple: everything except the judgment is automated, and the judgment is never automated. The engine writes and audits without a human in the loop. The decision to ship always has a human in the loop. That split is the entire point. It is the difference between AI that produces drafts and AI that publishes unsupervised, which is the version that put CNET in the news.

Three gates a machine can actually enforce.

A gate is only useful if it is deterministic and cannot be argued with. These three are.

Gate one, voice. A literal pattern check runs over the prose and refuses anything that breaks the company's voice rules: banned punctuation, validation filler, marketing cliches, the small tells that make writing read as generic. It is a list of forbidden substrings, not a model, so it is fast, free, and identical every time. The value is not that a substring match has taste. It is that the same standard applies to every draft, whether a human or the machine wrote it, so voice does not drift one piece at a time.
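A minimal sketch of what such a voice check could look like, in Python. The banned phrases, the VoiceHit type, and the function name are hypothetical stand-ins; the firm's actual rule list is not published here.

```python
from dataclasses import dataclass

# Hypothetical rule list. A real one would come from the company's style guide.
BANNED_SUBSTRINGS = [
    "\u2014",                       # em-dash, as an example of banned punctuation
    "great question",               # validation filler
    "game-changer",                 # marketing cliche
    "in today's fast-paced world",  # generic opener
]


@dataclass(frozen=True)
class VoiceHit:
    line_number: int
    phrase: str


def voice_gate(text, banned=BANNED_SUBSTRINGS):
    """Return every banned phrase found, with its 1-based line number.

    An empty list means the draft passes. Matching is a plain
    case-insensitive substring test, so the result is deterministic.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in banned:
            if phrase.lower() in lowered:
                hits.append(VoiceHit(number, phrase))
    return hits
```

Returning the hits, rather than a bare pass or fail, lets the drafter or the human see exactly which line to fix.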

Gate two, confidentiality. A second check scans the draft against a maintained list of things that must never appear in public: client names, project code names, and string shapes that look like API keys or tokens. Any hit is a hard refusal. This is the gate that matters most for a firm that does client work, because the most dangerous leak is not a deliberate one. It is a client's name surfacing in an offhand example because the drafter knew it from the work. De-identification has to be upstream and absolute: the digest the drafter reads is already stripped of client specifics, and the gate is the backstop that catches anything that slipped through.
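One way the confidentiality check might be sketched, again in Python. The term list is invented for illustration, and the two secret shapes (an AWS-style access key ID and any unbroken run of 32 or more token characters) are assumptions about what "looks like a key" means; a real list would be broader and maintained by the firm.

```python
import re

# Hypothetical entries. The real list is maintained privately.
CONFIDENTIAL_TERMS = ["Acme Corp", "Project Falcon"]

# Assumed shapes for secret-looking strings.
SECRET_PATTERNS = {
    # AWS access key IDs start with AKIA followed by 16 uppercase
    # letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Any unbroken run of 32+ token characters is treated as suspicious.
    "long_token": re.compile(r"[A-Za-z0-9_\-]{32,}"),
}


def confidentiality_gate(text, terms=CONFIDENTIAL_TERMS, patterns=SECRET_PATTERNS):
    """Return a list of findings. Any finding is a hard refusal."""
    findings = []
    folded = text.casefold()
    for term in terms:
        # Case-insensitive literal match on each protected name.
        if term.casefold() in folded:
            findings.append(f"confidential term: {term}")
    for name, pattern in patterns.items():
        if pattern.search(text):
            findings.append(f"secret-shaped string: {name}")
    return findings
```

This only catches literal names and recognizable string shapes. It is the backstop, not the de-identification step itself.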

Gate three, unresolved claims. The drafter is required to end every draft with a "claims to verify" list of every figure, price, and external fact it asserted. The publish step refuses while that list has open items. The effect is a forced function: a human has to confirm or cut each specific claim before the piece can ship. This is the gate aimed squarely at the CNET failure. It does not trust the model's confidence; it converts every confident specific into a checkbox a person has to clear.
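A sketch of the claims gate, assuming a draft format the article does not specify: a final "Claims to verify" heading followed by one claim per line, marked "[ ]" while open and "[x]" once a human has confirmed it. The marker syntax and function name are illustrative assumptions.

```python
def claims_gate(draft, heading="claims to verify"):
    """Return reasons to refuse publishing. An empty list means the gate passes.

    Assumed format: a line reading "Claims to verify" (optionally with a
    trailing colon), then one claim per line, prefixed "[ ]" if open or
    "[x]" if cleared. A draft with no such section is refused outright.
    """
    lines = draft.splitlines()
    start = None
    for index, line in enumerate(lines):
        if line.strip().lower().rstrip(":") == heading:
            start = index + 1  # keep the last matching heading in the draft
    if start is None:
        return ["missing 'claims to verify' section"]

    reasons = []
    for line in lines[start:]:
        item = line.strip()
        if item.startswith("[ ]"):
            reasons.append("open claim: " + item[3:].strip())
    return reasons
```

Refusing a draft that has no list at all matters as much as refusing one with open items, since a missing list would otherwise pass silently.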

A note on dollar amounts, because they are the common false alarm. Public prices and public statistics belong in a useful article. The confidentiality gate flags any dollar figure as a warning rather than a block, and a human confirms none of them is a client-specific rate. A warning that makes a person look is the right strength. A block on every number would train people to bypass the gate, which is worse than no gate.
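The warning-not-block behavior for dollar figures could be sketched like this. The regular expression and the message wording are assumptions; the point is that the function returns warnings for a person to read and never refuses the draft on its own.

```python
import re

# Matches figures such as $300, $10,000, or $49.99.
DOLLAR_FIGURE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")


def dollar_warnings(text):
    """Return one warning per dollar figure, with its 1-based line number.

    These are warnings only. The caller should show them to a human
    and must not treat them as a reason to block the draft.
    """
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in DOLLAR_FIGURE.finditer(line):
            warnings.append(
                f"line {number}: confirm {match.group()} is not a client-specific rate"
            )
    return warnings
```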

Why the publish decision stays human.

The gates catch categories of error. They do not judge whether an argument is sound, whether the piece is worth your name, or whether a claim that passed the confidentiality grep is still something you would rather not say in public. None of that is mechanizable, and pretending otherwise is how unsupervised publishing pipelines embarrass the companies that run them.

So the model is speed everywhere the work is mechanical and a hard stop at the one place it is not. The signal that a piece is ready is not that it passed the audit. It is that a person read it, cleared the claims, and decided it was good enough to carry the company's name. The audit just guarantees that decision is the only thing standing between a draft and the public page, instead of being one of a dozen things a tired editor might forget on a Friday.
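The split between the mechanical audit and the human decision could be encoded roughly as follows. AuditResult, publish_decision, and the approved_by field are hypothetical names for illustration, not the firm's actual publish step.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuditResult:
    blocks: list = field(default_factory=list)    # hard refusals from the gates
    warnings: list = field(default_factory=list)  # items a human should look at


def publish_decision(audit, approved_by: Optional[str]):
    """Return (may_publish, reason).

    A clean audit is necessary but never sufficient: without a named
    human approver, the answer is always no.
    """
    if audit.blocks:
        return False, "refused: " + "; ".join(audit.blocks)
    if not approved_by:
        return False, "waiting: audit passed, no human approval recorded"
    return True, f"publish: approved by {approved_by}"
```

Warnings are carried along but deliberately ignored by the decision itself, because clearing them is part of what the approver signs off on.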

What an operator can do this week.

You do not need a platform to run this. You need a folder, two short scripts, and a rule you keep.

What this does not promise.

The gates are guardrails, not a guarantee of truth. A voice checker cannot tell you the argument is good. A confidentiality search catches a client's name spelled out; it will not catch a secret you paraphrased into a story. The claims gate only works if the drafter lists its claims honestly, which means the list itself is a thing a human has to read critically, not just count. The pipeline reduces the surface area a person has to watch. It does not remove the person.

What it does buy is the thing that makes AI-assisted publishing safe to run at all: a fast drafting loop that cannot reach the public without clearing a fixed set of mechanical checks and one human yes. You get the speed of letting a model write, and you keep the accountability of a person signing off. That is the trade most companies actually want, and the one the unsupervised version quietly gives up.

How Rarefied Earth runs this.

This article shipped through the pipeline it describes. The drafter wrote a v0, the voice and confidentiality gates ran on it, the claims were confirmed against the public sources below, and a person made the call to publish. Nothing on this site auto-publishes. The firm runs the loop on its own writing before it is ever offered to a client, which is the firm's standing rule for any tool it builds: it is the first user, and a capability earns its place by working on Rarefied Earth's own operation first. This publishing pipeline is one piece of that operating substrate, not a product on a shelf.

Sources and further reading.

Public references

  • CNET's AI-written articles · Futurism's original reporting on the factual errors, and The Decoder's summary of the review that corrected 41 of 77 articles and led to a formal AI policy. Futurism · The Decoder
  • Vectara Hallucination Leaderboard · Public, continuously updated measurement of how faithfully leading models summarize a provided source; the basis for the 1 to 3 percent figure on grounded summarization. vectara/hallucination-leaderboard
  • Stanford on legal hallucinations · Stanford HAI and RegLab on how pervasive hallucination is when general-purpose models answer specific legal questions. Stanford HAI

Related work.

This is one module of the operating substrate the firm argues every company now needs. The field guide on why most company AI projects stall makes the case for the substrate as a whole, and "Research once, harvest three ways" covers the operating model of which this publishing loop is the last step.

Discussion

Disagree, or running into this at your company? Reply by email: joseph.scott@rarefied.earth.

