Type: Case Study
Status: Published
Version: v0.2
Last synthesized: 2026-06-10
Reviewed by: AI-drafted; human content-review pending
Open tensions: 3
Four readers receive the same draft at the same minute. None of them sees the others' margins. One reformulates the argument; one audits the document line by line; one is told to attack the reading the first produced; the last is asked, at the end, to explain to the author why the verdict came out the way it did. Three of the four are language models. The point of the arrangement is that they never agree quietly.
This page documents a method with a published spine. Its formal name is Oblique Peer Review (OPR), set out in Fabrizio Terzi's Oblique Peer Review (Zenodo, CC BY 4.0); its working name in the project is CRT×4; its architecture is the 1+N pattern — one human orchestrator and N mutually-blind AI agents. It is not vapour: the OPR paper reports a real single-case deployment inside Obliqo, with a documented finding (below). What does not yet exist is the standalone open-source CRT×4 tool — the method is published and deployed; the general-purpose instrument is still to be built. We mark that seam rather than paper over it.
OPR / CRT×4 is an open-source method, developed under Pyragogy as independent research, for running a peer review of a text through several independent AI readers and a human conductor. The chain it sits in is explicit:
1+N: one human orchestrator (the 1) and N mutually-blind agents, where "critical tension is produced by the independence of the agents' analytical axes, not by opposition between their positions — the agents never communicate";
N = 4 instance; its working name expands to Cognitive Rhythm Theory × 4 agents, and the four agents are the Critical Reading Team.
You will not find "CRT×4" in the peer-review literature; you will find OPR and its 1+N formalisation in the project's own published paper, which is where the method's claims are argued and which this page links rather than restates.
The thing being reviewed, in this case study, is a document: an argument, a chapter, a policy draft, a paper. What the method adds to an ordinary human read-through is a Critical Reading Team — a small set of models that read the same draft under one rule that does most of the work: they read it independently, and their readings are never merged into a single smooth verdict.
It is worth being precise about what that excludes. CRT×4 is not "a wrapper that calls several models and averages them." Averaging is exactly the failure it refuses. It is not an intelligent tutoring system grading the text against a rubric, and it is not a single assistant asked to "review this and be critical" — that produces one voice performing critique, which is a mirror with a frown, not a team. The value sits in the friction that survives between readers who cannot see each other: different models, different training, different blind spots, no forced convergence.
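A small arithmetic sketch shows why averaging is the failure mode. It assumes, purely for illustration, that each reader returns a numeric score from 1 to 10. The method's specification does not say readers produce scores, and `averaged_verdict` and `spread` are hypothetical names.

```python
from statistics import mean, pstdev

def averaged_verdict(scores: list[int]) -> float:
    """The 'wrapper that averages' approach: one number, disagreement discarded."""
    return mean(scores)

def spread(scores: list[int]) -> float:
    """Population standard deviation: how far the readers sit from each other."""
    return pstdev(scores)

# Hypothetical scores from four readers on the same draft.
split_panel = [1, 9, 1, 9]   # two readers reject, two approve
flat_panel = [5, 5, 5, 5]    # every reader is lukewarm

# Both panels collapse to the same average, so the average cannot tell them apart.
print(averaged_verdict(split_panel), averaged_verdict(flat_panel))

# The spread is what distinguishes a divided panel from an indifferent one.
print(spread(split_panel), spread(flat_panel))
```

Both panels average to 5, while the spread is 4.0 for the split panel and 0.0 for the flat one. The disagreement a merged verdict throws away is the only thing that separates the two cases.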
The method's source specification describes a nine-stage workflow. Its native domain is code review — the spec was written by a developer auditing a repository — and the author flags, in the document itself, that generalizing it to a text means rewriting one stage: where the code version runs a forensic analysis of a repository, the document version runs a documentary analysis of the draft. That substitution is a design choice the project marks as deliberate, not an assumption to wave through.
Rendered for an open review of a text, the stages run roughly as follows.
The source is blunt about which stage carries the weight: "Stage 9 is the heart. Without it, this is automation. With it, it is learning." The first eight stages would, on their own, produce a corrected document and a passive author — the machine resolves, the human receives. Stage nine is the move that keeps the human as conductor rather than spectator: the learning loop that returns the reasoning, so the next draft is written by someone who now sees what the team saw.
The method's specification carries a single non-negotiable, and it is the thing most likely to be optimized away by anyone building the tool: the independence between models must be preserved. The agents do not read each other during the analysis — reciprocal blindness — and their outputs are never normalized toward a consensus. A model that runs "in its own house," on its own vendor's infrastructure, stays there rather than being routed through a layer that quietly aligns it with the others.
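The independence rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CRT×4 tool, which does not yet exist: `Reading`, `run_blind_review`, and the role labels are hypothetical names, and the readers are stand-in functions rather than calls to any real model API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: a reader is any function from draft text to a reading.
Reader = Callable[[str], str]

@dataclass(frozen=True)
class Reading:
    role: str   # illustrative label, e.g. "reformulate" or "audit"
    text: str   # the reader's output, stored verbatim

def run_blind_review(draft: str, readers: dict[str, Reader]) -> list[Reading]:
    """Send the same draft to every reader; no reader sees another's output."""
    with ThreadPoolExecutor(max_workers=max(1, len(readers))) as pool:
        # Each call receives only the draft, never a sibling's reading.
        futures = {role: pool.submit(fn, draft) for role, fn in readers.items()}
        # Readings come back side by side; there is no merge, vote, or rewrite step.
        return [Reading(role, fut.result()) for role, fut in futures.items()]

# Stand-in readers (assumptions): real ones would call separate model vendors.
stand_ins: dict[str, Reader] = {
    "reformulate": lambda d: f"Restated argument of: {d}",
    "audit": lambda d: f"Line-by-line notes on: {d}",
}

for reading in run_blind_review("Draft text.", stand_ins):
    print(reading.role, "->", reading.text)
```

The design choice that matters is what the function leaves out. No reader's signature accepts another reader's output, and the return value is a list of separate readings rather than a single verdict, so any convergence has to be done by the human conductor, in the open.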
The reason is structural, not stylistic. An orchestrator that makes the answers converge kills the method, because uniform pressure flattens exactly the disagreement that was the asset. The danger has a name elsewhere in this handbook — Orchestra Desynchronization is its opposite failure, and the Compliance Trap is what a merged-to-consensus team collapses into: a panel of readers trained, by the merge step, to agree. Here the architecture has to defend friction against its own convenience.
This is the same instinct that runs through the sibling case study, Obliqo, where a set of reading agents is kept reciprocally blind by design. CRT×4 is, in the project's own framing, the method (open, citable, under Pyragogy) of which Obliqo is one commercial application. They are meant to reinforce each other: the product as commercial proof, the method as open methodological proof, the Pyragogy paper as the theoretical frame behind both.
The arrangement is not a novelty invented out of nothing; it sits next to an established practice. Open peer review is a real and adopted family of modifications to scholarly peer review (open identities, open reports, open participation), used by publishers including Nature and PLOS. In the systematic review that mapped the field, Tony Ross-Hellauer found the term so unsettled that the literature held twenty-two distinct configurations of seven traits: "an umbrella term for a number of overlapping ways that peer review models can be adapted."
CRT×4 borrows the spirit of that openness — the report is visible, the reasoning is returned to the author — and adds the move open peer review does not make: some of the reviewers are synthetic, and the design's whole burden is to stop them from agreeing. Where human open review struggles to find enough independent reviewers, the synthetic team is cheap to convene; where it risks reviewers deferring to a senior name, the synthetic team risks the subtler deference of models trained toward the same agreeable center. The anchor is real. The twist is the project's, and we mark the seam rather than paper over it.
Honesty about the evidence is the point of a case page, so here is the ledger.
What is real: the method is specified in a dated project document; its nine stages, its independence rule, and its learning-loop stage are written down, not reconstructed here. The open-review practice it anchors to is verifiable and cited below. The decision to release the method as writing before building any tool — "validate the idea at near-zero cost; if it resonates, build; if not, archive without loss" — is recorded in the source.
What is not yet real, and must not be dressed as if it were:
The method leaves three things unresolved. They stay open here because the rest of the handbook is where they are worked through.
The first is the asymmetry the team is built on. The adversarial reader attacks without anything at stake — it does not tire of the draft, does not fear the author, will not remember the exchange after the session resets. The friction lands on the human conductor and is real for them; it is not real for the attacker. That gap is a feature to be used knowingly, not a partnership to be sold — the handbook's account of functional asymmetry holds here unchanged.
The second is independence under doubt. The method asserts that four models with different training disagree usefully. How much genuine divergence survives when several frontier models share overlapping data and a common drift toward the agreeable center is an empirical question this page cannot answer without the missing field data. The rule is sound in principle; its yield is unmeasured.
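One way the missing measurement could be framed, once field data exists, is as pairwise divergence between readings. The sketch below assumes each reader's output has been reduced to the set of draft line numbers it flagged. That reduction, the Jaccard distance as the metric, and the names `jaccard_distance` and `pairwise_divergence` are all illustrative choices, not part of the published method.

```python
from itertools import combinations

def jaccard_distance(a: set[int], b: set[int]) -> float:
    """1 - |a ∩ b| / |a ∪ b|: 0.0 for identical flags, 1.0 for disjoint flags."""
    union = a | b
    if not union:
        return 0.0  # two readers that flag nothing do not diverge
    return 1.0 - len(a & b) / len(union)

def pairwise_divergence(flags: dict[str, set[int]]) -> dict[tuple[str, str], float]:
    """Distance for every unordered pair of readers, keyed by sorted reader names."""
    return {
        (x, y): jaccard_distance(flags[x], flags[y])
        for x, y in combinations(sorted(flags), 2)
    }

# Hypothetical data: line numbers each reader flagged in the same draft.
example_flags = {
    "reader_a": {1, 2, 3},
    "reader_b": {2, 3, 4},
    "reader_c": {9},
}

for pair, distance in pairwise_divergence(example_flags).items():
    print(pair, distance)
```

Values near 0.0 across every pair would suggest the readers have drifted toward the same center; values near 1.0 would suggest they are reading different documents. Neither extreme says anything about whether the disagreement is useful, which still needs a human judgment.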
The third is ownership of the verdict. When a reading is forged between a human conductor and a cluster of models that argued each other to a stop, whose review is it — and who answers for it if it is wrong? The handbook carries that question forward under The Epistemic Ownership Dilemma. Here it is enough to note that the learning loop in stage nine is what keeps the human able to answer for the verdict at all, rather than merely holding its output.
The method is written down. The tool is not. What the four readers find when they are finally let loose on a real draft, and whether their disagreement holds the way the specification promises, is the part this page cannot yet report — and will not pretend to.
1+N architecture, and the single-case Obliqo deployment with the over-firing convergence finding quoted above. Builds on the Cognitive Rhythm Theory (Zenodo 10.5281/zenodo.15480363) as its conceptual lens.
pyragogy-crt4-opensource-method-2026-06-01.pdf. Source for the nine-stage workflow, the independence non-negotiable, the learning-loop stage, the method/application/paper positioning, and the build-after-launch decision. This document has no public URL at time of publication and cannot be independently verified by readers; it is cited here as an internal source pending a decision on public release form.