Ant Engineer reverse-engineers Claude code source, revealing the four-layer decision pipeline mechanism of Auto Mode

Gate News, March 25 — Ant Group engineer and Umi.js front-end framework author Chen Cheng reverse-engineered the source code of Claude Code 2.1.81, fully restoring the decision mechanism of Auto Mode. The key finding: each tool invocation passes through four layers of decision-making, and only when the first three layers cannot determine the outcome will an independent AI classifier be called for safety review.

The four layers of the pipeline are: first, checking existing permission rules; if matched, allow directly; second, simulating acceptEdits mode (permission level allowing file edits). If it passes in this mode, it indicates low risk, and the classifier is skipped; third, checking the read-only tool whitelist (Read, Grep, Glob, LSP, WebSearch, etc.), which do not modify any state and are unconditionally allowed; only if none of these conditions are met does it proceed to the fourth layer, sending an API request to Claude Sonnet for safety classification.

Key design details of the classifier include: always using Sonnet instead of Opus, balancing cost and latency; setting temperature to 0 to ensure deterministic output; the classifier is defined as a “safety monitor for autonomous AI programming agents,” protecting against three types of risks (prompt injection, scope creep, unintended harm); the user’s CLAUDE.md configuration file is injected into the classifier context as the basis for judging user intent.

The interception rules cover over 22 categories, including force push, direct push to main branch, downloading and executing external code, production deployment, data leaks, self-modification permissions, creating remote code execution surfaces, credential leaks, etc. Exceptions for allowing include seven types: hardcoded test keys, local file operations within the working directory, read-only GET requests, installing declared dependencies, official toolchain installations, reading configuration credentials sent to target providers, and pushing to the current working branch.

The system also implements a circuit breaker mechanism: after three consecutive rejections or a total of 20 rejections, the system downgrades to manual confirmation; in headless mode, it directly aborts the agent. When the classifier is unavailable, a feature flag controls whether it “fail-closed” (immediately reject) or “fail-open” (downgrade to manual confirmation).

In Auto Mode, prompt injection behavior is finely controlled: injected once every five dialogue rounds, with the first injection in each five-round cycle being the full version (about 800 words, including six instructions such as “execute immediately,” “reduce interruptions,” “action over plan”), and the remaining four being a concise one-line version, balancing context window usage and behavioral stability.

Disclaimer: The information on this page may come from third parties and does not represent the views or opinions of Gate. The content displayed on this page is for reference only and does not constitute any financial, investment, or legal advice. Gate does not guarantee the accuracy or completeness of the information and shall not be liable for any losses arising from the use of this information. Virtual asset investments carry high risks and are subject to significant price volatility. You may lose all of your invested principal. Please fully understand the relevant risks and make prudent decisions based on your own financial situation and risk tolerance. For details, please refer to Disclaimer.

Related Articles

World Silver Institute: Global Silver Market to Face Sixth Consecutive Year of Supply Deficit

The World Silver Institute forecasts a widening supply shortage in the silver market, projecting a 46.3 million ounce deficit by 2026. While demand for silver bars and coins will rise, other sectors will decline, leading to an overall consumption decrease. Despite near-term challenges, a positive outlook for silver remains.

GateNews6h ago

Bitmine Immersion Technologies Reports $3.82B Quarterly Loss Despite Revenue Surge to $11M

Bitmine Immersion Technologies (BMNR) reported a $3.82 billion net loss for Q1 2026, primarily from unrealized digital asset losses. Despite this, it continues to accumulate Ethereum, now holding 4.87 million ETH worth $10.7 billion. Quarterly revenue increased to $11.04 million, mainly from staking rewards.

GateNews11h ago

Swiss Crypto Valley Sees 37% Growth in Blockchain Funding in 2025, TON Network Dominates

Switzerland's Crypto Valley received $728 million in blockchain funding in 2025, a 37% increase, dominating Europe's market. Key contributions included TON network's $400 million. Overall, global blockchain funding reached $15.5 billion, despite a drop in deal volume.

GateNews13h ago

TrendForce Cuts 2026 Server Shipment Forecast to 13% YoY Growth Amid Component Shortages

TrendForce has lowered its 2026 server shipment growth forecast to 13% from 20%, citing extended lead times for general-purpose server components as suppliers focus on high-margin AI servers, affecting general server delivery and market demand.

GateNews14h ago

Bitmine Quarterly Report: ETH Staking Income Grows 7x, but a Price Drop Turns into a $3.8 Billion Quarterly Loss

Bitmine Immersion Technologies’ 10-Q quarterly report, released on April 14, shows that as of February 28, 2026, although its revenue grew by 7 times to $11.04 million, it recorded unrealized losses of $3.78 billion due to a decline in the price of ETH, resulting in a net loss of $3.82 billion for the quarter. The company is shifting from traditional mining to an ETH treasury management strategy, emphasizing growth in staking income while also facing price volatility risk.

ChainNewsAbmedia15h ago

TRON Q1 2026 Protocol Revenue Reaches $82.69M, Ranking Second Across All Chains

Gate News message, TRON's protocol revenue reached $82.69M in Q1 2026, second only to Hyperliquid among all chains. At the same time, TRON's TVL reached $4.52B.

GateNews15h ago
Comment
0/400
No comments