ARDS: An Agent-Ready Documentation Standard for Multi-Platform AI Coding Agent Context

K. Brady Davis
CloudSurf Software LLC, Las Vegas, NV, USA
brady@cloudsurf.com


Abstract

AI coding agents require project-level context to operate effectively, yet at least seven major tools define incompatible formats, forcing developers to maintain parallel files with overlapping content. We present the Agent-Ready Documentation Standard (ARDS) v3.0, an open specification that establishes a canonical .context/ directory with a write-once, generate-many architecture, from which platform-specific context files are produced automatically. A single-developer production deployment across 15 repositories with 54 agent definitions suggests that ARDS may roughly halve per-turn context cost, substantially improve context discovery rates, and reduce multi-machine handoff recovery from minutes to under a minute.


Hypothesis

A canonical, hierarchical documentation architecture (ARDS) from which platform-specific context files are generated enables AI coding agents to locate project context more efficiently than independently maintained, single-file formats.


Background

As of early 2026, at least seven major AI coding platforms --- Claude Code [1], Codex CLI [2], Cursor [3], GitHub Copilot [4], Windsurf [5], Gemini CLI [6], and llms.txt [7] --- each define their own format for project-level agent context. A project supporting multiple tools must maintain parallel files with overlapping content; updates to one must be manually propagated to the others, leading to context drift. The most widely adopted format, AGENTS.md [2], now stewarded by the Agentic AI Foundation (AAIF) under the Linux Foundation [8], addresses only root context --- a single file of project-level instructions. It does not specify agent definitions, knowledge hierarchies, plan taxonomies, session checkpoints, or evidence epistemology. The Model Context Protocol (MCP) [9] defines what agents can do (tool access) but not what agents should know (documentation architecture). Empirical studies confirm that this landscape is consequential: Robbes et al. [10] find that 15.85--22.60% of GitHub projects show evidence of coding agent use, and Li et al. [11] identify 932,791 agent-authored pull requests across 116,211 repositories, establishing that agentic contribution is a structural feature of open-source development. Recent analyses of context files in the wild --- spanning over 3,200 files across Claude Code, Codex, Copilot, and Cursor [12, 13, 14, 15] --- find that developers overwhelmingly specify functional context while rarely including non-functional requirements, and that content patterns vary by project type, motivating structured hierarchies over rigid templates. Critically, Lulla et al. [16] measured a 28.64% median runtime reduction from AGENTS.md alone, while Gloaguen et al. [17] found that bloated or redundant context files increased inference cost by 20--23%, establishing that structured minimalism --- not maximal context --- drives agent efficiency.


Design

ARDS v3.0 defines a canonical .context/ directory containing seven document types organized in a layered file hierarchy: root context (CONTEXT.md, loaded on every agent turn), agent definitions (.context/agents/), skill configurations (.context/skills/), knowledge documents (.context/docs/), living guides (.context/guides/), time-stamped plan documents (plans/), and research artifacts (research/). A machine-readable configuration file (surfcontext.json) declares platform targets, discovery order, IP safety rules, and MCP integration endpoints. Platform-specific files (CLAUDE.md, AGENTS.md, .cursor/rules/) are generated from the canonical source via symlinks, sed transforms, or template copies --- establishing a write-once, generate-many architecture. Version 3.0 introduces nine capabilities over v2.0: (1) living guides with confidence levels; (2) session checkpoints that preserve work state across context boundaries; (3) evidence epistemology with a four-tier claim hierarchy and a bias ledger; (4) IP safety with pre-write verification; (5) context budget management with progressive disclosure; (6) multi-agent coordination via task queues; (7) cross-repository references via a hub-and-spoke model; (8) a formalized, configurable discovery order; and (9) MCP integration that maps .context/ content to standard tool names. Nine design principles govern the specification, including token budget awareness (root context under 200 lines), tables over prose, single source of truth, and progressive disclosure. A quality scoring rubric (maximum 87 points for product repos and 99 for command-center repos, across five dimensions) and freshness monitoring thresholds (7--60 days by document type) provide compliance measurement.
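
The write-once, generate-many flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ARDS reference tooling: the TARGETS table, the generate and check_budget functions, the {{AGENT}} placeholder, and the placement of CONTEXT.md at the repository root are all hypothetical, and the regex substitution merely stands in for the sed transform. The actual surfcontext.json schema is not reproduced here. Only the file names CONTEXT.md, CLAUDE.md, and AGENTS.md and the 200-line root budget come from the text.

```python
import re
import tempfile
from pathlib import Path

# Root context budget from the design principles; treated here as an
# inclusive limit, which is an assumption.
ROOT_LINE_BUDGET = 200

# Hypothetical target table: output file -> name substituted into the text.
# A real mapping would come from surfcontext.json, whose schema is not shown.
TARGETS = {
    "CLAUDE.md": "Claude Code",
    "AGENTS.md": "Codex CLI",
}


def check_budget(text: str, budget: int = ROOT_LINE_BUDGET) -> bool:
    """Return True when the root context fits within the line budget."""
    return len(text.splitlines()) <= budget


def generate(repo: Path, canonical: str = "CONTEXT.md") -> list[Path]:
    """Write one platform file per target from the canonical root context."""
    source = (repo / canonical).read_text(encoding="utf-8")
    if not check_budget(source):
        raise ValueError(f"{canonical} exceeds {ROOT_LINE_BUDGET} lines")
    written = []
    for filename, agent_name in TARGETS.items():
        # Stand-in for the sed transform: replace an assumed placeholder
        # token with the platform's name.
        body = re.sub(r"\{\{AGENT\}\}", agent_name, source)
        header = f"<!-- Generated from {canonical}; do not edit by hand. -->\n"
        out = repo / filename
        out.write_text(header + body, encoding="utf-8")
        written.append(out)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        (repo / "CONTEXT.md").write_text(
            "# Project context for {{AGENT}}\n\n| Key file | Purpose |\n",
            encoding="utf-8",
        )
        for path in generate(repo):
            first_line = path.read_text(encoding="utf-8").splitlines()[1]
            print(path.name, "->", first_line)
```

Refusing to generate when the canonical file is over budget keeps the limit enforced at the single source of truth, so no generated file can drift past it independently.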

Evaluation protocol. We present an experience report (not a controlled experiment) from a single-developer deployment. Quantitative estimates were reconstructed retrospectively from artifact analysis --- git history, plan documents, checkpoint files --- not from instrumented measurements. No control group exists. The developer is the standard's creator, introducing Hawthorne effects: reported gains represent best-case performance from a maximally knowledgeable user, not expected values for naive adopters.


Results

ARDS has been in daily production use since January 15, 2026 at CloudSurf Software LLC. The deployment comprises 15 repositories, 54 agent definitions (26 strategy, 28 product), 40 skill definitions, 1,052 plan documents across 38 categories, 209 session checkpoints, and 25 patent families --- managed by a single developer working with Claude Code (primary) and Codex CLI (secondary) across two machines. Quantitative estimates, reconstructed from artifact analysis rather than instrumented measurement, suggest three areas of improvement. First, a 200-line root context costs approximately 800 tokens per turn versus 1,600--2,400+ tokens for ad-hoc files observed before adoption, yielding an estimated 50--67% reduction in per-turn context cost over a 30-turn session (approximately 24,000 vs. 48,000--72,000 tokens). Second, the key files table in CONTEXT.md improved the developer-estimated first-attempt context discovery rate --- the proportion of agent turns where the agent located the needed document without additional search or clarifying questions --- from roughly half to near-universal. Third, session checkpoints reduced multi-machine handoff recovery from an estimated 5--10 minutes to approximately 30 seconds. The strategy repository's self-assessed quality score progressed from 45 (Fair) to 82 (Good) over four weeks. In the feature comparison matrix, ARDS addresses all 14 evaluated capabilities. We note that several --- evidence epistemology, IP safety, quality scoring --- are novel ARDS contributions that no competing format attempts; their inclusion favors ARDS by design. Among the eight capabilities that existing formats do address (root context, agent definitions, skills, knowledge hierarchies, token budget guidance, discovery order, machine-readable config, multi-platform generation), only Claude Code partially covers three.
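
The per-session token figures above follow from simple arithmetic on the reported per-turn estimates. The short Python check below reproduces them. The function and constant names are illustrative, and the inputs are the retrospective estimates quoted in this section, not instrumented measurements.

```python
TOKENS_PER_TURN_ARDS = 800               # ~200-line root context (reported estimate)
TOKENS_PER_TURN_ADHOC = (1_600, 2_400)   # ad-hoc files before adoption (reported range)
TURNS = 30                               # session length used in the text


def session_tokens(per_turn: int, turns: int = TURNS) -> int:
    """Total root-context tokens loaded over a session."""
    return per_turn * turns


def reduction(baseline: int, improved: int) -> float:
    """Fractional reduction of `improved` relative to `baseline`."""
    return 1 - improved / baseline


ards_total = session_tokens(TOKENS_PER_TURN_ARDS)
for adhoc in TOKENS_PER_TURN_ADHOC:
    total = session_tokens(adhoc)
    pct = reduction(adhoc, TOKENS_PER_TURN_ARDS) * 100
    print(f"{ards_total:,} vs {total:,} tokens: {pct:.0f}% reduction")
# prints:
# 24,000 vs 48,000 tokens: 50% reduction
# 24,000 vs 72,000 tokens: 67% reduction
```

Because the same root context is loaded on every turn, the percentage reduction is identical per turn and per session; the session length only scales the absolute token counts.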


Analysis

The production data support --- but do not confirm --- the hypothesis. Three lines of evidence converge: (1) the structured hierarchy reduced per-turn token cost below what single-file formats achieve, consistent with Gloaguen et al.'s [17] finding that redundant context increases agent reasoning cost; (2) the formalized discovery order improved context hit rates, consistent with Cleland-Huang et al.'s [18] identification of traceability as a persistent challenge in software engineering; and (3) session checkpoints solved the multi-machine continuity problem that no existing format addresses. The token reduction estimate is directionally consistent with Lulla et al.'s [16] controlled measurement of a 16.58% reduction in output tokens from a single AGENTS.md file (reported alongside the 28.64% runtime reduction cited earlier), though extrapolating from single-file savings to a full hierarchy assumes that progressive disclosure successfully prevents agents from loading unnecessary context --- an assumption that Gloaguen et al.'s [17] finding of 20--23% overhead from bloated context files makes non-trivial. McCain et al. [19] found that the 99.9th-percentile Claude Code session nearly doubled in duration between late 2025 and early 2026, with context degradation --- not runaway automation --- as the dominant failure mode, directly motivating ARDS's checkpoint and budget management mechanisms.

However, five threats constrain confidence. First, all data come from a single developer who simultaneously created and evaluated the standard (Hawthorne effect). Second, the quantitative estimates are retrospective, not instrumented. Third, developer maturation confounds ARDS-specific gains. Fourth, "agent effectiveness" lacks a formal definition. Fifth, the specification (seven document types, nine capabilities) may be overengineered for single-tool projects --- a limitation ARDS explicitly acknowledges. ARDS is most valuable for multi-tool, multi-agent, or cross-repository workflows; for single-tool projects, a plain AGENTS.md may suffice. A controlled crossover study measuring task completion time, agent turns, and context hit rate across structured versus unstructured repositories is needed to isolate the ARDS effect.


Conclusion

ARDS v3.0 addresses the fragmentation caused by at least seven incompatible AI coding agent context formats through a canonical .context/ directory with a write-once, generate-many architecture. Production experience across 15 repositories suggests meaningful reductions in context cost and improved agent navigation, though single-developer validation limits generalizability. Future work includes CLI validation tooling (surf audit), a controlled crossover user study, and community governance if adoption extends beyond the originating organization. The specification is maintained at surfcontext.org.


References

[1] Anthropic, "Claude Code documentation: CLAUDE.md files," 2025--2026. [Online]. Available: https://claude.com/blog/using-claude-md-files

[2] OpenAI, "AGENTS.md -- a simple, open format for guiding coding agents," 2025. [Online]. Available: https://github.com/agentsmd/agents.md

[3] Cursor, Inc., "Rules -- Cursor documentation," 2025--2026. [Online]. Available: https://docs.cursor.com/context/rules

[4] GitHub, "Adding custom instructions for GitHub Copilot," 2025--2026. [Online]. Available: https://docs.github.com/en/copilot/customizing-copilot/adding-custom-instructions-for-github-copilot

[5] Windsurf (Codeium), "Cascade Memories," 2025--2026. [Online]. Available: https://docs.windsurf.com/windsurf/cascade/memories

[6] Google, "Provide context with GEMINI.md files -- Gemini CLI," 2026. [Online]. Available: https://geminicli.com/docs/cli/gemini-md/

[7] J. Howard, "The /llms.txt file," 2024. [Online]. Available: https://llmstxt.org/

[8] Linux Foundation, "Linux Foundation announces the formation of the Agentic AI Foundation (AAIF)," Dec. 2025. [Online]. Available: https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation

[9] Anthropic, "Model Context Protocol specification," Nov. 2024. [Online]. Available: https://modelcontextprotocol.io/specification/2024-11-05

[10] R. Robbes, T. Matricon, T. Degueule, A. Hora, and S. Zacchiroli, "Agentic much? Adoption of coding agents on GitHub," arXiv:2601.18341, Jan. 2026.

[11] H. Li, H. Zhang, and A. E. Hassan, "AIDev: Studying AI coding agents on GitHub," in Proc. 23rd Int. Conf. Mining Software Repositories (MSR), Apr. 2026.

[12] W. Chatlatanagulchai et al., "On the use of agentic coding manifests: An empirical study of Claude Code," in Proc. 26th Int. Conf. PROFES, Dec. 2025.

[13] W. Chatlatanagulchai et al., "Agent READMEs: An empirical study of context files for agentic coding," arXiv:2511.12884, Nov. 2025.

[14] H. V. F. Santos, V. Costa, J. E. Montandon, and M. T. Valente, "Decoding the configuration of AI coding agents: Insights from Claude Code projects," arXiv:2511.09268, Nov. 2025.

[15] S. Jiang and D. Nam, "Beyond the prompt: An empirical study of Cursor rules," arXiv:2512.18925, Dec. 2025.

[16] J. L. Lulla et al., "On the impact of AGENTS.md files on the efficiency of AI coding agents," arXiv:2601.20404, Jan. 2026.

[17] T. Gloaguen et al., "Evaluating AGENTS.md: Are repository-level context files helpful for coding agents?" arXiv:2602.11988, Feb. 2026.

[18] J. Cleland-Huang, O. C. Z. Gotel, J. Huffman Hayes, P. Mader, and A. Zisman, "Software traceability: Trends and future directions," in Proc. Future of Software Engineering (FOSE), 2014, pp. 55--69.

[19] M. McCain et al., "Measuring AI agent autonomy in practice," Anthropic, Feb. 2026. [Online]. Available: https://www.anthropic.com/research/measuring-agent-autonomy


K. Brady Davis is the founder of CloudSurf Software LLC. The ARDS specification is maintained at surfcontext.org.