Knowledge Artifacts: Structured Files for Dual Human-AI Consumption in AI-Assisted Software Organizations

K. Brady Davis
CloudSurf Software LLC, Las Vegas, NV, USA
brady@cloudsurf.com

Abstract

The adoption of AI coding agents has created a new category of software engineering artifact: structured knowledge files designed for simultaneous human and AI consumption. These "knowledge artifacts" --- including context files, agent configurations, deep context documents, decision records, and cross-system references --- have been studied individually in recent empirical work, but their collective role as an organizational knowledge system remains unexamined. This paper makes three contributions: (1) a taxonomy of five knowledge artifact types organized by function, (2) identification of four properties that distinguish knowledge artifacts from traditional documentation and configuration files --- dual-audience design, behavior-prescribing semantics, self-evolution through use, and multi-agent orchestration --- and (3) a deep case study of a multi-artifact ecosystem in a real software organization coordinating 8 repositories through 24 specialized AI agents. We find that the ecosystem-level view reveals coordination and organizational memory functions invisible in studies of individual files. We discuss implications for practitioners adopting AI agents, for tool builders, and for researchers studying AI-assisted software engineering.

Keywords: knowledge artifacts, AI coding agents, context engineering, organizational knowledge, software documentation, multi-agent systems

1. Introduction

AI coding agents --- tools such as Claude Code, GitHub Copilot, OpenAI Codex, and Cursor --- have moved from code completion to autonomous software engineering tasks: implementing features, fixing bugs, running tests, and creating pull requests. As these agents take on broader responsibilities, developers have adopted a new practice: writing structured Markdown files that configure agent behavior, encode project knowledge, and prescribe development conventions.

These files go by many names (CLAUDE.md, AGENTS.md, .cursorrules, copilot-instructions.md), and recent empirical work has begun to characterize them. Santos et al. analyzed 328 CLAUDE.md files to decode configuration patterns [Santos2025]. Chatlatanagulchai et al. studied 253 agentic coding manifests [Chatlatanagulchai2025a] and later expanded to 2,303 context files across multiple tools [Chatlatanagulchai2025b]. Mohsenimofidi et al. investigated context engineering practices in 466 open-source projects [Mohsenimofidi2025]. Most recently, Lulla et al. reported that AGENTS.md files are associated with a 28.6% lower median agent runtime [Lulla2026].

These studies share a common limitation: they analyze individual files in isolation. A single CLAUDE.md is one node in what can be a much larger system of interconnected knowledge files --- agent configurations that define specialized personas, deep context documents that encode organizational knowledge, decision records that provide historical context, and cross-repository references that wire knowledge across codebases. No prior work examines how multiple knowledge artifact types form an organizational knowledge system, or what properties emerge at the ecosystem level that are invisible when studying individual files.

This paper addresses that gap with three contributions:

Taxonomy. We propose a taxonomy of five knowledge artifact types, organized by function: context files, agent configurations, deep context documents, decision records, and cross-system references (Section 3).
Properties. We identify four properties that collectively distinguish knowledge artifacts from traditional documentation, configuration files, and source code: dual-audience design, behavior-prescribing semantics, self-evolution through use, and multi-agent orchestration (Section 4).
Case study. We present a deep case study of a knowledge artifact ecosystem in a real software organization that coordinates 8 repositories through 24 specialized AI agents and 121 structured knowledge files (Section 5).

2. Related Work

Our work draws on five research strands: context files for AI agents, architecture decision records, documentation as code, organizational memory, and literate programming.

2.1 Context Files for AI Agents

The rapid adoption of AI coding agents has spawned a nascent but growing body of empirical research on how developers configure these tools. Chatlatanagulchai et al. [Chatlatanagulchai2025a] conducted the first empirical study of CLAUDE.md files, analyzing 253 files from 242 repositories and finding that manifests typically exhibit shallow hierarchies dominated by operational commands and implementation details. Their follow-up study [Chatlatanagulchai2025b] expanded to 2,303 context files across multiple AI tools, finding that these files are "not static documentation but complex, difficult-to-read artifacts that evolve like configuration code." Build commands (62.3%), implementation details (69.9%), and architecture descriptions (67.7%) dominate, while non-functional requirements such as security (14.5%) and performance (14.5%) are rarely specified.

Santos et al. [Santos2025] independently analyzed 328 CLAUDE.md files, identifying the software engineering concerns they address and how these concerns co-occur. Mohsenimofidi et al. [Mohsenimofidi2025] studied 466 open-source projects and found significant variation in how context is provided --- descriptive, prescriptive, prohibitive, explanatory, and conditional --- with no established structure.

The most recent work by Lulla et al. [Lulla2026] moves from characterization to impact measurement, reporting that AGENTS.md files are associated with 28.6% lower median runtime and 16.6% lower output token consumption across 124 pull requests in 10 repositories.

All five studies focus on individual context files. Our work extends this line of research by examining how multiple types of knowledge artifacts --- not just context files --- form an interconnected system within an organization.

2.2 Architecture Decision Records

Architecture Decision Records (ADRs), proposed by Nygard [Nygard2011], capture the context, decision, and consequences of architecturally significant choices. Keeling [Keeling2022] argued that ADRs have cultural impact beyond documentation: they help developers "evolve into architectural thinkers." Ahmeti et al. [Ahmeti2024] conducted action research on ADR adoption in practice.

ADRs are a spiritual ancestor of knowledge artifacts. Both are version-controlled, structured, and designed to preserve decision context. However, ADRs target a single audience (human architects) and a single concern (architectural decisions). Knowledge artifacts extend this pattern to dual audiences and to organizational concerns beyond architecture --- strategy, personnel, legal structure, and product portfolio.

2.3 Documentation as Code

The documentation-as-code movement treats documentation as a first-class engineering artifact: version-controlled, reviewed, tested, and deployed alongside source code. Cadavid et al. [Cadavid2023] demonstrated that this philosophy improves interface management in systems of systems. Aghajani et al. [Aghajani2019] cataloged documentation issues in software projects, finding that documentation decay --- the divergence between documentation and the system it describes --- is pervasive.

Knowledge artifacts inherit the version-control and co-location principles of docs-as-code, but add a crucial difference: they are consumed by software agents as part of the development workflow, not merely read by humans. This creates feedback loops that traditional documentation lacks (Section 4, P3).

2.4 Organizational Memory

Walsh and Ungson [Walsh1991] defined organizational memory as stored information from an organization's history that can be brought to bear on present decisions. They identified five retention facilities: individual memory, culture, transformations, structures, and ecology. Bjørnson and Dingsøyr [Bjornson2008] systematically reviewed knowledge management in software engineering, finding that most approaches rely on tacit knowledge transfer rather than explicit codification.

Knowledge artifacts represent a new retention facility --- explicit, version-controlled, and machine-readable organizational knowledge that AI agents can retrieve and act upon. Unlike wikis or knowledge bases, these artifacts are embedded in the development workflow and co-located with the code they describe.

2.5 Literate Programming

Knuth's literate programming [Knuth1984] proposed interleaving code and natural-language explanation in a single document, designed for human reading order rather than compiler order. Parnas [Parnas2011] later argued for "precise documentation" that serves as both specification and explanation.

Knowledge artifacts echo literate programming's vision of documents that serve both human comprehension and machine execution. The difference is audience: literate programs combine code and explanation for a human reader, while knowledge artifacts combine instructions and context for both human and AI readers simultaneously.

3. Taxonomy of Knowledge Artifacts

We define a knowledge artifact as a structured, version-controlled file designed for dual human-AI consumption that encodes organizational knowledge, prescribes agent behavior, or coordinates work across systems. Based on analysis of the public datasets in prior work [Santos2025, Chatlatanagulchai2025b, Mohsenimofidi2025] and practitioner experience, we propose a taxonomy of five types organized by primary function.

Type	Purpose	Audience	Update Freq.	Size (lines)	Examples
1. Context Files	Orient AI agents to project identity, conventions, architecture	Both	Weekly--monthly	50--200	`CLAUDE.md`, `AGENTS.md`, `.cursorrules`
2. Agent Configs	Define specialized AI agent personas with scoped roles	Both	Monthly--quarterly	62--1,376	`.claude/agents/researcher.md`
3. Deep Context Docs	Encode domain-specific organizational knowledge	Both	Weekly--quarterly	100--300	`.claude/docs/product-portfolio.md`
4. Decision Records	Capture timestamped analyses and strategic decisions	Both	Event-driven	50--1,000+	`plans/capital/funding-strategy.md`
5. Cross-System Refs	Wire knowledge across repository boundaries	Both	Structural changes	10--50 (embedded)	`.claude/docs/repo-registry.md`

3.1 Type 1: Context Files

Root-level files that orient AI agents to a project: CLAUDE.md, AGENTS.md, .cursorrules, copilot-instructions.md. These provide project identity, conventions, architecture overview, and development commands. They are read by AI agents at session start and shape all subsequent interactions. Prior empirical work [Santos2025, Chatlatanagulchai2025a, Chatlatanagulchai2025b, Mohsenimofidi2025, Lulla2026] has focused almost exclusively on this type.

Audience: Both (human developers scanning for conventions; AI agents loading project context). Update frequency: Weekly to monthly. Typical size: 50--200 lines. Example: A root CLAUDE.md that defines 8 key file paths, links to deep context documents, references 24 agents, and specifies slash commands.

3.2 Type 2: Agent Configurations

Files that define specialized AI agent personas with scoped responsibilities, tool access, and behavioral constraints. In Claude Code, these reside in .claude/agents/*.md and use YAML frontmatter to specify the agent's name, dispatch description, available tools, and model selection. The body defines scope, working procedures, output format, and guiding principles.

Agent configurations enable multi-agent orchestration: different tasks are dispatched to different agents based on their description field, creating a division of labor expressed in natural language rather than code.

Audience: Both (developers designing the agent system; AI dispatcher selecting agents). Update frequency: Monthly to quarterly. Typical size: 62--298 lines (median ~128), with outliers exceeding 1,000 lines for domain-heavy agents such as patent law. Example: A researcher.md agent configuration that scopes the agent to academic paper drafting, specifies opus model selection, lists 7 tools, and defines 5 working principles.
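The frontmatter-plus-body layout described above can be illustrated with a minimal Python sketch. The sample file contents, the field values, and the helper name parse_agent_config are hypothetical and chosen only for illustration. The parser handles flat "key: value" lines and is not a full YAML implementation.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    """Frontmatter fields plus the free-form Markdown body."""
    meta: dict[str, str]
    body: str


def parse_agent_config(text: str) -> AgentConfig:
    """Split '---'-delimited frontmatter from the body.

    Handles flat 'key: value' lines only; real YAML needs a real parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return AgentConfig(meta={}, body=text)
    # Find the line that closes the frontmatter block.
    closing = [i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"]
    if not closing:
        return AgentConfig(meta={}, body=text)
    end = closing[0]
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return AgentConfig(meta=meta, body=body)


# Hypothetical file contents, for illustration only.
SAMPLE = """---
name: researcher
description: Draft, revise, and strengthen academic research papers.
tools: Read, Write, Grep
model: opus
---
## Scope
Academic paper drafting.
"""

config = parse_agent_config(SAMPLE)
print(config.meta["name"], config.meta["model"])  # prints "researcher opus"
```

Keeping the dispatch metadata in a small machine-readable header and the procedures in prose is what lets one file serve both the dispatcher and the human reader.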

3.3 Type 3: Deep Context Documents

Structured reference documents that encode organizational knowledge for both human and AI consumption. These go beyond what fits in a root context file, providing comprehensive information on specific domains: company overview, product portfolio, legal structure, financial framework, technical standards, repository registry.

Deep context documents are linked from the root context file (Type 1) but not loaded automatically --- agents read them on demand when a task requires domain-specific knowledge. This creates a two-tier information architecture: concise context in the root file, depth in linked documents.

Audience: Both (humans reviewing organizational state; AI agents retrieving domain knowledge). Update frequency: Weekly (actively evolving domains) to quarterly (stable domains). Typical size: 100--300 lines. Example: A product-portfolio.md document listing 7 products with their stage, technology stack, pricing, competitive landscape, and strategic positioning.
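The two-tier architecture can be sketched in a few lines of Python. The sketch assumes that the root file points to deep context documents with ordinary Markdown links, which the case study does not specify. The file excerpt and the function names linked_documents and load_on_demand are hypothetical.

```python
import re
from pathlib import Path

# Matches Markdown links to .md files, e.g. [Label](path/to/file.md)
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+\.md)\)")


def linked_documents(root_text: str) -> dict[str, str]:
    """Map link label -> relative path for every Markdown link to a .md file."""
    return {label: target for label, target in LINK_PATTERN.findall(root_text)}


def load_on_demand(repo_root: Path, links: dict[str, str], label: str) -> str:
    """Read one deep context document only when a task asks for it."""
    return (repo_root / links[label]).read_text(encoding="utf-8")


# Hypothetical root context file excerpt (tier one: short, always loaded).
ROOT = """# Strategy HQ
See [Product portfolio](.claude/docs/product-portfolio.md) and
[Repository registry](.claude/docs/repo-registry.md) for details.
"""

links = linked_documents(ROOT)
print(sorted(links))  # prints ['Product portfolio', 'Repository registry']
```

Only the link index is resident after reading the root file; the cost of a deep document is paid when load_on_demand is called for it.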

3.4 Type 4: Decision Records

Timestamped strategic analyses and decisions stored as structured Markdown files. These extend ADRs [Nygard2011] from architectural decisions to organizational decisions: funding strategy, patent filing timing, product launch sequencing, competitive positioning, hiring plans.

Each record captures the analysis date, the options considered, the decision made, the rationale, and the next actions. When an AI agent is asked to make a recommendation, it can retrieve relevant decision records to understand prior choices and avoid contradicting established strategy.

Audience: Both (founder reviewing decision history; AI agents grounding recommendations in prior decisions). Update frequency: Created as decisions arise (event-driven). Typical size: 50--1,000+ lines (varies widely; strategic analyses are concise, patent drafts are extensive). Example: A 2026-01-30-funding-strategy-analysis.md record analyzing 12-month funding paths with quarterly revenue projections, instrument comparisons, and cap table scenarios.
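As a rough sketch, the fields of a decision record can be modeled as a small Python data structure that renders to Markdown. The class name, section headings, and placeholder values are assumptions made for illustration; only the field list and the date-prefixed filename pattern come from the description above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    analysis_date: date
    title: str
    options: list[str]
    decision: str
    rationale: str
    next_actions: list[str] = field(default_factory=list)

    def filename(self) -> str:
        """Date-prefixed, hyphenated filename, e.g. 2026-01-30-some-title.md."""
        slug = "-".join(self.title.lower().split())
        return f"{self.analysis_date.isoformat()}-{slug}.md"

    def to_markdown(self) -> str:
        """Render with explicit headers so humans and agents find the same sections."""
        lines = [f"# {self.title}", "", f"Date: {self.analysis_date.isoformat()}", "", "## Options"]
        lines += [f"- {option}" for option in self.options]
        lines += ["", "## Decision", self.decision, "", "## Rationale", self.rationale]
        lines += ["", "## Next Actions"]
        lines += [f"- {action}" for action in self.next_actions]
        return "\n".join(lines) + "\n"


# Placeholder content: the options and decision text are not from the case study.
record = DecisionRecord(
    analysis_date=date(2026, 1, 30),
    title="Funding strategy analysis",
    options=["Option A", "Option B"],
    decision="Option A",
    rationale="Placeholder rationale.",
    next_actions=["Placeholder follow-up"],
)
print(record.filename())  # prints "2026-01-30-funding-strategy-analysis.md"
```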

3.5 Type 5: Cross-System References

Files and file sections that wire knowledge across repository boundaries. A strategy repository's root context file references product repositories by relative path. Agent configurations specify which repositories they can read. Deep context documents maintain a registry of all repositories with their paths, technology stacks, and agent counts.

Cross-system references create an inter-repository knowledge graph that allows AI agents to navigate organizational structure, access relevant context across codebases, and maintain consistency across products.

Audience: Both (developers navigating multi-repo projects; AI agents resolving cross-repo references). Update frequency: When repository structure changes. Typical size: Embedded in other artifact types (10--50 lines of cross-references per file). Example: A repo-registry.md document mapping 10 repositories to their filesystem paths, technology stacks, deployment targets, and agent counts, with explicit cross-reference conventions ("strategy reads product state, not product code").

4. Four Distinguishing Properties

We argue that knowledge artifacts are a distinct category --- not traditional documentation, not configuration files, not source code --- because they exhibit four properties simultaneously. Individual artifact types in other domains may exhibit one or two of these properties, but the combination of all four in a single artifact class is, to our knowledge, novel.

P1: Dual-Audience Design

Knowledge artifacts are written for both human comprehension and AI agent consumption. Traditional documentation targets humans only; configuration files (.json, .yaml, .toml) target machines only. Knowledge artifacts must be simultaneously readable by humans scanning for context and parseable by AI agents extracting instructions.

This dual-audience constraint produces specific structural patterns observed in prior empirical work [Chatlatanagulchai2025b, Santos2025]: tables for quick lookup (humans scan rows; AI agents parse structured data), explicit section headers (humans navigate; AI agents locate relevant sections), imperative instructions mixed with explanatory context (AI agents execute; humans understand why), and Markdown formatting that is both human-readable and machine-parseable.

Mohsenimofidi et al. [Mohsenimofidi2025] found five distinct rhetorical modes in context files --- descriptive, prescriptive, prohibitive, explanatory, and conditional --- reflecting the dual demand to both explain (for humans) and instruct (for AI agents) within a single artifact.

P2: Behavior-Prescribing

Knowledge artifacts do not merely describe --- they prescribe agent behavior. A context file defines how the AI agent should interact with a codebase. An agent configuration specifies what the agent should do, what tools it can use, what principles it follows, and what outputs it produces. This makes knowledge artifacts functionally closer to source code than to documentation: they determine system behavior rather than describing it after the fact.

Lulla et al. [Lulla2026] provide empirical support for this property: the presence of an AGENTS.md file was associated with measurably different agent behavior, namely 28.6% lower median runtime and 16.6% fewer output tokens. The file did not merely describe the agent's behavior; it shaped it.

This property extends beyond individual files. An agent configuration's description field determines which agent handles which user request. A deep context document's content shapes what recommendations an AI agent makes. A decision record constrains what strategies an agent proposes. The knowledge artifact ecosystem collectively defines the behavioral envelope of all AI agents operating within the organization.

P3: Self-Evolution

Traditional documentation decays because maintaining it is a separate activity from doing the work it describes [Aghajani2019, Lehman1980]. Knowledge artifacts resist this decay because they are part of the work loop itself.

When an AI agent makes a strategic recommendation, it reads decision records and deep context documents as input. When the recommendation is accepted, the agent updates those same artifacts to reflect the new decision. When a product ships a feature, the agent updates the product portfolio document. When a new repository is created, the agent updates the repository registry. The feedback loop between using artifacts and updating them is tight --- often occurring within the same agent session.

Chatlatanagulchai et al. [Chatlatanagulchai2025b] observed this property empirically, finding that context files "evolve like configuration code, maintained through frequent, small additions." We extend this observation: the self-evolution property is not limited to context files but applies across all knowledge artifact types, and it is structurally reinforced by the fact that AI agents both consume and produce these artifacts.

P4: Orchestrating

Knowledge artifacts coordinate multi-agent systems through natural language rather than APIs, message queues, or service meshes. This is a form of software architecture expressed in prose.

In multi-agent systems research, coordination typically relies on programmatic mechanisms: shared memory, message passing, or centralized planners [Hong2024]. Knowledge artifacts achieve coordination differently. An agent configuration's description field serves as a natural-language dispatch rule --- the AI runtime reads all available descriptions and routes tasks to the best-matching agent. Cross-system references create a navigable knowledge graph across repositories. Deep context documents establish shared organizational state that multiple agents read, ensuring consistency without explicit message passing.

This orchestration pattern creates a hub-and-spoke architecture: a central strategy repository serves as the knowledge hub, product repositories serve as spokes, and knowledge artifacts wire them together. The organizational structure is encoded in prose and enforced by AI agents that read and follow it.

5. Case Study: CloudSurf Software LLC

We present a deep case study of knowledge artifact usage in a real software organization, following established case study methodology [Runeson2009, Yin2018]. All quantitative data reflects a snapshot taken on January 30, 2026.

5.1 Context

CloudSurf Software LLC is a bootstrapped software startup developing multiple SaaS products across 8 repositories, coordinated by a single founder using AI coding agents (primarily Claude Code). The organization's strategy repository serves as a command center --- a hub containing no application code, only knowledge artifacts that coordinate all products, agents, and strategic decisions.

The founder intentionally designed the knowledge artifact ecosystem over a period of approximately three months (November 2025 -- January 2026), iterating on structure, conventions, and agent configurations as the organization's needs evolved. This is not a naturally occurring system observed by an external researcher; it is a deliberately engineered system reported by its designer.

5.2 Ecosystem Description

The strategy repository contains:

1 root context file (CLAUDE.md): 186 lines, linking to 8 deep context documents, referencing 24 agents, defining 6 slash commands, and mapping 8 product repositories.
24 agent configurations (.claude/agents/*.md): Ranging from 62 to 1,376 lines (avg ~194), defining scope, tools, behavioral rules, and output conventions. The long tail is driven by domain-heavy agents (e.g., the patent-law agent at 1,376 lines embeds USPTO formatting rules, claim-writing templates, and prior-art search procedures). Agents span business functions: legal, capital, revenue, product strategy, competitive intelligence, brand, talent, DevOps, customer success, research, risk analysis, sales, marketing, and patent law.
8 deep context documents (.claude/docs/*.md): Company overview, product portfolio, legal plan, financial framework, agent standards, repository registry, business principles, and HQ architecture.
85 decision records (plans/*.md): Organized into 19 categories (capital, competitive, infrastructure, patents, people, product, strategy, legal, marketing, audit, dev, ideas, books, brand, fact-check, research, revenue, risk, workspace). Each captures analysis, options, recommendations, and next actions.
Cross-system references: The root context file and repository registry map all 8 product repositories by filesystem path, with explicit conventions for cross-repo access ("strategy reads product state, not product code").

Table: Knowledge Artifact Ecosystem --- Quantitative Summary (snapshot: January 30, 2026)

Artifact Type	Count	Avg. Lines	Total Lines
Context files	1	186	186
Agent configurations	24	~194	4,644
Deep context docs	8	~207	1,653
Decision records	85	~441	~37,509
Cross-system refs	3	~60	~180
Total	121	---	~44,172
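The totals and averages in the table can be cross-checked with a short Python sketch. The per-type counts and line totals are copied from the table; values marked with a tilde there are treated as exact here, which is an assumption.

```python
# (artifact type, file count, total lines) as reported in the summary table.
ROWS = [
    ("Context files", 1, 186),
    ("Agent configurations", 24, 4_644),
    ("Deep context docs", 8, 1_653),
    ("Decision records", 85, 37_509),
    ("Cross-system refs", 3, 180),
]

total_files = sum(count for _, count, _ in ROWS)
total_lines = sum(lines for _, _, lines in ROWS)
averages = {name: round(lines / count) for name, count, lines in ROWS}

print(total_files, total_lines)  # prints "121 44172"
print(averages["Decision records"])  # prints "441"
```

Decision records account for roughly 85% of all lines (37,509 of 44,172), so the ecosystem's volume is dominated by accumulated analyses rather than by configuration.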

5.3 Observations by Property

P1: Dual-Audience Design

The root CLAUDE.md exemplifies dual-audience design. It uses tables for structured data (key files, agent list, repository map) because tables serve both audiences: the founder scans rows to locate a plan reference, while the AI agent parses table cells to resolve file paths. Section headers use action-oriented labels ("Your Expertise," "Active Work," "Commands") that serve as both human navigation and AI section-matching.

The agent standards document (.claude/docs/agent-standards.md, 283 lines) encodes formatting rules designed explicitly for dual consumption: "Keep CLAUDE.md under 200 lines" (respects AI context budgets); "Use tables for structured data" (Claude parses tables faster than prose); "Include a key files table" (saves Claude from globbing/grepping to find things). These metalevel instructions --- rules about how to write knowledge artifacts --- demonstrate awareness of the dual audience at the design level.
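To make the "tables serve both audiences" point concrete, the sketch below parses a key-files table into a lookup structure. It assumes the table uses pipe-delimited Markdown syntax; the table contents and the function name parse_markdown_table are hypothetical, and an AI agent does not literally run such a parser. The point is only that the same rows a human scans are trivially machine-extractable.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse a simple pipe-delimited Markdown table into one dict per row."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return [dict(zip(header, row)) for row in body]


# Hypothetical key-files table; the paths reuse examples from the taxonomy.
KEY_FILES = """
| Purpose | Path |
|---------|------|
| Product portfolio | .claude/docs/product-portfolio.md |
| Repository registry | .claude/docs/repo-registry.md |
"""

table = parse_markdown_table(KEY_FILES)
paths = {row["Purpose"]: row["Path"] for row in table}
print(paths["Repository registry"])  # prints ".claude/docs/repo-registry.md"
```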

P2: Behavior-Prescribing

Each agent configuration's YAML frontmatter contains a description field that functions as a natural-language dispatch rule. For example:

description: Draft, revise, and strengthen academic research papers --- structure arguments, ground claims in evidence, design studies, manage references, and prepare manuscripts for peer review.

description: Product roadmap, competitive positioning, feature prioritization, and go-to-market strategy across the Surf Suite.

When the founder asks a question, the AI runtime matches the query against all 24 descriptions and dispatches to the best-matching agent. The description does not merely describe what the agent does; it determines which requests reach the agent, and therefore what the agent will do. Changing a description changes the system's behavior without modifying any application code.
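The dispatch idea can be illustrated with a deliberately crude stand-in. In the studied system the matching is performed by the language model itself, not by code like this; the sketch below substitutes simple word overlap purely to show that editing a description changes routing. The agent key "product-strategist", the abridged researcher description, and the queries are hypothetical.

```python
import re

STOPWORDS = {"a", "an", "and", "the", "for", "in", "of", "to", "my", "me"}


def tokens(text: str) -> set[str]:
    """Lowercased word set with common stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def dispatch(query: str, descriptions: dict[str, str]) -> str:
    """Return the agent whose description shares the most words with the query.

    A toy stand-in for model-based matching; ties go to the first agent listed.
    """
    query_tokens = tokens(query)
    return max(descriptions, key=lambda name: len(query_tokens & tokens(descriptions[name])))


AGENTS = {
    # Abridged from the researcher description quoted above.
    "researcher": "Draft, revise, and strengthen academic research papers; "
                  "prepare manuscripts for peer review.",
    "product-strategist": "Product roadmap, competitive positioning, feature prioritization, "
                          "and go-to-market strategy across the Surf Suite.",
}

print(dispatch("Revise my research paper for peer review", AGENTS))  # prints "researcher"
print(dispatch("Which feature should lead the product roadmap?", AGENTS))  # prints "product-strategist"
```

Rewriting either description string changes which agent wins for a given query, which is the sense in which the description field prescribes behavior.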

Beyond dispatch, agent bodies prescribe working procedures. The researcher agent specifies: "Ground every claim in a citation or flag it explicitly as a hypothesis/assumption. Use [hypothesis] or [needs citation] inline markers." The risk analyst agent requires: "Every recommendation that costs time or money must answer: Is this the highest-return use of this resource?" These instructions shape agent output as directly as a function signature shapes a function's interface.

P3: Self-Evolution

The knowledge artifact ecosystem at CloudSurf exhibits tight update loops. When the founder uses the product strategist agent to decide on TaskSurf pricing, the agent reads the current product portfolio document, analyzes options, recommends $6.99/seat/month, and --- upon acceptance --- updates the product portfolio document to reflect the new pricing. The decision record captures the analysis; the deep context document reflects the outcome. Both artifacts evolve as a byproduct of the decision-making work.

The root CLAUDE.md contains an "Active Work" section that tracks current state: what is deployed, what is in progress, what is blocked. This section is updated by agents as work progresses, not as a separate documentation task. The Active Work section in the studied system lists 11 open decisions and 10+ completed decisions, each linked to relevant plan files.

Over the three-month observation period, the 85 decision records were created at an average rate of approximately 7 per week, indicating that knowledge artifact creation is integrated into the daily workflow rather than treated as a periodic documentation exercise.
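The weekly rate can be reproduced with a short calculation. The start date of November 1, 2025 is an assumption, since the case study gives only the month; the end date is the stated snapshot date.

```python
from datetime import date

records = 85
start = date(2025, 11, 1)  # assumption: the text says only "November 2025"
end = date(2026, 1, 30)    # snapshot date stated in the case study

weeks = (end - start).days / 7
rate = records / weeks
print(f"{weeks:.1f} weeks, {rate:.1f} records per week")  # prints "12.9 weeks, 6.6 records per week"
```

Under this assumption the rate is about 6.6 records per week, which rounds to the reported figure of approximately 7.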

P4: Orchestrating

The strategy repository functions as a hub in a hub-and-spoke architecture. The root context file maps 8 product repositories by filesystem path. Agent configurations reference these paths to scope their access. The cross-reference conventions document specifies directional rules: "Strategy -> Product: strategy agents can read product repo state" and "Product -> Strategy: product repos reference strategy for business context."

This creates a navigable organizational knowledge graph. When the capital strategy agent analyzes funding readiness, it reads the product portfolio (deep context), recent revenue projections (decision records), and product deployment status (cross-repo references to product CLAUDE.md files). No programmatic integration is required --- the agent follows natural-language references to locate and read the relevant artifacts.

The 24 agents collectively cover the organization's functional areas: legal, finance, product, engineering, marketing, sales, research, and operations. Their coordination is implicit --- mediated by shared access to the same knowledge artifacts --- rather than explicit through APIs or message passing. When the product strategist agent recommends a launch plan, the growth marketing agent can read that plan and propose a content strategy; the risk analyst can read both and identify threats. The knowledge artifacts serve as the shared state layer.

5.4 Conflict of Interest Disclosure

The author is the founder of CloudSurf and designer of the knowledge artifact ecosystem studied. This case study represents a practitioner's self-report of an intentionally designed system, not an independent observation of a naturally occurring phenomenon. The taxonomy (Section 3) draws on public datasets from prior work [Santos2025, Chatlatanagulchai2025b, Mohsenimofidi2025] to partially mitigate this limitation, but the case study itself cannot be considered independent.

6. Discussion

6.1 Implications for Practice

Teams adopting AI coding agents typically start with a single context file --- a CLAUDE.md or AGENTS.md --- and stop there. Our taxonomy and case study suggest that the ecosystem-level benefits of knowledge artifacts --- multi-agent coordination, organizational memory that AI agents can access, self-evolving documentation --- emerge only when multiple artifact types work together.

Practitioners should consider three progressive levels of adoption: (1) a root context file for project orientation, (2) agent configurations for task specialization, and (3) deep context documents and decision records for organizational knowledge. The marginal cost of each level is low, since these are Markdown files rather than infrastructure, and we expect the compound benefit to grow faster than linearly as AI agents cross-reference more of the organization's knowledge, a hypothesis this study does not test.

The SPACE framework [Forsgren2021] identifies five dimensions of developer productivity: satisfaction, performance, activity, communication, and efficiency. Knowledge artifacts primarily target the efficiency and communication dimensions by reducing the context-loading cost for AI agents and encoding shared understanding. The 28.6% runtime reduction observed by Lulla et al. [Lulla2026] for a single file type suggests that a full ecosystem could yield substantially larger productivity gains, though this claim requires empirical validation.

6.2 Implications for Tools

Current AI coding tools consume context files but provide limited support for creating or maintaining them. Based on our taxonomy and case study, we identify four tool opportunities:

Scaffolding: Tools that generate initial knowledge artifacts from existing project structure, README files, and configuration.
Validation: Linters that check cross-references between artifacts, flag stale information, and verify that agent configurations reference accessible files.
Visualization: Graph views of the knowledge artifact ecosystem --- which agents read which documents, which repositories reference which others.
Migration: Converters between tool-specific formats (CLAUDE.md <-> AGENTS.md <-> .cursorrules).
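The validation opportunity can be sketched as a small cross-reference linter. This is a minimal illustration under stated assumptions, not an existing tool: it treats any token that looks like a relative path ending in .md as a reference, resolves it against the repository root, and reports targets that do not exist. The function name broken_references and the demo files are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Tokens that look like file paths ending in .md (letters, digits, '_', '.', '/', '-').
PATH_PATTERN = re.compile(r"[\w./-]+\.md")


def broken_references(repo_root: Path) -> list[tuple[str, str]]:
    """Return (artifact, missing_target) pairs for .md paths that do not resolve."""
    broken = []
    for artifact in sorted(repo_root.rglob("*.md")):
        text = artifact.read_text(encoding="utf-8")
        for target in sorted(set(PATH_PATTERN.findall(text))):
            if "/" not in target:
                continue  # bare filenames are ambiguous; check only explicit paths
            if not (repo_root / target).exists():
                broken.append((artifact.relative_to(repo_root).as_posix(), target))
    return broken


# Demo on a throwaway directory with one valid and one stale reference.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".claude" / "docs").mkdir(parents=True)
    (root / ".claude" / "docs" / "repo-registry.md").write_text("# Registry\n", encoding="utf-8")
    (root / "CLAUDE.md").write_text(
        "See .claude/docs/repo-registry.md and .claude/docs/missing.md\n",
        encoding="utf-8",
    )
    result = broken_references(root)

print(result)  # prints [('CLAUDE.md', '.claude/docs/missing.md')]
```

A production linter would also need to resolve paths relative to the referencing file and across repository boundaries, which this sketch omits.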

6.3 Implications for Research

Knowledge artifacts create several research questions that existing studies have not yet addressed:

Evolution. How do knowledge artifacts evolve over time? Do they follow patterns similar to Lehman's laws of software evolution [Lehman1980], or do they exhibit distinct evolutionary dynamics due to their dual-audience nature?

Effectiveness. What structures are most effective for dual-audience consumption? Chatlatanagulchai et al. [Chatlatanagulchai2025b] found that developers prioritize functional context, but it remains unknown whether this priority aligns with what AI agents most benefit from.

Knowledge transfer. Do knowledge artifacts improve team onboarding? In principle, a well-maintained artifact ecosystem encodes organizational knowledge that new team members and new AI agent sessions can immediately access. Empirical validation is needed.

Multi-developer teams. Our case study examines a single-developer organization. How do knowledge artifacts function when multiple developers maintain them? Do they create coordination benefits (shared understanding) or coordination costs (merge conflicts, inconsistent updates)?

AI-generated artifacts. As AI agents become more capable, knowledge artifacts may increasingly be authored by AI rather than humans. This raises questions about quality, consistency, and the feedback loop between artifact consumers and artifact producers.

6.4 Limitations

This work has three primary limitations. First, the taxonomy is derived from one organization's practice and from prior literature; it has not been validated through systematic analysis of multiple organizations. Second, the case study covers a single organization operated by a single founder who is also the paper's author. This is a maximal conflict of interest that readers should weigh when evaluating the case study's findings. Third, the organization is small and early-stage; the patterns observed may not scale to larger teams, longer time horizons, or organizations with less deliberate knowledge management practices.

6.5 Threats to Validity

Construct validity. Are knowledge artifacts genuinely "new," or are they simply a rebranding of existing artifact types (documentation, configuration, decision records)? Each of the four properties (dual-audience design, behavior-prescribing semantics, self-evolution, and orchestration) appears individually in prior artifact types; we argue that their combination is novel. The novelty claim therefore rests entirely on that combination, which future empirical work should test.

Internal validity. The case study examines a system deliberately designed by the paper's author. This self-selection bias means we are studying a well-functioning ecosystem, not a representative one. Knowledge artifact ecosystems that evolved organically or were poorly maintained would likely exhibit different properties. The public data from prior work [Santos2025, Chatlatanagulchai2025b, Mohsenimofidi2025] provides some grounding, but the taxonomy has not been validated against those datasets.

External validity. A solo-founder, AI-agent-heavy workflow is not representative of most software teams. The orchestration benefits (P4) may be unique to this context --- teams with multiple human developers may coordinate through meetings and chat rather than through knowledge artifacts. The self-evolution property (P3) depends on AI agents being the primary workflow actors, which is currently unusual. As AI agent adoption increases, external validity may improve, but at present, generalizability is limited.

7. Conclusion

The widespread adoption of AI coding agents has created a new category of software engineering artifact. Knowledge artifacts --- structured files designed for dual human-AI consumption --- go beyond the individual context files studied in prior work to form organizational knowledge systems with four distinguishing properties: dual-audience design, behavior-prescribing semantics, self-evolution through use, and multi-agent orchestration through natural language.

Our taxonomy identifies five knowledge artifact types (context files, agent configurations, deep context documents, decision records, and cross-system references), and our case study demonstrates how these types work together in a real organization to coordinate 8 repositories through 24 specialized AI agents. The ecosystem-level view reveals coordination and organizational memory functions that are invisible when studying individual files in isolation.

This work is a first step. The taxonomy requires validation across multiple organizations. The four properties require empirical measurement, not just qualitative observation. The case study is limited by its single-organization, single-author design. But the phenomenon is real and growing: as of January 2026, the AGENTS.md format alone has been adopted by over 60,000 repositories [Lulla2026]. Understanding how these artifacts function --- individually and as ecosystems --- is essential for both the practitioners who create them and the researchers who study AI-assisted software engineering.

Future work should pursue three directions: (1) empirical validation of the taxonomy across diverse organizations, (2) longitudinal studies of knowledge artifact evolution and decay, and (3) tool support for creating, validating, and visualizing knowledge artifact ecosystems. The 44,000+ lines of structured organizational knowledge in our case study represent one organization's approach; the field needs to understand what approaches work best, for whom, and under what conditions.

References

[Aghajani2019] E. Aghajani et al., "Software Documentation Issues Unveiled," Proc. 41st ICSE, IEEE, 2019, pp. 1199--1210, doi: 10.1109/ICSE.2019.00122.
[Ahmeti2024] B. Ahmeti et al., "Architecture Decision Records in Practice: An Action Research Study," Proc. ECSA 2024, Springer.
[Bjornson2008] F. O. Bjørnson and T. Dingsøyr, "Knowledge Management in Software Engineering: A Systematic Review," Information and Software Technology, vol. 50, no. 11, pp. 1055--1068, 2008.
[Cadavid2023] H. Cadavid et al., "Improving Hardware/Software Interface Management in Systems of Systems Through Documentation as Code," Empirical Software Engineering, vol. 28, no. 4, p. 100, 2023.
[Chatlatanagulchai2025a] W. Chatlatanagulchai et al., "On the Use of Agentic Coding Manifests: An Empirical Study of Claude Code," arXiv:2509.14744, 2025.
[Chatlatanagulchai2025b] W. Chatlatanagulchai et al., "Agent READMEs: An Empirical Study of Context Files for Agentic Coding," arXiv:2511.12884, 2025.
[Forsgren2021] N. Forsgren et al., "The SPACE of Developer Productivity," ACM Queue, vol. 19, no. 1, pp. 20--48, 2021.
[Hong2024] S. Hong et al., "MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework," arXiv:2308.00352, 2023.
[Keeling2022] M. Keeling, "The Psychology of Architecture Decision Records," IEEE Software, vol. 39, no. 6, pp. 114--117, 2022.
[Knuth1984] D. E. Knuth, "Literate Programming," The Computer Journal, vol. 27, no. 2, pp. 97--111, 1984.
[Lehman1980] M. M. Lehman, "Programs, Life Cycles, and Laws of Software Evolution," Proceedings of the IEEE, vol. 68, no. 9, pp. 1060--1076, 1980.
[Lulla2026] J. L. Lulla et al., "On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents," arXiv:2601.20404, 2026.
[Mohsenimofidi2025] S. Mohsenimofidi et al., "Context Engineering for AI Agents in Open-Source Software," arXiv:2510.21413, 2025.
[Nygard2011] M. Nygard, "Documenting Architecture Decisions," Blog post, Cognitect Inc., 2011.
[Parnas2011] D. L. Parnas, "Precise Documentation: The Key to Better Software," The Future of Software Engineering, Springer, 2011.
[Runeson2009] P. Runeson and M. Höst, "Guidelines for Conducting and Reporting Case Study Research in Software Engineering," Empirical Software Engineering, vol. 14, no. 2, pp. 131--164, 2009.
[Santos2025] H. V. F. Santos et al., "Decoding the Configuration of AI Coding Agents: Insights from Claude Code Projects," arXiv:2511.09268, 2025.
[Walsh1991] J. P. Walsh and G. R. Ungson, "Organizational Memory," Academy of Management Review, vol. 16, no. 1, pp. 57--91, 1991.
[Yin2018] R. K. Yin, Case Study Research and Applications: Design and Methods, 6th ed., SAGE Publications, 2018.
