
# Generation Overview

**Purpose:** Generation product area overview
**Detail Level:** Full reference


How does code become docs? The generation pipeline transforms annotated source code into markdown documents. It follows a four-stage architecture: Scanner → Extractor → Transformer → Codec. Codecs are pure functions — given a MasterDataset, they produce a RenderableDocument without side effects. CompositeCodec composes multiple codecs into a single document.

  • Codec purity: Every codec is a pure function (dataset in, document out). No side effects, no filesystem access. Same input always produces same output
  • Config-driven generation: A single ReferenceDocConfig produces a complete document. Content sources compose in fixed order: conventions, diagrams, shapes, behaviors
  • RenderableDocument IR: Codecs express intent (“this is a table”), the renderer handles syntax (“pipe-delimited markdown”). Switching output format requires only a new renderer
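
The decode-only contract described above can be sketched as a plain function type. This is a minimal illustration, not the production API: the real `Codec` and block types carry more fields, and names like `titleCodec` are invented here.

```typescript
// Minimal sketch of the decode-only codec contract.
// Block and document shapes are simplified; `titleCodec` is illustrative.
type Block =
  | { type: "heading"; level: number; text: string }
  | { type: "paragraph"; text: string };
type Doc = { title: string; sections: Block[] };
type Codec<D> = (dataset: D) => Doc;

// A codec is a pure function: same dataset in, same document out.
const titleCodec: Codec<{ name: string }> = (dataset) => ({
  title: dataset.name,
  sections: [{ type: "heading", level: 1, text: dataset.name }],
});

// Composing codecs is just concatenating their sections in order.
function compose<D>(codecs: Codec<D>[], title: string): Codec<D> {
  return (dataset) => ({
    title,
    sections: codecs.flatMap((codec) => codec(dataset).sections),
  });
}
```

Because nothing here touches the filesystem or mutates the dataset, two invocations with the same input are trivially comparable in snapshot tests.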

Scoped architecture diagram showing component relationships:

```mermaid
graph TB
    subgraph generator["Generator"]
        SourceMapper[/"SourceMapper"/]
        Documentation_Generation_Orchestrator("Documentation Generation Orchestrator")
        TransformDataset("TransformDataset")
        DecisionDocGenerator("DecisionDocGenerator")
    end
    subgraph renderer["Renderer"]
        PatternsCodec[("PatternsCodec")]
        DecisionDocCodec[("DecisionDocCodec")]
        CompositeCodec[("CompositeCodec")]
        ArchitectureCodec[("ArchitectureCodec")]
    end
    subgraph related["Related"]
        MasterDataset["MasterDataset"]:::neighbor
        Pattern_Scanner["Pattern Scanner"]:::neighbor
        GherkinASTParser["GherkinASTParser"]:::neighbor
        ShapeExtractor["ShapeExtractor"]:::neighbor
        ReferenceDocShowcase["ReferenceDocShowcase"]:::neighbor
        PatternRelationshipModel["PatternRelationshipModel"]:::neighbor
    end
    SourceMapper -.->|depends on| DecisionDocCodec
    SourceMapper -.->|depends on| ShapeExtractor
    SourceMapper -.->|depends on| GherkinASTParser
    Documentation_Generation_Orchestrator -->|uses| Pattern_Scanner
    PatternsCodec ..->|implements| PatternRelationshipModel
    CompositeCodec ..->|implements| ReferenceDocShowcase
    ArchitectureCodec -->|uses| MasterDataset
    TransformDataset -->|uses| MasterDataset
    TransformDataset ..->|implements| PatternRelationshipModel
    DecisionDocGenerator -.->|depends on| DecisionDocCodec
    DecisionDocGenerator -.->|depends on| SourceMapper
    classDef neighbor stroke-dasharray: 5 5
```

```typescript
/**
 * Runtime MasterDataset with optional workflow
 *
 * Extends the Zod-compatible MasterDataset with workflow reference.
 * LoadedWorkflow contains Maps which aren't JSON-serializable,
 * so it's kept separate from the Zod schema.
 */
interface RuntimeMasterDataset extends MasterDataset {
  /** Optional workflow configuration (not serializable) */
  readonly workflow?: LoadedWorkflow;
}
```
| Property | Description |
| --- | --- |
| `workflow` | Optional workflow configuration (not serializable) |
```typescript
/**
 * Raw input data for transformation
 */
interface RawDataset {
  /** Extracted patterns from TypeScript and/or Gherkin sources */
  readonly patterns: readonly ExtractedPattern[];
  /** Tag registry for category lookups */
  readonly tagRegistry: TagRegistry;
  /** Optional workflow configuration for phase names (can be undefined) */
  readonly workflow?: LoadedWorkflow | undefined;
  /** Optional rules for inferring bounded context from file paths */
  readonly contextInferenceRules?: readonly ContextInferenceRule[] | undefined;
}
```
| Property | Description |
| --- | --- |
| `patterns` | Extracted patterns from TypeScript and/or Gherkin sources |
| `tagRegistry` | Tag registry for category lookups |
| `workflow` | Optional workflow configuration for phase names (can be undefined) |
| `contextInferenceRules` | Optional rules for inferring bounded context from file paths |
```typescript
type RenderableDocument = {
  title: string;
  purpose?: string;
  detailLevel?: string;
  sections: SectionBlock[];
  additionalFiles?: Record<string, RenderableDocument>;
};

type SectionBlock =
  | HeadingBlock
  | ParagraphBlock
  | SeparatorBlock
  | TableBlock
  | ListBlock
  | CodeBlock
  | MermaidBlock
  | CollapsibleBlock
  | LinkOutBlock;

type HeadingBlock = z.infer<typeof HeadingBlockSchema>;
type TableBlock = z.infer<typeof TableBlockSchema>;
type ListBlock = z.infer<typeof ListBlockSchema>;
type CodeBlock = z.infer<typeof CodeBlockSchema>;
type MermaidBlock = z.infer<typeof MermaidBlockSchema>;

type CollapsibleBlock = {
  type: 'collapsible';
  summary: string;
  content: SectionBlock[];
};
```
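
Because codecs express intent and the renderer owns syntax, a markdown renderer reduces to a switch over block types. Below is a minimal sketch under a simplified union — the real `SectionBlock` has more members (separator, mermaid, collapsible, and so on) and its blocks carry more fields.

```typescript
// Minimal markdown renderer over a simplified SectionBlock union.
// The production union has more members; this shows the dispatch shape only.
type SimpleBlock =
  | { type: "heading"; level: number; text: string }
  | { type: "paragraph"; text: string }
  | { type: "table"; headers: string[]; rows: string[][] };

function renderBlock(block: SimpleBlock): string {
  switch (block.type) {
    case "heading":
      return `${"#".repeat(block.level)} ${block.text}`;
    case "paragraph":
      return block.text;
    case "table": {
      // The codec said "this is a table"; only here does it become
      // pipe-delimited markdown with a separator row.
      const row = (cells: string[]) => `| ${cells.join(" | ")} |`;
      return [
        row(block.headers),
        row(block.headers.map(() => "---")),
        ...block.rows.map(row),
      ].join("\n");
    }
  }
}
```

Swapping the output format (HTML, for instance) means replacing this one function, not touching any codec.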
```typescript
/**
 * Transform raw extracted data into a MasterDataset with all pre-computed views.
 *
 * This is a ONE-PASS transformation that computes:
 * - Status-based groupings (completed/active/planned)
 * - Phase-based groupings with counts
 * - Quarter-based groupings for timeline views
 * - Category-based groupings for taxonomy
 * - Source-based views (TypeScript vs Gherkin, roadmap, PRD)
 * - Aggregate statistics (counts, phase count, category count)
 * - Optional relationship index
 *
 * For backward compatibility, this function returns just the dataset.
 * Use `transformToMasterDatasetWithValidation` to get validation summary.
 *
 * @param raw - Raw dataset with patterns, registry, and optional workflow
 * @returns MasterDataset with all pre-computed views
 *
 * @example
 * ```typescript
 * const masterDataset = transformToMasterDataset({
 *   patterns: mergedPatterns,
 *   tagRegistry: registry,
 *   workflow,
 * });
 *
 * // Access pre-computed views
 * const completed = masterDataset.byStatus.completed;
 * const phase3Patterns = masterDataset.byPhase.find(p => p.phaseNumber === 3);
 * const q42024 = masterDataset.byQuarter["Q4-2024"];
 * ```
 */
function transformToMasterDataset(raw: RawDataset): RuntimeMasterDataset;
```
| Parameter | Type | Description |
| --- | --- | --- |
| `raw` | `RawDataset` | Raw dataset with patterns, registry, and optional workflow |

Returns: MasterDataset with all pre-computed views


72 patterns, 344 rules with invariants (344 total)

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Codecs implement a decode-only contract | Every codec is a pure function that accepts a MasterDataset and returns a RenderableDocument. Codecs do not perform side effects, do not write files, and do not access the filesystem. The codec contract is decode-only because the transformation is one-directional: structured data becomes a document, never the reverse. | Pure functions are deterministic and trivially testable. For the same MasterDataset, a codec always produces the same RenderableDocument. This makes snapshot testing reliable and enables codec output comparison across versions. |
| RenderableDocument is a typed intermediate representation | RenderableDocument contains a title, an ordered array of SectionBlock elements, and an optional record of additional files. Each SectionBlock is a discriminated union: heading, paragraph, table, code, list, separator, or metaRow. The renderer consumes this IR without needing to know which codec produced it. | A typed IR decouples codecs from rendering. Codecs express intent (“this is a table with these rows”) and the renderer handles syntax (“pipe-delimited markdown with separator row”). This means switching output format (e.g., HTML instead of markdown) requires only a new renderer, not changes to every codec. |
| CompositeCodec assembles documents from child codecs | CompositeCodec accepts an array of child codecs and produces a single RenderableDocument by concatenating their sections. Child codec order determines section order in the output. Separators are inserted between children by default. | Reference documents combine content from multiple domains (patterns, conventions, shapes, diagrams). Rather than building a monolithic codec that knows about all content types, CompositeCodec lets each domain own its codec and composes them declaratively. |
| ADR content comes from both Feature description and Rule prefixes | ADR structured content (Context, Decision, Consequences) can appear in two locations within a feature file. Both sources must be rendered. Silently dropping either source causes content loss. | Early ADRs used name prefixes like “Context - …” and “Decision - …” on Rule blocks to structure content. Later ADRs placed Context, Decision, and Consequences as bold-annotated prose in the Feature description, reserving Rule: blocks for invariants and design rules. Both conventions are valid. The ADR codec must handle both because the codebase contains ADRs authored in each style. The Feature description lives in pattern.directive.description. If the codec only renders Rules (via partitionRulesByPrefix), then Feature description content is silently dropped — no error, no warning. This caused confusion across two repos where ADR content appeared in the feature file but was missing from generated docs. The fix renders pattern.directive.description in buildSingleAdrDocument between the Overview metadata table and the partitioned Rules section, using renderFeatureDescription() which walks content linearly and handles prose, tables, and DocStrings with correct interleaving. |
| The markdown renderer is codec-agnostic | The renderer accepts any RenderableDocument regardless of which codec produced it. Rendering depends only on block types, not on document origin. This enables testing codecs and renderers independently. | If the renderer knew about specific codecs, adding a new codec would require renderer changes. By operating purely on the SectionBlock discriminated union, the renderer is closed for modification but open for extension via new block types. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| All feature consumers query the read model, not raw state | Code that needs pattern relationships, status groupings, cross-source resolution, or dependency information consumes the MasterDataset. Direct scanner/extractor imports are permitted only in pipeline orchestration code that builds the MasterDataset. | Bypassing the read model forces consumers to re-derive data that the MasterDataset already computes, creating duplicate logic and divergent behavior when the pipeline evolves. Exception: lint-patterns.ts is a pure stage-1 consumer. It validates annotation syntax on scanned files. No relationships, no cross-source resolution. Direct scanner consumption is correct for that use case. |
| No lossy local types | Consumers do not define local DTOs that duplicate and discard fields from ExtractedPattern. If a consumer needs a subset, the type system provides the projection — not a hand-written extraction function that becomes a barrier between the consumer and canonical data. | Lossy local types silently drop fields that later become needed, causing bugs that only surface when new MasterDataset capabilities are added and the local type lacks them. |
| Relationship resolution is computed once | Forward relationships (uses, dependsOn, implementsPatterns) and reverse lookups (usedBy, implementedBy, extendedBy) are computed in transformToMasterDataset(). No consumer re-derives these from raw pattern arrays or scanned file tags. | Re-deriving relationships in consumers duplicates the resolution logic and risks inconsistency when different consumers implement subtly different traversal or filtering rules. |
| Three named anti-patterns | These are recognized violations, serving as review criteria for new code and refactoring targets for existing code. | Without named anti-patterns, violations appear as one-off style issues rather than systematic architectural drift, making them harder to detect and communicate in code review. Naming them makes them visible in code review — including AI-assisted sessions where the default proposal is often “add a helper function.” |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Architecture generator is registered in the registry | The generator registry must contain an “architecture” generator entry available for CLI invocation. | Without a registered entry, the CLI cannot discover or invoke architecture diagram generation. |
| Architecture generator produces component diagram by default | Running the architecture generator without diagram type options must produce a component diagram with bounded context subgraphs. | A sensible default prevents users from needing to specify options for the most common use case. |
| Architecture generator supports diagram type options | The architecture generator must accept a diagram type option that selects between component and layered diagram output. | Different architectural perspectives (bounded context vs. layer hierarchy) require different diagram types, and the user must be able to select which to generate. |
| Architecture generator supports context filtering | When context filtering is applied, the generated diagram must include only patterns from the specified bounded contexts and exclude all others. | Without filtering, large monorepos would produce unreadable diagrams with dozens of bounded contexts; filtering enables focused per-context views. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| archIndex groups patterns by arch-role | Every pattern with an arch-role tag must appear in the archIndex.byRole map under its role key. | Diagram generators need O(1) lookup of patterns by role to render role-based groupings efficiently. |
| archIndex groups patterns by arch-context | Every pattern with an arch-context tag must appear in the archIndex.byContext map under its context key. | Component diagrams render bounded context subgraphs and need patterns grouped by context. |
| archIndex groups patterns by arch-layer | Every pattern with an arch-layer tag must appear in the archIndex.byLayer map under its layer key. | Layered diagrams render layer subgraphs and need patterns grouped by architectural layer. |
| archIndex.all contains all patterns with any arch tag | archIndex.all must contain exactly the set of patterns that have at least one arch tag (role, context, or layer). | Consumers iterating over all architectural patterns need a single canonical list; omitting partially-tagged patterns would silently drop them from diagrams. |
| Patterns without arch tags are excluded from archIndex | Patterns lacking all three arch tags (role, context, layer) must not appear in any archIndex view. | Including non-architectural patterns would pollute diagrams with irrelevant components. |

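Taken together, these grouping invariants amount to a single pass over the patterns. The sketch below illustrates that pass; the field names (`archRole`, `archContext`, `archLayer` on the directive) are assumptions drawn from the rules above, not the exact production types.

```typescript
// Sketch of archIndex construction. Directive field names are assumed.
interface Directive { archRole?: string; archContext?: string; archLayer?: string }
interface Pattern { name: string; directive: Directive }

function buildArchIndex(patterns: Pattern[]) {
  const byRole = new Map<string, Pattern[]>();
  const byContext = new Map<string, Pattern[]>();
  const byLayer = new Map<string, Pattern[]>();
  const all: Pattern[] = [];

  const add = (map: Map<string, Pattern[]>, key: string | undefined, p: Pattern) => {
    if (key === undefined) return;
    const bucket = map.get(key) ?? [];
    bucket.push(p);
    map.set(key, bucket);
  };

  for (const p of patterns) {
    const { archRole, archContext, archLayer } = p.directive;
    // Patterns with no arch tags at all appear in no view.
    if (archRole === undefined && archContext === undefined && archLayer === undefined) continue;
    add(byRole, archRole, p);
    add(byContext, archContext, p);
    add(byLayer, archLayer, p);
    all.push(p); // at least one arch tag present
  }
  return { byRole, byContext, byLayer, all };
}
```

Each map gives the O(1) per-dimension lookup the rationale calls for, and `all` includes partially-tagged patterns so they are not silently dropped.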
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Layered diagrams group patterns by architectural layer | Layered diagrams must render patterns grouped by architectural layer (domain, application, infrastructure) with top-to-bottom flow. | Layered architecture visualization shows dependency direction (infrastructure at top, domain at bottom) following conventional layer ordering. |
| Architecture generator is registered with generator registry | An “architecture” generator must be registered with the generator registry to enable pnpm docs:architecture via the existing generate-docs.js CLI. | The delivery-process uses a generator registry pattern. New generators register with the orchestrator rather than creating separate CLI commands. |
| Sequence diagrams render interaction flows | Sequence diagrams must render interaction flows (command flow, saga flow) showing step-by-step message passing between components. | Component diagrams show structure but not behavior. Sequence diagrams show runtime flow, which is essential for understanding command/saga execution. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Architecture tags exist in the tag registry | Three architecture-specific tags (arch-role, arch-context, arch-layer) must exist in the tag registry with correct format and enum values. | Architecture diagram generation requires metadata to classify source files into diagram components. Standard tag infrastructure enables consistent extraction via the existing AST parser. |
| AST parser extracts architecture tags from TypeScript | The AST parser must extract arch-role, arch-context, and arch-layer tags from TypeScript JSDoc comments into DocDirective objects. | Source code annotations are the single source of truth for architectural metadata. The parser must extract them alongside existing pattern metadata. |
| MasterDataset builds archIndex during transformation | The transformToMasterDataset function must build an archIndex that groups patterns by role, context, and layer for efficient diagram generation. | Single-pass extraction during dataset transformation avoids expensive re-traversal. The index structure enables O(1) lookup by each dimension. |
| Component diagrams group patterns by bounded context | Component diagrams must render patterns as nodes grouped into bounded context subgraphs, with relationship arrows using UML-inspired styles. | Component diagrams visualize system architecture, showing how bounded contexts isolate components. Subgraphs enforce visual separation. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| arch-role tag is defined in the registry | The tag registry must contain an arch-role tag with enum format and all valid architectural role values. | Without a registry-defined arch-role tag, the extractor cannot validate role values and diagrams may render invalid roles. |
| arch-context tag is defined in the registry | The tag registry must contain an arch-context tag with value format for free-form bounded context names. | Without a registry-defined arch-context tag, bounded context groupings cannot be validated and diagrams may contain arbitrary context names. |
| arch-layer tag is defined in the registry | The tag registry must contain an arch-layer tag with enum format and exactly three values: domain, application, infrastructure. | Allowing arbitrary layer values would break the fixed Clean Architecture ordering that layered diagrams depend on. |
| AST parser extracts arch-role from TypeScript annotations | The AST parser must extract the arch-role value from JSDoc annotations and populate the directive’s archRole field. | If arch-role is not extracted, patterns cannot be classified by architectural role and diagram node styling is lost. |
| AST parser extracts arch-context from TypeScript annotations | The AST parser must extract the arch-context value from JSDoc annotations and populate the directive’s archContext field. | If arch-context is not extracted, component diagrams cannot group patterns into bounded context subgraphs. |
| AST parser extracts arch-layer from TypeScript annotations | The AST parser must extract the arch-layer value from JSDoc annotations and populate the directive’s archLayer field. | If arch-layer is not extracted, layered diagrams cannot group patterns into domain/application/infrastructure subgraphs. |
| AST parser handles multiple arch tags together | When a JSDoc block contains arch-role, arch-context, and arch-layer tags, all three must be extracted into the directive. | Partial extraction would cause components to be missing from role, context, or layer groupings depending on which tag was dropped. |
| Missing arch tags yield undefined values | Arch tag fields absent from a JSDoc block must be undefined in the extracted directive, not null or empty string. | Downstream consumers distinguish between “not annotated” (undefined) and “annotated with empty value” to avoid rendering ghost nodes. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Extracts Rule blocks with Invariant and Rationale | Annotated Rule blocks must have their Invariant, Rationale, and Verified-by fields faithfully extracted and rendered. | These structured annotations are the primary content of business rules documentation; losing them silently produces incomplete output. |
| Organizes rules by product area and phase | Rules must be grouped by product area and ordered by phase number within each group. | Ungrouped or misordered rules make it impossible to find domain-specific constraints or understand their delivery sequence. |
| Summary mode generates compact output | Summary mode must produce only a statistics line and omit all detailed rule headings and content. | AI context windows have strict token limits; including full detail in summary mode wastes context budget and degrades session quality. |
| Preserves code examples and tables in detailed mode | Code examples must appear only in detailed mode and must be excluded from standard mode output. | Code blocks in standard mode clutter the overview and push important rule summaries out of view; detailed mode is the opt-in path for full content. |
| Generates scenario traceability links | Verification links must include the source file path so readers can locate the verifying scenario. | Links without file paths are unresolvable, breaking the traceability chain between business rules and their executable specifications. |
| Progressive disclosure generates detail files per product area | Each product area with rules must produce a separate detail file, and the main document must link to all detail files via an index table. | A single monolithic document becomes unnavigable at scale; progressive disclosure lets readers drill into only the product area they need. |
| Empty rules show placeholder instead of blank content | Rules with no invariant, description, or scenarios must render a placeholder message; rules with scenarios but no invariant must show the verified-by list instead. | Blank rule sections are indistinguishable from rendering bugs; explicit placeholders signal intentional incompleteness versus broken extraction. |
| Rules always render flat for full visibility | Rule output must never use collapsible blocks regardless of rule count; all rule headings must be directly visible. | Business rules are compliance-critical content; hiding them behind collapsible sections risks rules being overlooked during review. |
| Source file shown as filename text | Source file references must render as plain filename text, not as markdown links. | Markdown links to local file paths break in every viewer except the local filesystem, producing dead links that erode trust in the documentation. |
| Verified-by renders as checkbox list at standard level | Verified-by must render as a checkbox list of scenario names, with duplicate names deduplicated. | Duplicate entries inflate the checklist and mislead reviewers into thinking more verification exists than actually does. |
| Feature names are humanized from camelCase pattern names | CamelCase pattern names must be converted to space-separated headings with trailing “Testing” suffixes stripped. | Raw camelCase names are unreadable in documentation headings, and “Testing” suffixes leak implementation concerns into user-facing output. |

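The humanization rule in the last row is small enough to sketch directly. The helper name is hypothetical; the production implementation may split words differently (e.g. around acronyms).

```typescript
// Sketch: humanize a camelCase pattern name into a heading,
// stripping a trailing "Testing" suffix. Name is illustrative.
function humanizePatternName(name: string): string {
  const withoutSuffix = name.replace(/Testing$/, "");
  // Insert a space at each lowercase/digit-to-uppercase boundary.
  return withoutSuffix.replace(/([a-z0-9])([A-Z])/g, "$1 $2").trim();
}
```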
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Extracts Rule blocks with Invariant and Rationale | Every Rule: block with **Invariant:** annotation must be extracted. Rules without annotations are included with rule name only. | Business rules are the core domain constraints. Extracting them separately from acceptance criteria creates a focused reference document for domain understanding. |
| Organizes rules by domain category and phase | Rules are grouped first by domain category (from @libar-docs-* flags), then by phase number for temporal ordering. | Domain-organized documentation helps stakeholders find rules relevant to their area of concern without scanning all rules. |
| Preserves code examples and comparison tables | DocStrings ("""typescript) and tables in Rule descriptions are rendered in the business rules document. | Code examples and tables provide concrete understanding of abstract rules. Removing them loses critical context. |
| Generates scenario traceability links | Each rule’s **Verified by:** section generates links to the scenarios that verify the rule. | Traceability enables audit compliance and helps developers find relevant tests when modifying rules. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Claude module tags exist in the tag registry | Three claude-specific tags (claude-module, claude-section, claude-tags) must exist in the tag registry with correct format and values. | Module generation requires metadata to determine output path, section placement, and variation filtering. Standard tag infrastructure enables consistent extraction via the existing Gherkin parser. |
| Gherkin parser extracts claude module tags from feature files | The Gherkin extractor must extract claude-module, claude-section, and claude-tags from feature file tags into ExtractedPattern objects. | Behavior specs are the source of truth for CLAUDE.md module content. The parser must extract module metadata alongside existing pattern metadata. |
| Module content is extracted from feature file structure | The codec must extract content from standard feature file elements: Feature description (Problem/Solution), Rule blocks, and Scenario Outline Examples. | Behavior specs already contain well-structured, prescriptive content. The extraction preserves structure rather than flattening to prose. |
| ClaudeModuleCodec produces compact markdown modules | The codec transforms patterns with claude tags into markdown files suitable for the _claude-md/ directory structure. | CLAUDE.md modules must be compact and actionable. The codec produces ready-to-use markdown without truncation (let modular-claude-md handle token budget warnings). |
| Claude module generator writes files to correct locations | The generator must write module files to {outputDir}/{section}/{module}.md based on the claude-section and claude-module tags. | Output path structure must match modular-claude-md expectations. The claude-section determines the subdirectory, claude-module determines the filename. |
| Claude module generator is registered with generator registry | A “claude-modules” generator must be registered with the generator registry to enable pnpm docs:claude-modules via the existing CLI. | Consistent with the architecture-diagram-generation pattern: new generators register with the orchestrator rather than creating separate commands. |
| Same source generates detailed docs with progressive disclosure | When running with detailLevel: "detailed", the codec produces expanded documentation including all Rule content, code examples, and scenario details. | A single source generates both compact modules (AI context) and detailed docs (human reference). Progressive disclosure is already a codec capability. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| CodecBasedGenerator adapts codecs to generator interface | CodecBasedGenerator delegates document generation to the underlying codec and surfaces codec errors through the generator interface. | The adapter pattern enables codec-based rendering to integrate with the existing orchestrator without modifying either side. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Timeline codecs group patterns by phase and status | Roadmap shows planned work, Milestones shows completed work, CurrentWork shows active patterns only. | Mixing statuses across timeline views would bury actionable information and make it impossible to distinguish planned from delivered work. |
| Session codecs provide working context for AI sessions | SessionContext shows active patterns with deliverables. RemainingWork aggregates incomplete work by phase. | AI sessions without curated context waste tokens on irrelevant patterns, and unaggregated remaining work obscures project health. |
| Requirements codec produces PRD-style documentation | Features include problem, solution, and business value. Acceptance criteria are formatted with bold keywords. | Omitting problem/solution context produces specs that lack justification, and unformatted acceptance criteria are difficult to scan. |
| Reporting codecs support release management and auditing | Changelog follows the Keep a Changelog format. Traceability maps rules to scenarios. | Non-standard changelog formats break tooling that parses release notes, and unmapped rules represent unverified business constraints. |
| Planning codecs support implementation sessions | Planning checklist includes DoD items. Session plan shows implementation steps. | Missing DoD items in checklists allow incomplete patterns to pass validation, and sessions without implementation steps lose focus. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Config-driven codec replaces per-document recipe features | A single ReferenceDocConfig object is sufficient to produce a complete reference document. No per-document codec subclass or recipe feature is required. | The codec composition logic is identical across all reference documents. Only the content sources differ. Extracting this into a config-driven factory eliminates N duplicated recipe features and makes adding new documents a one-line config addition. |
| Four content sources compose in AD-5 order | Reference documents always compose content in this order: conventions, then scoped diagrams, then shapes, then behaviors. Empty sources are omitted without placeholder sections. | AD-5 established that conceptual context (conventions and architectural diagrams) should precede implementation details (shapes and behaviors). This reading order helps developers understand the “why” before the “what”. |
| Detail level controls output density | Three detail levels produce progressively more content from the same config. Summary: type tables only, no diagrams, no narrative. Standard: narrative and code examples, no rationale. Detailed: full rationale, property documentation, and scoped diagrams. | AI context windows need compact summaries. Human readers need full documentation. The same config serves both audiences by parameterizing the detail level at generation time. |
| Generator registration produces paired detailed and summary outputs | Each ReferenceDocConfig produces exactly two generators (detailed for docs/, summary for _claude-md/) plus a meta-generator that invokes all pairs. Total: N configs x 2 + 1 = 2N + 1 generators. | Every reference document needs both a human-readable detailed version and an AI-optimized compact version. The meta-generator enables pnpm docs:all to produce every reference document in one pass. |

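A plausible shape for this config-driven flow is sketched below. The field names (`id`, `conventions`, `diagrams`, `shapes`, `behaviors`) and the generator-name scheme are assumptions for illustration; the actual ReferenceDocConfig may differ.

```typescript
// Hypothetical sketch of config-driven generation. Field names and the
// generator naming scheme are assumptions, not the production API.
type DetailLevel = "summary" | "standard" | "detailed";

interface ReferenceDocConfig {
  id: string;
  title: string;
  // Content sources compose in fixed AD-5 order; empty ones are omitted.
  conventions?: string[];
  diagrams?: string[];
  shapes?: string[];
  behaviors?: string[];
}

// Each config yields a detailed/summary generator pair; one meta-generator
// invokes every pair, giving 2N + 1 generators for N configs.
function generatorNames(configs: ReferenceDocConfig[]): string[] {
  const pairs = configs.flatMap((c) => [`${c.id}:detailed`, `${c.id}:summary`]);
  return [...pairs, "reference-docs:all"];
}
```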
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Component diagrams group patterns by bounded context | Each distinct arch-context value must produce exactly one Mermaid subgraph containing all patterns with that context. | Without subgraph grouping, the visual relationship between components and their bounded context is lost, making the diagram structurally meaningless. |
| Context-less patterns go to Shared Infrastructure | Patterns without an arch-context value must be placed in a “Shared Infrastructure” subgraph, never omitted from the diagram. | Cross-cutting infrastructure components (event bus, logger) belong to no bounded context but must still appear in the diagram. |
| Relationship types render with distinct arrow styles | Each relationship type must render with its designated Mermaid arrow style: uses (`-->`), depends-on (`-.->`), implements (`..->`), extends (`-->>`). | Distinct arrow styles convey dependency semantics visually; conflating them loses architectural information. |
| Arrows only connect annotated components | Relationship arrows must only be rendered when both source and target patterns exist in the architecture index. | Rendering an arrow to a non-existent node would produce invalid Mermaid syntax or dangling references. |
| Component diagram includes summary section | The generated component diagram document must include an Overview section with component count and bounded context count. | Without summary counts, readers cannot quickly assess diagram scope or detect missing components. |
| Component diagram includes legend when enabled | When the legend is enabled, the document must include a Legend section explaining relationship arrow styles. | Without a legend, readers cannot distinguish uses, depends-on, implements, and extends arrows, making relationship semantics ambiguous. |
| Component diagram includes inventory table when enabled | When the inventory is enabled, the document must include a Component Inventory table with Component, Context, Role, and Layer columns. | The inventory provides a searchable, text-based alternative to the visual diagram for tooling and accessibility. |
| Empty architecture data shows guidance message | When no patterns have architecture annotations, the document must display a guidance message explaining how to add arch tags. | An empty diagram with no explanation would be confusing; guidance helps users onboard to the annotation system. |


| Rule | Invariant | Rationale |
| --- | --- | --- |
| CompositeCodec concatenates sections in codec array order | Sections from child codecs appear in the composite output in the same order as the codecs array. | Non-deterministic section ordering would make generated documents unstable across runs, breaking diff-based review workflows. |
| Separators between codec outputs are configurable | By default, a separator block is inserted between each child codec’s sections. When `separateSections` is false, no separators are added. | Without configurable separators, consumers cannot control visual grouping — some documents need clear boundaries between codec outputs while others need seamless flow. |
| additionalFiles merge with last-wins semantics | `additionalFiles` from all children are merged into a single record. When keys collide, the later codec’s value wins. | Silently dropping colliding keys would lose content without warning, while throwing on collision would prevent composing codecs that intentionally override shared file paths. |
| composeDocuments works at document level without codecs | `composeDocuments` accepts a RenderableDocument array and produces a composed RenderableDocument without requiring codecs. | Requiring a full codec instance for simple document merging would force unnecessary schema definitions when callers already hold pre-rendered documents. |
| Empty codec outputs are handled gracefully | Codecs producing empty sections arrays contribute nothing to the output. No separator is emitted for empty outputs. | Emitting separators around empty sections would produce orphaned dividers in the generated markdown, creating visual noise with no content between them. |
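The composition invariants above can be sketched as a minimal `composeDocuments` in TypeScript. The simplified `RenderableDocument` shape and the `"---"` separator value are assumptions for illustration, not the real API:

```typescript
// Minimal sketch of document-level composition. The RenderableDocument shape
// and the "---" separator value are simplifying assumptions, not the real API.
interface RenderableDocument {
  sections: string[];
  additionalFiles?: Record<string, string>;
}

function composeDocuments(
  docs: RenderableDocument[],
  separateSections = true,
): RenderableDocument {
  const sections: string[] = [];
  const additionalFiles: Record<string, string> = {};
  for (const doc of docs) {
    // Empty outputs contribute nothing, so no orphaned separators are emitted.
    if (doc.sections.length === 0) continue;
    if (separateSections && sections.length > 0) sections.push("---");
    sections.push(...doc.sections); // child order is preserved
    // Last-wins merge: a later document's file overrides a colliding key.
    Object.assign(additionalFiles, doc.additionalFiles ?? {});
  }
  return { sections, additionalFiles };
}
```

With `separateSections` set to false, the separator push is skipped, giving the seamless-flow behavior described in the table.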

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Duplicate detection uses content fingerprinting | Content with identical normalized text must produce identical fingerprints. | Fingerprinting enables efficient duplicate detection without full text comparison. |
| Duplicates are merged based on source priority | Higher-priority sources take precedence when merging duplicate content. | TypeScript sources have richer JSDoc; feature files provide behavioral context. |
| Section order is preserved after deduplication | Section order matches the source mapping table order after deduplication. | Predictable ordering ensures consistent documentation structure. |
| Deduplicator integrates with source mapper pipeline | Deduplication runs after extraction and before document assembly. | All content must be extracted before duplicates can be identified. |
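A hedged sketch of fingerprint-based deduplication follows. The real implementation likely hashes the normalized text rather than using it directly, and the normalization rules here (lowercase, collapsed whitespace) are assumptions:

```typescript
// Normalize whitespace and case so equivalent content collides on the same key.
// A real fingerprint would hash this string; the raw key suffices for a sketch.
function fingerprint(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

interface Section {
  text: string;
  priority: number; // higher = richer source (e.g., TypeScript JSDoc)
}

function dedupe(sections: Section[]): Section[] {
  const seen = new Map<string, Section>();
  const order: string[] = [];
  for (const s of sections) {
    const fp = fingerprint(s.text);
    const existing = seen.get(fp);
    if (!existing) {
      seen.set(fp, s);
      order.push(fp); // first-occurrence order is preserved
    } else if (s.priority > existing.priority) {
      seen.set(fp, s); // higher-priority source wins the merge
    }
  }
  return order.map((fp) => seen.get(fp)!);
}
```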

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Empty and missing inputs produce empty results | Extraction with no tags or no matching patterns always produces an empty result. | Callers must be able to distinguish “no conventions found” from errors without special-casing nulls or exceptions. |
| Convention bundles are extracted from matching patterns | Each unique convention tag produces exactly one bundle, and patterns sharing a tag are merged into that bundle. | Without tag-based grouping and merging, convention content would be fragmented across duplicates, making downstream rendering unreliable. |
| Structured content is extracted from rule descriptions | Invariant, rationale, and table content embedded in rule descriptions must be extracted as structured metadata, not raw text. | Downstream renderers depend on structured fields to produce consistent documentation; unstructured text would require re-parsing at every consumption point. |
| Code examples in rule descriptions are preserved | Fenced code blocks (including Mermaid diagrams) in rule descriptions must be extracted as typed code examples and never discarded. | Losing code examples during extraction would silently degrade generated documentation, removing diagrams and samples authors intended to publish. |
| TypeScript JSDoc conventions are extracted alongside Gherkin | TypeScript JSDoc and Gherkin convention sources sharing the same tag must merge into a single bundle with all rules preserved from both sources. | Conventions are defined across both TypeScript and Gherkin; failing to merge them would split a single logical convention into incomplete fragments. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Include tag routes content to named documents | A pattern or shape with `libar-docs-include:X` appears in any reference document whose `includeTags` contains X. The tag is CSV, so `libar-docs-include:X,Y` routes the item to both document X and document Y. This is additive — the item also appears in any document whose existing selectors (`conventionTags`, `behaviorCategories`, `shapeSelectors`) would already select it. | Content-to-document is a many-to-many relationship. A type definition may be relevant to an architecture overview, a configuration guide, and an AI context section. The include tag expresses this routing at the source, next to the code, without requiring the document configs to enumerate every item by name. |
| Include tag scopes diagrams (replaces arch-view) | `DiagramScope.include` matches patterns whose `libar-docs-include` values contain the specified scope value. This is the same field that existed as `archView` — renamed for consistency with the general-purpose include tag. Patterns with `libar-docs-include:pipeline-stages` appear in any DiagramScope with `include: pipeline-stages`. | The experimental `arch-view` tag was diagram-specific routing under a misleading name. Renaming to `include` unifies the vocabulary: one tag, two consumption points (diagram scoping via `DiagramScope.include`, content routing via `ReferenceDocConfig.includeTags`). |
| Shapes use include tag for document routing | A declaration tagged with both `libar-docs-shape` and `libar-docs-include` has its include values stored on the ExtractedShape. The reference codec uses these values alongside `shapeSelectors` for shape filtering. A shape with `libar-docs-include:X` appears in any document whose `includeTags` contains X, regardless of whether the shape matches any shapeSelector. | Shape extraction (via `libar-docs-shape`) and document routing (via `libar-docs-include`) are orthogonal concerns. A shape must be extracted before it can be routed. The shape tag triggers extraction; the include tag controls which documents render it. This separation allows one shape to appear in multiple documents without needing multiple group values. |
| Conventions use include tag for selective inclusion | A decision record or convention pattern with `libar-docs-include:X` appears in a reference document whose `includeTags` contains X. This allows selecting a single convention rule for a focused document without pulling all conventions matching a broad conventionTag. | Convention content is currently selected by `conventionTags`, which pulls all decision records tagged with a given convention value. For showcase documents or focused guides, this is too coarse. The include tag enables cherry-picking individual conventions alongside broad tag-based selection. |
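The CSV parsing and the additive routing check can be sketched in TypeScript; the function names are illustrative, not the project's actual API:

```typescript
// The tag value is CSV: "X,Y" routes the item to documents X and Y.
function parseIncludeTag(value: string): string[] {
  return value
    .split(",")
    .map((v) => v.trim())
    .filter((v) => v.length > 0);
}

// Routing is additive: this check runs alongside a document's other
// selectors (conventionTags, behaviorCategories, shapeSelectors).
function isRoutedToDocument(
  includeValues: string[],
  docIncludeTags: string[],
): boolean {
  return includeValues.some((v) => docIncludeTags.includes(v));
}
```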

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Rule blocks are partitioned by semantic prefix | Decision document rules must be partitioned into ADR sections based on their semantic prefix (e.g., “Decision:”, “Context:”, “Consequence:”), with non-standard rules placed in an “other” category. | Semantic partitioning produces structured ADR output that follows the standard ADR format — unpartitioned rules would generate a flat, unnavigable document. |
| DocStrings are extracted with language tags | DocStrings within rule descriptions must be extracted preserving their language tag (e.g., typescript, bash), defaulting to “text” when no language is specified. | Language tags enable syntax highlighting in generated markdown code blocks — losing the tag produces unformatted code that is harder to read. |
| Source mapping tables are parsed from rule descriptions | Markdown tables in rule descriptions with source mapping columns must be parsed into structured data, returning empty arrays when no table is present. | Source mapping tables drive the extraction pipeline — they define which files to read and what content to extract for each decision section. |
| Self-reference markers are correctly detected | The “THIS DECISION” marker must be recognized as a self-reference to the current decision document, with optional rule name qualifiers parsed correctly. | Self-references enable decisions to extract content from their own rules — misdetecting them would trigger file-system lookups for a non-existent “THIS DECISION” file. |
| Extraction methods are normalized to known types | Extraction method strings from source mapping tables must be normalized to canonical method names for dispatcher routing. | Users may write extraction methods in various formats (e.g., “Decision rule description”, “extract-shapes”) — normalization ensures consistent dispatch regardless of formatting. |
| Complete decision documents are parsed with all content | A complete decision document must be parseable into its constituent parts including rules, DocStrings, source mappings, and self-references in a single parse operation. | Complete parsing validates that all codec features compose correctly — partial parsing could miss interactions between features. |
| Rules can be found by name with partial matching | Rules must be findable by exact name match or partial (substring) name match, returning undefined when no match exists. | Partial matching supports flexible cross-references between decisions — requiring exact matches would make references brittle to minor naming changes. |
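The partitioning and lookup rules can be sketched like this; the prefix list and function names are assumptions derived from the invariants above:

```typescript
// Semantic prefixes that map rules into standard ADR sections (assumed list).
const ADR_PREFIXES = ["Decision:", "Context:", "Consequence:"] as const;

function partitionRules(ruleNames: string[]): Record<string, string[]> {
  const buckets: Record<string, string[]> = { other: [] };
  for (const name of ruleNames) {
    const prefix = ADR_PREFIXES.find((p) => name.startsWith(p));
    if (prefix) (buckets[prefix] ??= []).push(name);
    else buckets.other.push(name); // non-standard rules land in "other"
  }
  return buckets;
}

// Exact match first, then partial (substring) match; undefined when absent.
function findRule(ruleNames: string[], query: string): string | undefined {
  return (
    ruleNames.find((n) => n === query) ?? ruleNames.find((n) => n.includes(query))
  );
}
```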

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Output paths are determined from pattern metadata | Output file paths must be derived from pattern metadata using kebab-case conversion of the pattern name, with configurable section prefixes. | Consistent path derivation ensures generated files are predictable and linkable — ad-hoc paths would break cross-document references. |
| Compact output includes only essential content | Compact output mode must include only essential decision content (type shapes, key constraints) while excluding full descriptions and verbose sections. | Compact output is designed for AI context windows where token budget is limited — including full descriptions would negate the space savings. |
| Detailed output includes full content | Detailed output mode must include all decision content including full descriptions, consequences, and DocStrings rendered as code blocks. | Detailed output serves as the complete human reference — omitting any section would force readers to consult source files for the full picture. |
| Multi-level generation produces both outputs | The generator must produce both compact and detailed output files from a single generation run, using the pattern name or `patternName` tag as the identifier. | Both output levels serve different audiences (AI vs human) — generating them together ensures consistency and eliminates the risk of one becoming stale. |
| Generator is registered with the registry | The decision document generator must be registered with the generator registry under a canonical name and must filter input patterns to only those with source mappings. | Registry registration enables discovery via `--list-generators` — filtering to source-mapped patterns prevents empty output for patterns without decision metadata. |
| Source mappings are executed during generation | Source mapping tables must be executed during generation to extract content from referenced files, with missing files reported as validation errors. | Source mappings are the bridge between decision specs and implementation — unexecuted mappings produce empty sections, while silent missing-file errors hide broken references. |
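Path derivation from the first rule above can be sketched as a kebab-case conversion plus a configurable section prefix; the default prefix here is illustrative:

```typescript
// Kebab-case conversion: break CamelCase boundaries, then lowercase.
function toKebabCase(name: string): string {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2")
    .replace(/[\s_]+/g, "-")
    .toLowerCase();
}

// Section prefix is configurable; "decisions" is an assumed default.
function outputPath(patternName: string, sectionPrefix = "decisions"): string {
  return `${sectionPrefix}/${toKebabCase(patternName)}.md`;
}
```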

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Tabs are normalized to spaces before dedent | Tab characters must be converted to spaces before calculating the minimum indentation level. | Mixing tabs and spaces produces incorrect indentation calculations — normalizing first ensures consistent dedent depth. |
| Empty lines are handled correctly | Empty lines (including lines with only whitespace) must not affect the minimum indentation calculation and must be preserved in output. | Counting whitespace-only lines as indented content would distort the minimum indentation calculation: a blank line would drag the minimum to zero, causing non-empty lines to retain unwanted leading spaces. |
| Single line input is handled | Single-line input must have its leading whitespace removed without errors or unexpected transformations. | Failing or returning empty output on single-line input would break callers that extract individual lines from multi-line DocStrings. |
| Unicode whitespace is handled | Non-breaking spaces and other Unicode whitespace characters must be treated as content, not as indentation to be removed. | Stripping Unicode whitespace as indentation would corrupt intentional formatting in source code and documentation content. |
| Relative indentation is preserved | After removing the common leading whitespace, the relative indentation between lines must remain unchanged. | Altering relative indentation would break the syntactic structure of extracted code blocks, making them unparseable or semantically incorrect. |
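Taken together, the dedent invariants suggest an implementation along these lines. This is a sketch, assuming a two-space default tab width:

```typescript
// Dedent sketch: tabs normalized first, whitespace-only lines excluded from
// the minimum, only ASCII-space indentation removed (Unicode whitespace such
// as non-breaking spaces counts as content), relative indentation preserved.
function dedent(input: string, tabWidth = 2): string {
  const lines = input
    .split("\n")
    .map((l) => l.replace(/\t/g, " ".repeat(tabWidth)));
  const indents = lines
    .filter((l) => l.trim().length > 0) // skip empty/whitespace-only lines
    .map((l) => l.match(/^ */)![0].length); // only ASCII spaces count
  const min = indents.length > 0 ? Math.min(...indents) : 0;
  return lines
    .map((l) => (l.trim().length > 0 ? l.slice(min) : l)) // blanks preserved
    .join("\n");
}
```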

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Leading headers are stripped from pattern descriptions | Markdown headers at the start of a pattern description are removed before rendering to prevent duplicate headings under the Description section. | The codec already emits a `## Description` header; preserving the source header would create a redundant or conflicting heading hierarchy. |
| Edge cases are handled correctly | Header stripping handles degenerate inputs (header-only, whitespace-only, mid-description headers) without data loss or rendering errors. | Patterns with unusual descriptions (header-only stubs, whitespace padding) are common in early roadmap stages; crashing on these would block documentation generation for the entire dataset. |
| stripLeadingHeaders removes only leading headers | The helper function strips only headers that appear before any non-header content; headers occurring after body text are preserved. | Mid-description headers are intentional structural elements authored by the user; stripping them would silently destroy document structure. |
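A minimal sketch of the `stripLeadingHeaders` behavior described above; the real helper may differ in signature and edge-case handling:

```typescript
// Strip headers (and blank lines) only before the first non-header content.
// Headers after body text are intentional structure and are preserved.
function stripLeadingHeaders(description: string): string {
  const lines = description.split("\n");
  let i = 0;
  while (i < lines.length) {
    const line = lines[i].trim();
    if (line === "" || /^#{1,6}\s/.test(line)) i++;
    else break;
  }
  return lines.slice(i).join("\n");
}
```

A header-only stub yields an empty string rather than an error, matching the degenerate-input rule.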

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Behavior files are verified during pattern extraction | Every timeline pattern must report whether its corresponding behavior file exists. | Without verification at extraction time, traceability reports would silently include broken references to non-existent behavior files. |
| Traceability coverage reports verified and unverified behavior files | Coverage reports must distinguish between patterns with verified behavior files and those without. | Conflating verified and unverified coverage would overstate test confidence, hiding gaps that should be addressed before release. |
| Pattern names are transformed to human-readable display names | Display names must convert CamelCase to title case, handle consecutive capitals, and respect explicit title overrides. | CamelCase identifiers are unreadable in generated documentation; human-readable names are essential for non-developer consumers of pattern registries. |
| PRD acceptance criteria are formatted with numbering and bold keywords | PRD output must number acceptance criteria and bold Given/When/Then keywords when steps are enabled. | Unnumbered criteria are difficult to reference in reviews; unformatted step keywords blend into prose, making scenarios harder to parse visually. |
| Business values are formatted for human readability | Hyphenated business value tags must be converted to space-separated readable text in all output contexts. | Raw hyphenated tags like `enable-rich-prd` are annotation artifacts; displaying them verbatim in generated docs confuses readers expecting natural language. |
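The display-name and business-value rules can be sketched with two small helpers; the regexes and names are assumptions, not the project's actual code:

```typescript
// CamelCase -> title case, keeping consecutive capitals ("PRD") grouped,
// and honoring an explicit title override when one is provided.
function toDisplayName(patternName: string, titleOverride?: string): string {
  if (titleOverride) return titleOverride;
  return patternName
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // "DocCodec" -> "Doc Codec"
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1 $2"); // "PRDCodec" -> "PRD Codec"
}

// Hyphenated tags become space-separated readable text.
function formatBusinessValue(tag: string): string {
  return tag.split("-").join(" ");
}
```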

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Context - Manual documentation maintenance does not scale | Documentation must be generated from annotated source code, never manually maintained as a separate artifact. | Manual documentation drifts from source as the codebase evolves, creating stale references that mislead both humans and AI coding sessions. |
| Decision - Decisions own convention content and durable context, code owns details | Each content type (intro/rationale, rules/examples, API types) is owned by exactly one source type (decision, behavior spec, or code). | Shared ownership leads to conflicting updates and ambiguous authority over what the “correct” version is. |
| Proof of Concept - Self-documentation validates the pattern | The documentation generation pattern must be validated by generating documentation about itself from its own annotated sources. | A self-referential proof of concept exposes extraction gaps and source mapping issues that synthetic test data would miss. This POC demonstrates the doc-from-decision pattern by generating docs about ITSELF. The DocGenerationProofOfConcept pattern produces: |
| Expected Output - Compact claude module structure | Compact output must contain only essential tables and type names, with no JSDoc comments or implementation details. | AI context windows are finite; including non-essential content displaces actionable information and degrades session effectiveness. |
| Consequences - Durable sources with clear ownership boundaries | Decision documents remain the authoritative source for intro, rationale, and convention content until explicitly superseded. | Without durable ownership, documentation sections lose their authoritative source and degrade into unattributed prose that no one updates. |
| Consequences - Design stubs live in stubs, not src | Pre-implementation design stubs must reside in `delivery-process/stubs/`, never in `src/`. | Stubs in `src/` require ESLint exceptions, create confusion between production and design code, and risk accidental imports of unimplemented functions. |
| Decision - Source mapping table parsing and extraction method dispatch | The source mapping table in a decision document defines how documentation sections are assembled from multiple source files. | Without a declarative mapping, generators must hard-code source-to-section relationships, making the system brittle to new document types. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Orchestrator coordinates full documentation generation pipeline | Non-overlapping patterns from TypeScript and Gherkin sources must merge into a unified dataset; overlapping pattern names must fail with a conflict error. | Silent merging of conflicting patterns would produce incorrect documentation — fail-fast ensures data integrity across the pipeline. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Single-line descriptions are returned as-is when complete | A single-line description that ends with sentence-ending punctuation is returned verbatim; one without gets an appended ellipsis. | Summaries appear in pattern tables where readers expect grammatically complete text; an ellipsis signals intentional truncation rather than a rendering bug. |
| Multi-line descriptions are combined until sentence ending | Lines are concatenated until a sentence-ending punctuation mark is found or the character limit is reached, whichever comes first. | Splitting at arbitrary line breaks produces sentence fragments that lose meaning; combining until a natural boundary preserves semantic completeness. |
| Long descriptions are truncated at sentence or word boundaries | Summaries exceeding the character limit are truncated at the nearest sentence boundary if possible, otherwise at a word boundary with an appended ellipsis. | Sentence-boundary truncation preserves semantic completeness; word-boundary fallback avoids mid-word breaks. |
| Tautological and header lines are skipped | Lines that merely repeat the pattern name or consist only of a section header label (e.g., “Problem:”, “Solution:”) are skipped; the summary begins with the first substantive line. | Tautological opening lines waste the limited summary space without adding information. |
| Edge cases are handled gracefully | Degenerate inputs (empty strings, markdown-only content, bold markers) produce valid output without errors: empty input yields empty string, formatting is stripped, and multiple sentence endings use the first. | Summary extraction runs on every pattern in the dataset; an unhandled edge case would crash the entire documentation generation pipeline. |
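The truncation rules can be sketched as follows; the 120-character limit and the boundary heuristics are assumptions, not the codec's actual values:

```typescript
// Truncate a summary at the nearest sentence boundary inside the limit,
// falling back to a word boundary with an ellipsis. Short input that already
// ends with sentence punctuation is returned verbatim.
function truncateSummary(text: string, limit = 120): string {
  if (text.length <= limit) {
    return /[.!?]$/.test(text) ? text : text + "…";
  }
  const slice = text.slice(0, limit);
  const sentenceEnd = Math.max(
    slice.lastIndexOf(". "),
    slice.lastIndexOf("! "),
    slice.lastIndexOf("? "),
  );
  if (sentenceEnd > 0) return slice.slice(0, sentenceEnd + 1);
  const wordEnd = slice.lastIndexOf(" ");
  return (wordEnd > 0 ? slice.slice(0, wordEnd) : slice) + "…";
}
```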

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Orchestrator coordinates full documentation generation pipeline | Orchestrator merges TypeScript and Gherkin patterns, handles conflicts, and produces requested document types. | Without centralized orchestration, consumers would wire pipelines independently, leading to inconsistent merging and silent data loss. |
| Registry manages generator registration and retrieval | Registry prevents duplicate names, returns undefined for unknown generators, and lists available generators alphabetically. | Duplicate registrations would silently overwrite generators, causing unpredictable output depending on registration order. |
| CodecBasedGenerator adapts codecs to generator interface | The generator delegates to the underlying codec for transformation; a missing MasterDataset produces a descriptive error. | If the adapter silently proceeds without a MasterDataset, codecs receive undefined input and produce corrupt or empty documents. |
| Orchestrator supports PR changes generation options | PR changes can filter by git diff, changed files, or release version. | Without filtering, PR documentation would include all patterns in the dataset, making change review noisy and hiding actual modifications. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Registry manages generator registration and retrieval | Each generator name is unique within the registry; duplicate registration is rejected and lookup of unknown names returns undefined. | Allowing duplicate names would silently overwrite an existing generator, causing previously registered behavior to disappear without warning. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Repository prefixes are stripped from implementation paths | Implementation file paths must not contain repository-level prefixes like `libar-platform/` or `monorepo/`. | Generated links are relative to the output directory; repository prefixes produce broken paths. |
| All implementation links in a pattern are normalized | Every implementation link in a pattern document must have its path normalized, regardless of how many implementations exist. | A single un-normalized link in a multi-implementation pattern produces a broken reference that undermines trust in the entire generated document. |
| normalizeImplPath strips known prefixes | `normalizeImplPath` removes only recognized repository prefixes from the start of a path and leaves all other path segments unchanged. | Over-stripping would corrupt legitimate path segments that happen to match a prefix name, producing silent broken links. |
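A sketch of the prefix-stripping rule; the known-prefix list comes from the examples above, and the function name mirrors the rule title:

```typescript
// Repository-level prefixes that must not appear in generated links.
const KNOWN_PREFIXES = ["libar-platform/", "monorepo/"];

// Only a prefix at the start of the path is stripped; interior segments
// that happen to match a prefix name are left alone.
function normalizeImplPath(path: string): string {
  for (const prefix of KNOWN_PREFIXES) {
    if (path.startsWith(prefix)) return path.slice(prefix.length);
  }
  return path;
}
```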

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Layered diagrams group patterns by arch-layer | Each distinct `arch-layer` value must produce exactly one Mermaid subgraph containing all patterns with that layer. | Without layer subgraphs, the Clean Architecture boundary between domain, application, and infrastructure is not visually enforced. |
| Layer order is domain to infrastructure (top to bottom) | Layer subgraphs must be rendered in Clean Architecture order: domain first, then application, then infrastructure. | The visual order reflects the dependency rule where outer layers depend on inner layers; reversing it would misrepresent the architecture. |
| Context labels included in layered diagram nodes | Each node in a layered diagram must include its bounded context name as a label, since context is not conveyed by subgraph grouping. | Layered diagrams group by layer, not context, so the context label is the only way to identify which bounded context a node belongs to. |
| Patterns without layer go to Other subgraph | Patterns that have `arch-role` or `arch-context` but no `arch-layer` must be placed in an “Other” subgraph, never omitted from the diagram. | Omitting unlayered patterns would silently hide architectural components; the “Other” group makes their missing classification visible. |
| Layered diagram includes summary section | The generated layered diagram document must include an Overview section with annotated source file count. | Without summary counts, readers cannot assess diagram completeness or detect missing annotated sources. |
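The grouping and ordering rules above can be sketched as a small Mermaid emitter; the node syntax is simplified, and the real codec emits richer labels and styling:

```typescript
interface Pat {
  name: string;
  layer?: string; // arch-layer value, when annotated
  context: string; // bounded context name
}

function emitLayeredDiagram(patterns: Pat[]): string {
  // Clean Architecture order, with a trailing bucket for unlayered patterns.
  const order = ["domain", "application", "infrastructure", "Other"];
  const byLayer = new Map<string, Pat[]>();
  for (const p of patterns) {
    const layer = p.layer ?? "Other"; // unlayered patterns are never omitted
    if (!byLayer.has(layer)) byLayer.set(layer, []);
    byLayer.get(layer)!.push(p);
  }
  const lines = ["graph TB"];
  for (const layer of order) {
    const pats = byLayer.get(layer);
    if (!pats) continue;
    lines.push(`  subgraph ${layer}`);
    for (const p of pats) {
      // Context is not conveyed by the subgraph, so it goes in the label.
      lines.push(`    ${p.name}["${p.name}<br/>(${p.context})"]`);
    }
    lines.push("  end");
  }
  return lines.join("\n");
}
```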

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Each relationship type has a distinct arrow style | Each relationship type (uses, depends-on, implements, extends) must render with a unique, visually distinguishable arrow style. | Identical arrow styles would make relationship semantics indistinguishable in generated diagrams. |
| Pattern names are sanitized for Mermaid node IDs | Pattern names must be transformed into valid Mermaid node IDs by replacing special characters (dots, hyphens, spaces) with underscores. | Unsanitized names containing dots, hyphens, or spaces produce invalid Mermaid syntax that fails to render. |
| All relationship types appear in single graph | The generated Mermaid graph must combine all relationship types (uses, depends-on, implements, extends) into a single top-down graph. | Splitting relationship types into separate graphs would fragment the dependency picture and hide cross-type interactions. |
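Node-ID sanitization and per-type arrow styles can be sketched like this; the specific arrow strings are illustrative, only their distinctness matters:

```typescript
// Dots, hyphens, and whitespace are invalid in Mermaid node IDs.
function toMermaidNodeId(patternName: string): string {
  return patternName.replace(/[.\-\s]/g, "_");
}

// One distinct arrow style per relationship type (styles are assumptions;
// the rule only requires that they be visually distinguishable).
const ARROW_STYLES: Record<string, string> = {
  uses: "-->",
  "depends-on": "-.->",
  implements: "==>",
  extends: "--o",
};
```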

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Orchestrator delegates pipeline to factory | `generateDocumentation()` calls `buildMasterDataset()` for the scan-extract-merge-transform sequence. It does not import from `scanner/` or `extractor/` for pipeline orchestration. Direct imports are permitted only for types used in GenerateResult (e.g., ExtractedPattern). | The orchestrator is the original host of the inline pipeline. After this migration, the pipeline factory is the sole definition of the 8-step sequence. Any future changes to pipeline steps (adding caching, parallel scanning, incremental extraction) happen in one place and all consumers benefit. |
| mergePatterns lives in pipeline module | The `mergePatterns()` function lives in `src/generators/pipeline/merge-patterns.ts` as a pipeline step. It is not defined in consumer code (orchestrator or CLI files). | `mergePatterns` is step 5 of the 8-step pipeline. It was defined in `orchestrator.ts` for historical reasons (the orchestrator was the first pipeline host). Now that the pipeline factory exists, the function belongs alongside other pipeline steps (scan, extract, transform). The public API re-export in `generators/index.ts` preserves backward compatibility. |
| Factory provides structured warnings for all consumers | `PipelineResult.warnings` contains typed warning objects with type, message, optional count, and optional details (file, line, column, message). Consumers that need granular diagnostics (orchestrator) use the full structure. Consumers that need simple messages (process-api) read `.message` only. | The orchestrator collects scan errors, skipped directives, extraction errors, and Gherkin parse errors as structured GenerationWarning objects. The factory must provide equivalent structure to eliminate the orchestrator’s need to run the pipeline directly. The PipelineWarning type is structurally similar to GenerationWarning to minimize mapping complexity. |
| Pipeline factory supports partial success mode | When `failOnScanErrors` is false, the factory captures scan errors and extraction errors as warnings and continues with successfully processed files. When true (default), the factory returns `Result.err` on the first scan failure. | The orchestrator treats scan errors as non-fatal warnings — documentation generation should succeed for all scannable files even if some files have syntax errors. The process-api treats scan errors as fatal because the query layer requires a complete dataset. The factory must support both strategies via configuration. |
| End-to-end verification confirms behavioral equivalence | After migration, all CLI commands and doc generation produce identical output to pre-refactor behavior. | The migration must not change observable behavior for any consumer. Full verification confirms the factory migration is a pure refactor. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| Document structure includes progress tracking and category navigation | Every decoded document must contain a title, purpose, Progress section with status counts, and category navigation regardless of dataset size. | `PATTERNS.md` is the primary entry point for understanding project scope; incomplete structure would leave consumers without context. |
| Pattern table presents all patterns sorted by status then name | The pattern table must include every pattern in the dataset with columns for Pattern, Category, Status, and Description, sorted by status priority (completed first) then alphabetically by name. | Consistent ordering allows quick scanning of project progress; completed patterns at top confirm done work, while roadmap items at bottom show remaining scope. |
| Category sections group patterns by domain | Each category in the dataset must produce an H3 section listing its patterns, and the `filterCategories` option must restrict output to only the specified categories. | Without category grouping, consumers must scan the entire flat pattern list to find domain-relevant patterns; filtering avoids noise in focused documentation. |
| Dependency graph visualizes pattern relationships | A Mermaid dependency graph must be included when pattern relationships exist and the `includeDependencyGraph` option is not disabled; it must be omitted when no relationships exist or when explicitly disabled. | Dependency relationships are invisible in flat pattern lists; the graph reveals implementation ordering and coupling that affects planning decisions. |
| Detail file generation creates per-pattern pages | When `generateDetailFiles` is enabled, each pattern must produce an individual markdown file at `patterns/{slug}.md` containing an Overview section; when disabled, no additional files are generated. | Detail files enable deep-linking into specific patterns from the main registry while keeping the index document scannable. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| PlanningChecklistCodec prepares for implementation sessions | The checklist must include pre-planning questions, definition of done with deliverables, and dependency status for all actionable phases. | Implementation sessions fail without upfront preparation — the checklist surfaces blockers before work begins. |
| SessionPlanCodec generates implementation plans | The plan must include status summary, implementation approach from use cases, deliverables with status, and acceptance criteria from scenarios. | A structured implementation plan ensures all deliverables and acceptance criteria are visible before coding starts. |
| SessionFindingsCodec captures retrospective discoveries | Findings must be categorized into gaps, improvements, risks, and learnings with per-type counts in the summary. | Retrospective findings drive continuous improvement — categorization enables prioritized follow-up across sessions. |

| Rule | Invariant | Rationale |
| --- | --- | --- |
| POC decision document is parsed correctly | The real POC decision document (Process Guard) must be parseable by the codec, extracting all source mappings with their extraction types. | Integration testing against the actual POC document validates that the codec works with real-world content, not just synthetic test data. |
| Self-references extract content from POC decision | THIS DECISION self-references in the POC document must successfully extract Context rules, Decision rules, and DocStrings from the document itself. | Self-references are the most common extraction type in decision docs — they must work correctly for the POC to demonstrate the end-to-end pipeline. |
| TypeScript shapes are extracted from real files | The source mapper must successfully extract type shapes and patterns from real TypeScript source files referenced in the POC document. | TypeScript extraction is the primary mechanism for pulling implementation details into decision docs — it must work with actual project files. |
| Behavior spec content is extracted correctly | The source mapper must successfully extract Rule blocks and ScenarioOutline Examples from real Gherkin feature files referenced in the POC document. | Behavior spec extraction bridges decision documents to executable specifications — incorrect extraction would misrepresent the verified behavior. |
| JSDoc sections are extracted from CLI files | The source mapper must successfully extract JSDoc comment sections from real TypeScript CLI files referenced in the POC document. | CLI documentation often lives in JSDoc comments — extracting them into decision docs avoids duplicating CLI usage information manually. |
| All source mappings execute successfully | All source mappings defined in the POC decision document must execute without errors, producing non-empty extraction results. | End-to-end execution validates that all extraction types work with real files — a single failing mapping would produce incomplete decision documentation. |
| Compact output generates correctly | The compact output for the POC document must generate successfully and contain all essential sections defined by the compact format. | Compact output is the AI-facing artifact — verifying it against the real POC ensures the format serves its purpose of providing concise decision context. |
| Detailed output generates correctly | The detailed output for the POC document must generate successfully and contain all sections including full content from source mappings. | Detailed output is the human-facing artifact — verifying it against the real POC ensures no content is lost in the generation pipeline. |
| Generated output matches quality expectations | The generated output structure must match the expected target format, with complete validation rules and properly structured sections. | Quality assertions catch regressions in output formatting — structural drift in generated documents would degrade their usefulness as references. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| PrChangesCodec generates review checklist when includeReviewChecklist is enabled | When includeReviewChecklist is enabled, the codec must generate a “Review Checklist” section with standard items and context-sensitive items based on pattern state (completed, active, dependencies, deliverables). When disabled, no checklist appears. | A context-sensitive checklist prevents reviewers from missing state-specific concerns (e.g., verifying completed patterns still work, or that dependencies are satisfied) that a static checklist would not cover. |
| PrChangesCodec generates dependencies section when includeDependencies is enabled | When includeDependencies is enabled and patterns have dependency relationships, the codec must render a “Dependencies” section with “Depends On” and “Enables” subsections. When no dependencies exist or the option is disabled, the section is omitted. | Dependency visibility in PR reviews prevents merging changes that break upstream or downstream patterns, which would otherwise only surface during integration. |
| PrChangesCodec filters patterns by changedFiles | When the changedFiles filter is set, only patterns whose source files match (including partial directory path matches) are included in the output. | Filtering by changed files scopes the PR document to only the patterns actually touched, preventing reviewers from wading through unrelated patterns. |
| PrChangesCodec filters patterns by releaseFilter | When releaseFilter is set, only patterns with deliverables matching the specified release version are included. | Release filtering isolates the patterns scheduled for a specific version, enabling targeted release reviews without noise from other versions’ deliverables. |
| PrChangesCodec uses OR logic for combined filters | When both changedFiles and releaseFilter are set, patterns matching either criterion are included (OR logic), and patterns matching both criteria appear only once (no duplicates). | OR logic maximizes PR coverage — a change may affect files not yet assigned to a release, or a release may include patterns from unchanged files. |
| PrChangesCodec only includes active and completed patterns | The codec must exclude roadmap and deferred patterns, including only active and completed patterns in the PR changes output. | PR changes reflect work that is in progress or done — roadmap and deferred patterns have no code changes to review. |
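The status and filter rules above can be sketched as a single pure predicate. This is a minimal sketch: `PatternLike` and its fields (`id`, `status`, `sourceFile`, `releases`) are illustrative assumptions, not the actual PrChangesCodec schema.

```typescript
// Hypothetical pattern shape for illustration; field names are assumptions.
interface PatternLike {
  id: string;
  status: "active" | "completed" | "roadmap" | "deferred";
  sourceFile: string;
  releases: string[];
}

function selectPrPatterns(
  patterns: PatternLike[],
  changedFiles?: string[],
  releaseFilter?: string,
): PatternLike[] {
  return patterns.filter((p) => {
    // Roadmap and deferred patterns never appear in PR output.
    if (p.status !== "active" && p.status !== "completed") return false;
    // No filters set: include everything reviewable.
    if (!changedFiles && !releaseFilter) return true;
    // OR logic: a match on either criterion is sufficient; filter()
    // visits each pattern once, so duplicates cannot occur.
    const byFile = changedFiles?.some((f) => p.sourceFile.includes(f)) ?? false;
    const byRelease = releaseFilter ? p.releases.includes(releaseFilter) : false;
    return byFile || byRelease;
  });
}
```

Because filtering is a single pass over the pattern list, the "no duplicates" invariant for combined filters falls out for free.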
| Rule | Invariant | Rationale |
| --- | --- | --- |
| PrChangesCodec handles empty results gracefully | When no patterns match the applied filters, the codec must produce a valid document with a “No Changes” section describing which filters were active. | Reviewers need to distinguish “nothing matched” from “codec error” and understand why no patterns appear. |
| PrChangesCodec generates summary with filter information | Every PR changes document must contain a Summary section with pattern counts and active filter information. | Without a summary, reviewers must scan the entire document to understand the scope and filtering context of the PR changes. |
| PrChangesCodec groups changes by phase when sortBy is “phase” | When sortBy is “phase” (the default), patterns must be grouped under phase headings in ascending phase order. | Phase grouping aligns PR changes with the delivery roadmap, letting reviewers verify that changes belong to the expected implementation phase. |
| PrChangesCodec groups changes by priority when sortBy is “priority” | When sortBy is “priority”, patterns must be grouped under High/Medium/Low priority headings with correct pattern assignment. | Priority grouping lets reviewers focus on high-impact changes first, ensuring critical patterns receive the most review attention. |
| PrChangesCodec shows flat list when sortBy is “workflow” | When sortBy is “workflow”, patterns must be rendered as a flat list without phase or priority grouping. | Workflow sorting presents patterns in review order without structural grouping, suited for quick PR reviews. |
| PrChangesCodec renders pattern details with metadata and description | Each pattern entry must include a metadata table (status, phase, business value when available) and description text. | Metadata and description provide the context reviewers need to evaluate whether a pattern’s implementation aligns with its stated purpose and delivery status. |
| PrChangesCodec renders deliverables when includeDeliverables is enabled | Deliverables are only rendered when includeDeliverables is enabled, and when releaseFilter is set, only deliverables matching that release are shown. | Deliverables add bulk to the PR document; gating them behind a flag keeps default output concise, while release filtering prevents reviewers from seeing unrelated work items. |
| PrChangesCodec renders acceptance criteria from scenarios | When patterns have associated scenarios, the codec must render an “Acceptance Criteria” section containing scenario names and step lists. | Acceptance criteria give reviewers a concrete checklist to verify that the PR’s implementation satisfies the behavioral requirements defined in the spec. |
| PrChangesCodec renders business rules from Gherkin Rule keyword | When patterns have Gherkin Rule blocks, the codec must render a “Business Rules” section containing rule names and verification information. | Business rules surface domain invariants directly in the PR review, ensuring reviewers can verify that implementation changes respect the documented constraints. |
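The three sortBy modes described above can be sketched as a grouping step that precedes rendering. This is an assumption-laden sketch: `PatternEntry`, the heading strings, and the choice to drop empty priority buckets are illustrative, not the real codec internals.

```typescript
// Illustrative entry shape; fields are assumptions for this sketch.
interface PatternEntry {
  name: string;
  phase: number;
  priority: "High" | "Medium" | "Low";
}

type SortBy = "phase" | "priority" | "workflow";

// Returns heading -> patterns; "workflow" yields one unlabeled group,
// which the renderer would emit as a flat list with no headings.
function groupPatterns(
  entries: PatternEntry[],
  sortBy: SortBy = "phase",
): Map<string, PatternEntry[]> {
  const groups = new Map<string, PatternEntry[]>();
  if (sortBy === "workflow") {
    groups.set("", [...entries]);
    return groups;
  }
  const keyOf = (e: PatternEntry) =>
    sortBy === "phase" ? `Phase ${e.phase}` : e.priority;
  // Seed keys in the required order: ascending phase numbers, or the
  // fixed High/Medium/Low sequence.
  const order =
    sortBy === "phase"
      ? [...new Set(entries.map((e) => e.phase))]
          .sort((a, b) => a - b)
          .map((n) => `Phase ${n}`)
      : ["High", "Medium", "Low"];
  for (const key of order) groups.set(key, []);
  for (const e of entries) groups.get(keyOf(e))!.push(e);
  // Drop empty priority buckets so headings only appear when populated.
  for (const [k, v] of groups) if (v.length === 0) groups.delete(k);
  return groups;
}
```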
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Release version filtering controls which phases appear in output | Only phases with deliverables matching the releaseFilter are included; roadmap phases are always excluded. | Including unrelated releases or unstarted roadmap items in a PR description misleads reviewers about the scope of actual changes. |
| Patterns are grouped by phase number in the output | Each phase number produces a separate heading section in the generated output. | Without phase grouping, reviewers cannot distinguish which changes belong to which delivery phase, making incremental review impossible. |
| Summary statistics provide a high-level overview of the PR | The Summary section always shows pattern counts, plus the release tag when a releaseFilter is active. | Without a summary, reviewers must read the entire document to understand the PR’s scope; the release tag anchors the summary to a specific version. |
| Deliverables are displayed inline with their parent patterns | When includeDeliverables is enabled, each pattern lists its deliverables with name, status, and release tag. | Hiding deliverables forces reviewers to cross-reference feature files to verify completion; inline display makes review self-contained. |
| Review checklist includes standard code quality verification items | The review checklist always includes code conventions, tests, documentation, and completed pattern verification items. | Omitting the checklist means quality gates depend on reviewer memory; a consistent checklist ensures no standard verification step is skipped. |
| Dependencies section shows inter-pattern relationships | The Dependencies section surfaces both what patterns enable and what they depend on. | Hidden dependencies cause merge-order mistakes and broken builds; surfacing them in the PR lets reviewers verify prerequisite work is complete. |
| Business value can be included or excluded from pattern metadata | Business value display is controlled by the includeBusinessValue option. | Not all consumers need business value context; making it opt-in keeps the default output concise for technical reviewers. |
| Output can be sorted by phase number or priority | Sorting is deterministic and respects the configured sortBy option. | Non-deterministic ordering produces diff noise between regenerations, making it impossible to tell if content actually changed. |
| Edge cases produce graceful output | The generator handles missing phases, missing deliverables, and missing phase numbers without errors. | Crashing on incomplete data prevents PR generation entirely; graceful degradation ensures output is always available even with partial inputs. |
| Deliverable-level filtering shows only matching deliverables within a phase | When a phase contains deliverables with different release tags, only those matching the releaseFilter are shown. | Showing all deliverables regardless of release tag pollutes the PR with unrelated work, obscuring what actually shipped in the target release. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Orchestrator supports PR changes generation options | PR changes output includes only patterns matching the changed files list, the release version filter, or both (OR logic when combined). | PR-scoped documentation must reflect exactly what changed, avoiding noise from unrelated patterns. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| PRD generator discovers implementations from relationship index | When generating a PRD for pattern X, the generator queries the relationship index for all files where implements === X. No explicit listing in the spec file is required. | The @libar-docs-implements tag creates a backward link from code to spec. The relationship index aggregates these. PRD generation simply queries the index rather than scanning directories. |
| Implementation metadata appears in dedicated PRD section | The PRD output includes a “## Implementations” section listing all files that implement the pattern. Each file shows its uses, usedBy, and usecase metadata in a consistent format. | Developers reading PRDs benefit from seeing the implementation landscape alongside requirements, without cross-referencing code files. |
| Patterns without implementations render cleanly | If no files have @libar-docs-implements:X for pattern X, the “## Implementations” section is omitted (not rendered as empty). | Planned patterns may not have implementations yet. Empty sections add noise without value. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Implementation files appear in pattern docs via @libar-docs-implements | Any TypeScript file with a matching @libar-docs-implements tag must appear in the pattern document’s Implementations section with a working file link. | Implementation discovery relies on tag-based linking — missing entries break traceability between specs and code. |
| Multiple implementations are listed alphabetically | When multiple files implement the same pattern, they must be listed in ascending file path order. | Deterministic ordering ensures stable document output across regeneration runs. |
| Patterns without implementations omit the section | The Implementations heading must not appear in pattern documents when no implementing files exist. | Rendering an empty Implementations section misleads readers into thinking implementations were expected but are missing, rather than simply not applicable. |
| Implementation references use relative file links | Implementation file links must be relative paths starting from the patterns output directory. | Absolute paths break when documentation is viewed from different locations; relative paths ensure portability. |
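Taken together, the rules above (omit when empty, sort ascending, link relatively) fit in a few lines. A minimal sketch, assuming a `renderImplementations` helper that does not exist under that name in the codebase:

```typescript
import { relative } from "node:path";

// Sketch of the Implementations-section rules; function and parameter
// names are illustrative, not the actual generator API.
function renderImplementations(
  implFiles: string[],
  patternsOutputDir: string,
): string[] {
  // Omit the section entirely when no implementing files exist.
  if (implFiles.length === 0) return [];
  const lines = ["## Implementations", ""];
  // Deterministic ascending file-path order across regeneration runs.
  for (const file of [...implFiles].sort()) {
    // Links are relative to the patterns output directory for portability.
    const rel = relative(patternsOutputDir, file);
    lines.push(`- [${file}](${rel})`);
  }
  return lines;
}
```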
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Empty datasets produce fallback content | A codec must always produce a valid document, even when no matching content exists in the dataset. | Consumers rely on a consistent document structure; a missing or null document would cause rendering failures downstream. |
| Convention content is rendered as sections | Convention-tagged patterns must render as distinct headed sections with their rule names, invariants, and tables preserved. | Conventions define project-wide constraints; losing their structure in generated docs would make them unenforceable and undiscoverable. |
| Detail level controls output density | Each detail level (summary, standard, detailed) must produce a deterministic subset of content, with summary being the most restrictive. | AI session contexts have strict token budgets; uncontrolled output density wastes context window and degrades session quality. |
| Behavior sections are rendered from category-matching patterns | Only patterns whose category matches the configured behavior tags may appear in the Behavior Specifications section. | Mixing unrelated categories into a single behavior section would produce misleading documentation that conflates distinct concerns. |
| Shape sources are extracted from matching patterns | Only shapes from patterns whose file path matches the configured shapeSources glob may appear in the API Types section. | Including shapes from unrelated source paths would pollute the API Types section with irrelevant type definitions, breaking the scoped documentation contract. |
| Convention and behavior content compose in a single document | Convention and behavior content must coexist in the same RenderableDocument when both are present in the dataset. | Splitting conventions and behaviors into separate documents would force consumers to cross-reference multiple files, losing the unified view of a product area. |
| Composition order follows AD-5: conventions then shapes then behaviors | Document sections must follow the canonical order: conventions, then API types (shapes), then behavior specifications. | AD-5 establishes a consistent reading flow (rules, then types, then specs); violating this order would confuse readers who expect a stable document structure. |
| Convention code examples render as mermaid blocks | Mermaid diagram content in conventions must render as fenced mermaid blocks, and must be excluded at summary detail level. | Mermaid diagrams are visual aids that require rendering support; emitting them as plain text would produce unreadable output, and including them in summaries wastes token budget. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Standard detail level includes narrative but omits rationale | Standard detail level renders narrative prose for convention patterns but excludes rationale sections, reserving rationale for the detailed level only. | Progressive disclosure prevents information overload at the standard level while ensuring readers who need deeper justification can access it at the detailed level. |
| Deep behavior rendering with structured annotations | Behavior patterns render structured rule annotations (invariant, rationale, verified-by) at detailed level, invariant-only at standard level, and a truncated table at summary level. | Structured annotations are the primary mechanism for surfacing business rules from Gherkin sources; inconsistent rendering across detail levels would produce misleading or incomplete documentation. |
| Shape JSDoc prose renders at standard and detailed levels | Shape patterns with JSDoc prose include that prose in rendered code blocks at standard and detailed levels. Shapes without JSDoc render code blocks only. | JSDoc prose provides essential context for API types; omitting it would force readers to open source files to understand a shape’s purpose, undermining the generated documentation’s self-sufficiency. |
| Shape sections render param, returns, and throws documentation | Function shapes render parameter, returns, and throws documentation at detailed level. Standard level renders parameter tables but omits throws. Shapes without param docs skip the parameter table entirely. | Throws documentation is diagnostic detail that clutters standard output; reserving it for the detailed level keeps standard output focused on the function’s contract while preserving full error documentation for consumers who need it. |
| Collapsible blocks wrap behavior rules for progressive disclosure | When a behavior pattern has 3 or more rules and detail level is not summary, each rule’s content is wrapped in a collapsible block with the rule name and scenario count in the summary. Patterns with fewer than 3 rules render rules flat. Summary level never produces collapsible blocks. | Behavior sections with many rules produce substantial content at detailed level. Collapsible blocks enable progressive disclosure so readers can expand only the rules they need. |
| Link-out blocks provide source file cross-references | At standard and detailed levels, each behavior pattern includes a link-out block referencing its source file path. At summary level, link-out blocks are omitted for compact output. | Cross-reference links enable readers to navigate from generated documentation to the annotated source files, closing the loop between generated docs and the single source of truth. |
| Include tags route cross-cutting content into reference documents | Patterns with matching include tags appear alongside category-selected patterns in the behavior section. The merging is additive (OR semantics). | Cross-cutting patterns (e.g., shared utilities, common validators) belong in multiple reference documents; without include-tag routing, these patterns would only appear in their home category, leaving dependent documents incomplete. |
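The collapsible-wrapping decision above reduces to two conditions. A minimal sketch, where the `Block` union and `RuleContent` shape are illustrative stand-ins for the RenderableDocument block types rather than the actual definitions:

```typescript
// Illustrative stand-ins for the renderer's block types.
type Block =
  | { kind: "paragraph"; text: string }
  | { kind: "collapsible"; summary: string; body: string };

interface RuleContent {
  name: string;
  scenarioCount: number;
  body: string;
}

function renderRules(
  rules: RuleContent[],
  detailLevel: "summary" | "standard" | "detailed",
): Block[] {
  // Summary level never produces collapsible blocks; patterns with
  // fewer than 3 rules render their rules flat.
  const collapse = detailLevel !== "summary" && rules.length >= 3;
  return rules.map((r) =>
    collapse
      ? {
          kind: "collapsible",
          summary: `${r.name} (${r.scenarioCount} scenarios)`,
          body: r.body,
        }
      : { kind: "paragraph", text: r.body },
  );
}
```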
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Scoped diagrams are generated from diagramScope config | Diagram content is determined exclusively by diagramScope filters (archContext, include, archLayer, patterns), and filters compose via OR — a pattern matching any single filter appears in the diagram. | Without filter-driven scoping, diagrams would include all patterns regardless of relevance, producing unreadable visualizations that obscure architectural boundaries. |
| Multiple diagram scopes produce multiple mermaid blocks | Each entry in the diagramScopes array produces an independent Mermaid block with its own title and direction, and the legacy singular diagramScope remains supported as a fallback. | Product areas require multiple architectural views (e.g., system overview and data flow) from a single configuration, and breaking backward compatibility with the singular diagramScope would silently remove diagrams from existing consumers. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Diagram type controls Mermaid output format | The diagramType field on DiagramScope selects the Mermaid output format. Supported types are graph (flowchart, default), sequenceDiagram, and stateDiagram-v2. Each type produces syntactically valid Mermaid output with type-appropriate node and edge rendering. | Flowcharts cannot naturally express event flows (sequence), FSM visualization (state), or temporal ordering. Multiple diagram types unlock richer architectural documentation from the same relationship data. |
| Edge labels and custom node shapes enrich diagram readability | Relationship edges display labels describing the relationship type (uses, depends on, implements, extends). Edge labels are enabled by default and can be disabled via showEdgeLabels: false. Node shapes in flowchart diagrams vary by archRole value using Mermaid shape syntax. | Unlabeled edges are ambiguous without consulting a legend. Custom node shapes make archRole visually distinguishable without color reliance, improving accessibility and scannability. |
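The node-shape and edge-label rules can be sketched as two small string builders. The role names and their shape assignments here are assumptions for illustration; only the Mermaid flowchart syntax itself is standard:

```typescript
// Sketch mapping an archRole to Mermaid flowchart node syntax.
// The role->shape mapping is an assumption, not the actual builder.
function mermaidNode(id: string, label: string, archRole?: string): string {
  switch (archRole) {
    case "service":
      return `${id}("${label}")`; // rounded rectangle
    case "projection":
      return `${id}[("${label}")]`; // cylinder
    case "saga":
      return `${id}{{"${label}"}}`; // hexagon
    default:
      return `${id}["${label}"]`; // plain rectangle fallback
  }
}

function mermaidEdge(
  from: string,
  to: string,
  rel: string,
  showEdgeLabels = true,
): string {
  // Edge labels are on by default; showEdgeLabels: false yields bare arrows.
  return showEdgeLabels ? `${from} -->|${rel}| ${to}` : `${from} --> ${to}`;
}
```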
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Deep behavior rendering replaces shallow truncation | At standard and detailed levels, behavior sections render full rule descriptions with parsed invariant, rationale, and verified-by content. At summary level, the 120-character truncation is preserved for compact output. Behavior rendering reuses parseBusinessRuleAnnotations from the convention extractor rather than reimplementing structured content parsing. | The current 120-character truncation discards invariants, rationale, and verified-by content that is already extracted and available in BusinessRule.description. Reference documents need the full rule content to serve as authoritative documentation. The convention extractor already parses this structured content via parseBusinessRuleAnnotations — the behavior builder should delegate to the same function. |
| Shape sections include JSDoc prose and property documentation | At standard level, shape code blocks are preceded by JSDoc prose when available. At detailed level, interface shapes additionally render a property documentation table. At summary level, only the type-kind table appears. Shapes without JSDoc render code blocks without a preceding paragraph. | JSDoc on shapes contains design rationale and usage guidance that is already extracted by the shape extractor. Gating it behind detailed level wastes the data at the most common detail level (standard). The fix is a single condition change in reference.ts line 342: from detailLevel === 'detailed' to detailLevel !== 'summary'. |
| Diagram scope supports archLayer filtering and multiple diagram types | DiagramScope gains optional archLayer and diagramType fields. The archLayer filter selects patterns by their architectural layer (domain, application, infrastructure) and composes with archContext and archView via OR logic, consistent with existing filter dimensions. The diagramType field controls Mermaid output format: graph (default), sequenceDiagram, stateDiagram-v2, C4Context, classDiagram. Each diagram type has its own node and edge syntax appropriate to the Mermaid specification. | Layer-based views are fundamental to layered architecture documentation — a developer reviewing the domain layer wants only deciders and value objects, not infrastructure adapters. Multiple diagram types unlock event flow documentation (sequence), FSM visualization (state), architecture overview (C4), and type hierarchy views (class) that flowcharts cannot express naturally. |
| Every renderable block type appears in the showcase document | The generated REFERENCE-SAMPLE.md at detailed level must contain at least one instance of each of the 9 block types: heading, paragraph, separator, table, list, code, mermaid, collapsible, link-out. At summary level, progressive disclosure blocks (collapsible, link-out) are omitted for compact output. | The sample document is the integration test for the reference codec. If any block type is missing, there is no automated verification that the codec can produce it. Coverage of all 9 types validates the full rendering pipeline from MasterDataset through codec through renderer. |
| Edge labels and custom node shapes enrich diagram readability | Relationship edges in scoped diagrams display labels describing the relationship semantics (uses, dependsOn, implements, extends). Edge labels are enabled by default and can be disabled via a showEdgeLabels option for compact diagrams. Node shapes vary by archRole value — services use rounded rectangles, bounded contexts use subgraphs, projections use cylinders, and sagas use hexagons. | Unlabeled edges are ambiguous — a reader seeing a solid arrow cannot distinguish “uses” from “implements” without consulting an edge style legend. Custom node shapes leverage Mermaid’s shape vocabulary to make archRole visually distinguishable without color reliance, improving accessibility. |
| Extraction pipeline surfaces complete API documentation | ExportInfo.signature shows full function parameter types and return type instead of the placeholder value. JSDoc param, returns, and throws tags are extracted and stored on ExtractedShape. Property-level JSDoc preserves full multi-line content without first-line truncation. Auto-shape discovery mode extracts all exported types from files matching shapeSources globs without requiring explicit extract-shapes annotations. | Function signatures are the most valuable API surface — they show what a pattern exports without source navigation. The ExportInfo.signature field already exists in the schema but holds a lossy placeholder. The fix is approximately 15 lines in ast-parser.ts: threading sourceCode into extractFromDeclaration and slicing parameter ranges. Auto-shape discovery eliminates the manual annotation burden for files that match shapeSources globs. |
| Infrastructure enables flexible document composition and AI-optimized output | CompositeCodec assembles reference documents from multiple codec outputs by concatenating RenderableDocument sections. The renderToClaudeContext renderer produces token-efficient output using section markers optimized for LLM consumption. The Gherkin tag extractor uses TagRegistry metadata instead of hardcoded if/else branches, making new tag addition a zero-code-change operation. Convention content can be extracted from TypeScript JSDoc blocks containing structured Invariant/Rationale annotations, not only from Gherkin Rule blocks. | CompositeCodec enables referenceDocConfigs to include content from any codec, not just the current 4 sources. The renderToClaudeContext renderer unifies two formatting paths (codec-based markdown vs hand-written markers in context-formatter.ts). Data-driven tag extraction cuts the maintenance burden of the 40-branch if/else in gherkin-ast-parser.ts roughly in half. TypeScript convention extraction enables self-documenting business rules in implementation files alongside their code. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Registration produces the correct number of generators | Each reference config produces exactly 2 generators (detailed + summary), plus meta-generators for product-area and non-product-area routing. | The count is deterministic from config — any mismatch indicates a registration bug that would silently drop generated documents. |
| Product area configs produce a separate meta-generator | Configs with productArea set route to the “product-area-docs” meta-generator; configs without it route to “reference-docs”. | Product area docs are rendered into per-area subdirectories while standalone references go to the root output. |
| Generator naming follows kebab-case convention | Detailed generators end in “-reference” and summary generators end in “-reference-claude”. | Consistent naming enables programmatic discovery and distinguishes human-readable from AI-optimized outputs. |
| Generator execution produces markdown output | Every registered generator must produce at least one non-empty output file when given matching data. | A generator that produces empty output wastes a pipeline slot and creates confusion when expected docs are missing. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Priority-based sorting surfaces critical work first | Phases with higher priority always appear before lower-priority phases when sorting by priority. | Without priority sorting, critical work gets buried under low-priority items, delaying urgent deliverables. |
| Effort parsing converts duration strings to comparable hours | Effort strings must be parsed to a common unit (hours) for accurate sorting across different time scales. | Comparing raw strings like “2h” and “3d” lexicographically produces incorrect ordering; normalization to hours ensures consistent comparison. |
| Quarter grouping organizes planned work into time-based buckets | Phases with a quarter tag are grouped under their quarter heading; phases without a quarter appear under Unscheduled. | Flat lists obscure time-based planning; grouping by quarter lets planners see what is committed per period and what remains unscheduled. |
| Priority grouping organizes phases by urgency level | Phases are grouped under their priority heading; phases without priority appear under Unprioritized. | Mixing priority levels in a flat list forces readers to visually scan for urgency; grouping by priority makes triage immediate. |
| Progressive disclosure prevents information overload in large backlogs | When the backlog exceeds maxNextActionable, only the top N phases are shown with a link or count for the remainder. | Displaying hundreds of phases in the summary overwhelms planners; progressive disclosure keeps the summary scannable while preserving access to the full backlog. |
| Edge cases are handled gracefully | Empty or fully-blocked backlogs produce meaningful output instead of errors or blank sections. | Blank or errored output when the backlog is empty confuses users into thinking the generator is broken rather than reflecting a genuinely empty state. |
| Default behavior preserves backward compatibility | Without explicit sortBy or groupPlannedBy options, phases are sorted by phase number in a flat list. | Changing default behavior would break existing consumers that rely on phase-number ordering without specifying options. |
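The effort-normalization rule can be sketched in a few lines. The unit table (8-hour days, 40-hour weeks) and the `parseEffortHours` name are assumptions about the backlog generator, not its actual implementation:

```typescript
// Assumed unit table: 8-hour days, 40-hour weeks.
const HOURS_PER_UNIT: Record<string, number> = { h: 1, d: 8, w: 40 };

function parseEffortHours(effort: string): number {
  const m = /^(\d+(?:\.\d+)?)([hdw])$/.exec(effort.trim());
  // Unparseable effort strings sort last rather than crashing the generator.
  if (!m) return Number.MAX_SAFE_INTEGER;
  return Number(m[1]) * HOURS_PER_UNIT[m[2]];
}
```

A comparator like `(a, b) => parseEffortHours(a) - parseEffortHours(b)` then sorts "10h" after "2h", which a lexicographic string comparison would get backwards.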
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Summary totals equal sum of phase table rows | The summary Active and Total Remaining counts must exactly equal the sum of the corresponding counts across all phase table rows. | A mismatch between summary and phase-level totals indicates patterns are being double-counted or dropped. |
| Patterns without phases appear in Backlog row | Patterns that have no assigned phase must be grouped into a “Backlog” row in the phase table rather than being omitted. | Unphased patterns are still remaining work; omitting them would undercount the total. |
| Patterns without patternName are counted using id | Pattern counting must use pattern.id as the identifier, never patternName, so that patterns with undefined names are neither double-counted nor omitted. | patternName is optional; relying on it for counting would miss unnamed patterns entirely. |
| All phases with incomplete patterns are shown | The phase table must include every phase that contains at least one incomplete pattern, and phases with only completed patterns must be excluded. | Showing fully completed phases inflates the remaining work view, while omitting phases with incomplete patterns hides outstanding work. |
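The counting invariants above can be sketched together. `CountablePattern` and its fields (`id`, `phase`, `completed`) are illustrative assumptions, not the real pattern schema:

```typescript
// Illustrative pattern shape; field names are assumptions.
interface CountablePattern {
  id: string;
  phase?: number;
  completed: boolean;
}

function phaseTableCounts(patterns: CountablePattern[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const p of patterns) {
    // Only incomplete patterns count as remaining work, so phases with
    // only completed patterns never get a row.
    if (p.completed) continue;
    // Unphased patterns land in a Backlog row rather than being dropped;
    // counting keys off p.id (always present), never an optional name.
    const row = p.phase === undefined ? "Backlog" : `Phase ${p.phase}`;
    counts.set(row, (counts.get(row) ?? 0) + 1);
  }
  return counts;
}

// Invariant: the summary total equals the sum of all row counts.
function totalRemaining(counts: Map<string, number>): number {
  return [...counts.values()].reduce((a, b) => a + b, 0);
}
```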
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Document metadata renders as frontmatter before sections | The title always renders as H1; purpose and detail level render as bold key-value pairs separated by a horizontal rule. | Consistent frontmatter structure allows downstream tooling and readers to reliably locate the document title and metadata without parsing the full body. |
| Headings render at correct markdown levels with clamping | Heading levels are clamped to the valid range 1-6 regardless of input value. | Markdown only supports heading levels 1-6; unclamped values would produce invalid syntax that renders as plain text in all markdown processors. |
| Paragraphs and separators render as plain text and horizontal rules | Paragraph content passes through unmodified, including special markdown characters. Separators render as horizontal rules. | The renderer is a dumb printer; altering paragraph content would break codec-controlled formatting and violate the separation between codec logic and rendering. |
| Tables render with headers, alignment, and cell escaping | Tables must escape pipe characters, convert newlines to line breaks, and pad short rows to match the column count. | Unescaped pipes corrupt table column boundaries, raw newlines break row parsing, and short rows cause column misalignment in every markdown renderer. |
| Lists render in unordered, ordered, checkbox, and nested formats | List type determines the prefix: dash for unordered, numbers for ordered, checkbox syntax for checked items. Nesting adds two-space indentation per level. | Incorrect prefixes or indentation levels cause markdown parsers to break list continuity, rendering nested items as separate top-level lists or plain text. |
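The cell-escaping and row-padding rules can be sketched directly. `escapeCell` and `renderTableRow` are illustrative names, not the actual renderer functions:

```typescript
// Sketch of the cell-escaping rules: pipes and newlines would otherwise
// corrupt the table structure in any markdown renderer.
function escapeCell(cell: string): string {
  return cell
    .replace(/\|/g, "\\|") // unescaped pipes would split the cell
    .replace(/\n/g, "<br>"); // raw newlines would break the row
}

function renderTableRow(cells: string[], columnCount: number): string {
  // Pad short rows so every row matches the header's column count.
  const padded = [
    ...cells,
    ...Array(Math.max(0, columnCount - cells.length)).fill(""),
  ];
  return `| ${padded.map(escapeCell).join(" | ")} |`;
}
```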
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Code blocks and mermaid diagrams render with fenced syntax | Code blocks use triple backtick fencing with optional language hint. Mermaid blocks use mermaid as the language hint. | Inconsistent fencing breaks syntax highlighting in GitHub/IDE markdown previews and prevents Mermaid renderers from detecting diagram blocks. |
| Collapsible blocks render as HTML details elements | Summary text is HTML-escaped to prevent injection. Collapsible content renders between details tags. | Unescaped HTML in summary text enables XSS when generated markdown is rendered in browsers; malformed details tags break progressive disclosure in documentation. |
| Link-out blocks render as markdown links with URL encoding | Link paths with spaces are percent-encoded for valid URLs. | Unencoded spaces produce broken links in markdown renderers, making cross-document navigation fail silently for files with spaces in their paths. |
| Multi-file documents produce correct output file collections | Output file count equals 1 (main) plus additional file count. The first output file always uses the provided base path. | A mismatch between expected and actual file count causes the orchestrator to write orphaned files or miss outputs, corrupting the generated documentation directory. |
| Complex documents render all block types in sequence | Multiple block types in a single document render in order without interference. | Block ordering reflects the codec’s semantic structure; out-of-order or swallowed blocks would produce misleading documentation that diverges from the source of truth. |
| Claude context renderer produces compact AI-optimized output | Claude context replaces markdown syntax with section markers, omits visual-only blocks (mermaid, separators), flattens collapsible content, and produces shorter output than markdown. | LLM context windows are token-limited; visual-only blocks waste tokens without adding semantic value, and verbose markdown syntax inflates context size unnecessarily. |
| Claude MD module renderer produces modular-claude-md compatible output | Title renders as H3 (offset +2), section headings are offset by +2 clamped at H6, frontmatter is omitted, mermaid blocks are omitted, link-out blocks are omitted, and collapsible blocks are flattened to headings. | The modular-claude-md system manages CLAUDE.md as composable H3-rooted modules. Generating incompatible formats (like section markers) produces orphaned files that are never consumed. |
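The heading-offset rule for the Claude MD module renderer (shift by +2, clamp at H6) amounts to a one-line computation. A sketch, with illustrative names:

```typescript
// Hypothetical helpers: shift a heading level by +2 and clamp at H6,
// so an H1 title becomes H3 and deep headings never exceed "######".
function offsetHeading(level: number, offset = 2): number {
  return Math.min(6, level + offset);
}

function renderHeading(text: string, level: number): string {
  return `${"#".repeat(offsetHeading(level))} ${text}`;
}
```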
| Rule | Invariant | Rationale |
| --- | --- | --- |
| ChangelogCodec follows Keep a Changelog format | Releases must be sorted by semver descending, unreleased patterns grouped under “[Unreleased]”, and change types follow the standard order (Added, Changed, Deprecated, Removed, Fixed, Security). | Keep a Changelog is an industry standard format — following it ensures the output is immediately familiar to developers. |
| TraceabilityCodec maps timeline patterns to behavior tests | Coverage statistics must show total timeline phases, those with behavior tests, those missing, and a percentage. Gaps must be surfaced prominently. | Traceability ensures every planned pattern has executable verification — gaps represent unverified claims about system behavior. |
| OverviewCodec provides project architecture summary | The overview must include architecture sections from overview-tagged patterns, pattern summary with progress percentage, and timeline summary with phase counts. | The architecture overview is the primary entry point for understanding the project — it must provide a complete picture at a glance. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| RequirementsDocumentCodec generates PRD-style documentation from patterns | RequirementsDocumentCodec transforms MasterDataset patterns into a PRD-style document with flexible grouping (product area, user role, or phase), optional detail file generation, and business value rendering. | Flexible grouping lets stakeholders view requirements through their preferred lens (area, role, or phase), and detail files provide deep-dive context without bloating the summary document. |
| AdrDocumentCodec documents architecture decisions | AdrDocumentCodec transforms MasterDataset ADR patterns into an architecture decision record document with status tracking, category/phase/date grouping, supersession relationships, and optional detail file generation. | Architecture decisions lose value without status tracking and supersession chains; without them, teams act on outdated decisions and cannot trace why a previous approach was abandoned. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| DocString parsing handles edge cases | DocString parsing must gracefully handle empty input, missing language hints, unclosed delimiters, and non-LF line endings without throwing errors. | Codecs receive uncontrolled user content from feature file descriptions; unhandled edge cases would crash document generation for the entire pipeline. |
| DataTable rendering produces valid markdown | DataTable rendering must produce a well-formed table block for any number of rows, substituting empty strings for missing cell values. | Malformed tables break markdown rendering and downstream tooling; missing cells would produce undefined values that corrupt table alignment. |
| Scenario content rendering respects options | Scenario rendering must honor the includeSteps option, producing step lists only when enabled, and must include embedded DataTables when present. | Ignoring the includeSteps option would bloat summary views with unwanted detail, and dropping embedded DataTables would lose structured test data. |
| Business rule rendering handles descriptions | Business rule rendering must always include the rule name as a bold paragraph, and must parse descriptions for embedded DocStrings when present. | Omitting the rule name makes rendered output unnavigable, and skipping DocString parsing would output raw delimiter syntax instead of formatted code blocks. |
| DocString content is dedented when parsed | DocString code blocks must be dedented to remove common leading whitespace while preserving internal relative indentation, empty lines, and trimming trailing whitespace from each line. | Without dedentation, code blocks inherit the Gherkin indentation level, rendering as deeply indented and unreadable in generated markdown. |
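The dedent invariant can be satisfied with a small function like the one below — a sketch, not the project's implementation: strip the smallest common indent of non-empty lines, keep relative indentation and blank lines, and trim trailing whitespace per line.

```typescript
// Illustrative dedent: remove common leading whitespace while
// preserving relative indentation and empty lines, and trimming
// trailing whitespace from each line.
function dedent(text: string): string {
  const lines = text.split("\n").map((l) => l.replace(/\s+$/, ""));
  const indents = lines
    .filter((l) => l.length > 0)
    .map((l) => l.match(/^[ \t]*/)![0].length);
  const common = indents.length ? Math.min(...indents) : 0;
  return lines.map((l) => (l.length ? l.slice(common) : l)).join("\n");
}
```

For example, a DocString body indented four spaces inside a Gherkin scenario comes out flush left, with a nested line keeping its extra two spaces.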
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Validation runs before extraction in the pipeline | Validation must complete and pass before extraction begins. | Prevents wasted extraction work and provides clear fail-fast behavior. |
| Deduplication runs after extraction before assembly | Deduplication processes all extracted content before document assembly. | All sources must be extracted to identify cross-source duplicates. |
| Warnings from all stages are collected and reported | Warnings from all pipeline stages are aggregated in the result. | Users need visibility into non-fatal issues without blocking generation. |
| Pipeline provides actionable error messages | Error messages include context and fix suggestions. | Users should fix issues in one iteration without guessing. |
| Existing decision documents continue to work | Valid existing decision documents generate without new errors. | Robustness improvements must be backward compatible. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Basic arithmetic operations work correctly | Arithmetic operations must return mathematically correct results for all valid inputs. | Incorrect arithmetic results silently corrupt downstream calculations, making errors undetectable at their source. The calculator should perform standard math operations with correct results. |
| Division has special constraints | Division operations must reject a zero divisor before execution. | Unguarded division by zero causes runtime exceptions that crash the process instead of returning a recoverable error. Division by zero must be handled gracefully to prevent system errors. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Scope filtering selects patterns by context, view, or name | A pattern matches a DiagramScope if ANY of three conditions hold: its name is in scope.patterns, its archContext is in scope.archContext, or any of its archView entries is in scope.archView. These dimensions are OR’d together — a pattern need only match one. | Three filter dimensions cover different authoring workflows: explicit names for ad-hoc documents, archContext for bounded context views, archView for cross-cutting architectural perspectives. |
| Neighbor discovery finds connected patterns outside scope | Patterns connected to scope patterns via relationship edges (uses, dependsOn, implementsPatterns, extendsPattern) but NOT themselves in scope appear in a “Related” subgraph with dashed border styling. | Scoped views need context. Showing only in-scope patterns without their dependencies loses critical relationship information. Neighbor patterns provide this context without cluttering the main view. |
| Multiple diagram scopes compose in sequence | When diagramScopes is an array, each scope produces its own Mermaid diagram section with independent title, direction, and pattern selection. At summary detail level, all diagrams are suppressed. | A single reference document may need multiple architectural perspectives. Pipeline Overview shows both a codec transformation view (TB) and a pipeline data flow view (LR) in the same document. |
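The OR’d scope match described in the first rule can be sketched directly. The interfaces below are assumptions inferred from the field names in the rule (scope.patterns, archContext, archView), not the project's actual types:

```typescript
// Hypothetical shapes for the scope filter; a pattern matches if ANY
// of the three dimensions matches (the dimensions are OR'd together).
interface DiagramScope {
  patterns?: string[];
  archContext?: string[];
  archView?: string[];
}

interface Pattern {
  name: string;
  archContext?: string;
  archView?: string[];
}

function matchesScope(p: Pattern, scope: DiagramScope): boolean {
  return (
    (scope.patterns?.includes(p.name) ?? false) ||
    (p.archContext !== undefined &&
      (scope.archContext?.includes(p.archContext) ?? false)) ||
    (p.archView?.some((v) => scope.archView?.includes(v) ?? false) ?? false)
  );
}
```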
| Rule | Invariant | Rationale |
| --- | --- | --- |
| SessionContextCodec provides working context for AI sessions | Session context must include session status with active/completed/remaining counts, phase navigation for incomplete phases, and active work grouped by phase. | AI agents need a compact, navigable view of current project state to make informed implementation decisions. |
| RemainingWorkCodec aggregates incomplete work by phase | Remaining work must show status counts, phase-grouped navigation, priority classification (in-progress/ready/blocked), and next actionable items. | Remaining work visibility prevents scope blindness — knowing what’s left, what’s blocked, and what’s ready drives efficient session planning. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Exact paths match without wildcards | A pattern without glob characters must match only the exact file path, character for character. | Loose matching on non-glob patterns would silently include unintended files, causing incorrect shapes to appear in generated documentation. |
| Single-level globs match one directory level | A single `*` glob must match files only within the specified directory, never crossing directory boundaries. | Crossing directory boundaries would violate standard glob semantics and pull in shapes from nested modules that belong to different product areas. |
| Recursive globs match any depth | A `**` glob must match files at any nesting depth below the specified prefix, while still respecting extension and prefix constraints. | Recursive globs enable broad subtree selection for shape extraction; failing to respect prefix and extension constraints would leak unrelated shapes into the output. |
| Dataset shape extraction deduplicates by name | When multiple patterns match a source glob, the returned shapes must be deduplicated by name so each shape appears at most once. | Duplicate shape names in generated documentation confuse readers and inflate type registries. |
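The three matching rules (exact, single-level `*`, recursive `**`) can be captured by compiling a glob into a regular expression. This is a simplified sketch of the semantics, not the actual matcher — real glob handling has more edge cases:

```typescript
// Sketch of the glob semantics above: no wildcard means exact match,
// "*" stays within one directory level, "**" crosses any depth.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000")           // placeholder so ** survives the * pass
    .replace(/\*/g, "[^/]*")              // * never crosses a "/" boundary
    .replace(/\u0000/g, ".*");            // ** matches any depth
  return new RegExp(`^${escaped}$`);
}
```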
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Reference doc configs select shapes via shapeSelectors | shapeSelectors provides three selection modes: by source path + specific names, by group tag, or by source path alone. | Multiple selection modes let reference docs curate precisely which shapes appear, preventing either over-inclusion of internal types or under-inclusion of public API surfaces. |
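The three selection modes might look like this in a config. The field names and file paths here are illustrative assumptions — the rule only specifies the modes, not the exact schema:

```typescript
// Hypothetical shapeSelectors entries, one per selection mode.
const shapeSelectors = [
  // Mode 1: source path + specific names
  { source: "src/generator/types.ts", names: ["MasterDataset", "DiagramScope"] },
  // Mode 2: group tag
  { group: "renderer-api" },
  // Mode 3: source path alone (all shapes in the file)
  { source: "src/renderer/blocks.ts" },
];
```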
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Extraction methods dispatch to correct handlers | Each extraction method type (self-reference, TypeScript, Gherkin) must dispatch to the correct specialized handler based on the source file type or marker. | Wrong dispatch would apply TypeScript extraction logic to Gherkin files (or vice versa), producing garbled or empty results. |
| Self-references extract from current decision document | THIS DECISION self-references must extract content from the current decision document using rule descriptions, DocStrings, or full document access. | Self-references avoid circular file reads — the document content is already in memory, so extraction is a lookup operation rather than a file I/O operation. |
| Multiple sources are aggregated in mapping order | When multiple source mappings target the same section, their extracted content must be aggregated in the order defined by the mapping table. | Mapping order is intentional — authors structure their source tables to produce a logical reading flow, and reordering would break the narrative. |
| Missing files produce warnings without failing | When a referenced source file does not exist, the mapper must produce a warning and continue processing remaining mappings rather than failing entirely. | Partial extraction is more useful than total failure — a decision document with most sections populated and one warning is better than no document at all. |
| Empty extraction results produce info warnings | When extraction succeeds but produces empty results (no matching shapes, no matching rules), an informational warning must be generated. | Empty results often indicate stale source mappings pointing to renamed or removed content — warnings surface these issues before they reach generated output. |
| Extraction methods are normalized for dispatch | Extraction method strings must be normalized to canonical forms before dispatch, with unrecognized methods producing a warning. | Users write extraction methods in natural language — normalization bridges the gap between human-readable table entries and programmatic dispatch keys. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Source files must exist and be readable | All source file paths in mappings must resolve to existing, readable files. | Prevents extraction failures and provides clear error messages upfront. |
| Extraction methods must be valid and supported | Extraction methods must match a known method from the supported set. | Invalid methods cannot extract content; suggest valid alternatives. |
| Extraction methods must be compatible with file types | Method-file combinations must be compatible (e.g., TypeScript methods for .ts files). | Incompatible combinations fail at extraction; catch early with clear guidance. |
| Source mapping tables must have required columns | Tables must contain Section, Source File, and Extraction Method columns. | Missing columns prevent extraction; alternative column names are mapped. |
| All validation errors are collected and returned together | Validation collects all errors before returning, not just the first. | Enables users to fix all issues in a single iteration. |
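The collect-all-errors rule can be sketched as below. The `Mapping` shape, method names, and the injected `fileExists` check are assumptions for illustration; the point is that validation accumulates every problem instead of returning on the first:

```typescript
// Error-collecting pre-flight validation (shapes hypothetical):
// gather every problem before returning, never fail on the first.
interface Mapping {
  section: string;
  sourceFile: string;
  method: string;
}

// Illustrative method set; the real supported set may differ.
const KNOWN_METHODS = new Set(["typescript-shapes", "gherkin-rules", "self-reference"]);

function validateMappings(
  mappings: Mapping[],
  fileExists: (path: string) => boolean,
): string[] {
  const errors: string[] = [];
  for (const m of mappings) {
    if (!fileExists(m.sourceFile)) {
      errors.push(`${m.section}: source file not found: ${m.sourceFile}`);
    }
    if (!KNOWN_METHODS.has(m.method)) {
      errors.push(`${m.section}: unknown extraction method "${m.method}"`);
    }
  }
  return errors; // empty array means validation passed
}
```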
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Tables in rule descriptions render exactly once | Each markdown table in a rule description appears exactly once in the rendered output, with no residual pipe characters in surrounding text. | Without deduplication, tables extracted for formatting would also remain in the raw description text, producing duplicate output. |
| Multiple tables in description each render exactly once | When a rule description contains multiple markdown tables, each table renders as a separate formatted table block with no merging or duplication. | Merging or dropping tables would lose distinct data structures that the author intentionally separated, corrupting the rendered documentation. |
| stripMarkdownTables removes table syntax from text | stripMarkdownTables removes all pipe-delimited table syntax from input text while preserving all surrounding content unchanged. | If table syntax is not stripped from the raw text, the same table data appears twice in the rendered output — once from the extracted table block and once as raw pipe characters in the description. |
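A plausible `stripMarkdownTables` is a line filter: drop lines that look like pipe-delimited table rows, keep everything else. The real implementation may be stricter about what counts as a table row; this is a sketch of the invariant:

```typescript
// Illustrative stripMarkdownTables: remove lines that start and end
// with a pipe (table rows and separators), leaving prose untouched.
function stripMarkdownTables(text: string): string {
  return text
    .split("\n")
    .filter((line) => !/^\s*\|.*\|\s*$/.test(line))
    .join("\n");
}
```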
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Document metadata is correctly set | The taxonomy document must have the title “Taxonomy Reference”, a descriptive purpose string, and a detail level reflecting the generateDetailFiles option. | Document metadata drives the table of contents and navigation in generated doc sites — incorrect metadata produces broken links and misleading titles. |
| Categories section is generated from TagRegistry | The categories section must render all categories from the configured TagRegistry as a table, with optional linkOut to detail files when progressive disclosure is enabled. | Categories are the primary navigation structure in the taxonomy — missing categories leave developers unable to find the correct annotation tags. |
| Metadata tags can be grouped by domain | When groupByDomain is enabled, metadata tags must be organized into domain-specific subsections; when disabled, a single flat table must be rendered. | Domain grouping improves scannability for large tag sets (21 categories in ddd-es-cqrs) while flat mode is simpler for small presets (3 categories in generic). |
| Tags are classified into domains by hardcoded mapping | Tags must be classified into domains (Core, Relationship, Timeline, etc.) using a hardcoded mapping, with unrecognized tags placed in an “Other Tags” group. | Domain classification is stable across releases — hardcoding prevents miscategorization from user config errors while the “Other” fallback handles future tag additions gracefully. |
| Optional sections can be disabled via codec options | Format Types, Presets, and Architecture sections must each be independently disableable via their respective codec option flags. | Not all projects need all sections — disabling irrelevant sections reduces generated document size and prevents confusion from inapplicable content. |
| Detail files are generated for progressive disclosure | When generateDetailFiles is enabled, the codec must produce additional detail files (one per domain group) alongside the main taxonomy document; when disabled, no additional files are created. | Progressive disclosure keeps the main document scannable while providing deep-dive content in linked pages — monolithic documents become unwieldy for large tag sets. |
| Format types are documented with descriptions and examples | All 6 format types must be documented with descriptions and usage examples in the generated taxonomy. | Format types control how tag values are parsed — undocumented formats force developers to guess the correct syntax, leading to annotation errors. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Business rules appear as a separate section | Every Rule block must produce a distinct Business Rule entry containing its description and associated scenarios. | Without guaranteed capture, rule descriptions and rich content (DocStrings, DataTables) would be silently dropped from generated documentation. Rule descriptions provide context for why this business rule exists. |
| Multiple rules create multiple Business Rule entries | Each Rule keyword in a feature file must produce its own independent Business Rule entry in generated output. | Merging rules into a single entry would collapse distinct business domains, making it impossible to trace scenarios back to their governing constraint. Each Rule keyword creates a separate entry in the Business Rules section, which helps organize complex features into logical business domains. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| RoadmapDocumentCodec groups patterns by phase with progress tracking | The roadmap must include overall progress with percentage, phase navigation table, and phase sections with pattern tables. | The roadmap is the primary planning artifact — progress tracking at both project and phase level enables informed prioritization. |
| CompletedMilestonesCodec shows only completed patterns grouped by quarter | Only completed patterns appear, grouped by quarter with navigation, recent completions, and collapsible phase details. | Milestone tracking provides a historical record of delivery — grouping by quarter aligns with typical reporting cadence. |
| CurrentWorkCodec shows only active patterns with deliverables | Only active patterns appear with progress bars, deliverable tracking, and an all-active-patterns summary table. | Current work focus eliminates noise from completed and planned items — teams need to see only what’s in flight. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Parses Verified by annotations to extract scenario references | Scenario names in **Verified by:** are matched against actual scenarios in feature files. Unmatched references are reported as warnings. | Verified by annotations create explicit traceability. Validating references ensures the traceability matrix reflects actual test coverage. |
| Generates Rule-to-Scenario traceability matrix | Every Rule appears in the matrix with its verification status. Scenarios are linked by name and file location. | A matrix format enables quick scanning of coverage status and supports audit requirements for bidirectional traceability. |
| Detects and reports coverage gaps | Orphan scenarios (not referenced by any Rule) and unverified rules are listed in dedicated sections. | Coverage gaps indicate either missing traceability annotations or actual missing test coverage. Surfacing them enables remediation. |
| Supports filtering by phase and domain | CLI flags allow filtering the matrix by phase number or domain category to generate focused traceability reports. | Large codebases have many rules. Filtering enables relevant subset extraction for specific audits or reviews. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Empty dataset produces valid zero-state views | An empty input produces a MasterDataset with all counts at zero and no groupings. | Generators must handle the zero-state gracefully; a missing or malformed empty dataset would cause null-reference errors across all rendering codecs. |
| Status and phase grouping creates navigable views | Patterns are grouped by canonical status and sorted by phase number, with per-phase status counts computed. | Generators need O(1) access to status-filtered and phase-ordered views without recomputing on each render pass. |
| Quarter and category grouping organizes by timeline and domain | Patterns are grouped by quarter and category, with only patterns bearing the relevant metadata included in each view. | Timeline and domain views must exclude patterns without the relevant metadata to prevent misleading counts and empty groupings in generated documentation. |
| Source grouping separates TypeScript and Gherkin origins | Patterns are partitioned by source file type, and patterns with phase metadata appear in the roadmap view. | Codecs that render TypeScript-specific or Gherkin-specific views depend on pre-partitioned sources; mixing sources would produce incorrect per-origin statistics and broken cross-references. |
| Relationship index builds bidirectional dependency graph | The relationship index contains forward and reverse lookups, with reverse lookups merged and deduplicated against explicit annotations. | Bidirectional navigation is required for dependency tree queries without O(n) scans per lookup. |
| Completion tracking computes project progress | Completion percentage is rounded to the nearest integer, and fully-completed requires all patterns in completed status with a non-zero total. | Inconsistent rounding or a false-positive fully-completed signal on an empty dataset would misrepresent project health in dashboards and generated progress reports. |
| Workflow integration conditionally includes delivery process data | The workflow is included in the MasterDataset only when provided, and phase names are resolved from the workflow configuration. | Projects without a delivery workflow must still produce valid datasets; unconditionally requiring workflow data would break standalone documentation generation. |
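The completion-tracking invariant (round to nearest integer, never report fully-completed on an empty dataset) can be expressed in a few lines. Names and shapes are illustrative:

```typescript
// Sketch of completion tracking: percentage rounded to the nearest
// integer, and fullyCompleted requires a non-zero total with every
// pattern in "completed" status.
function completionStats(statuses: string[]): { percent: number; fullyCompleted: boolean } {
  const total = statuses.length;
  const done = statuses.filter((s) => s === "completed").length;
  const percent = total === 0 ? 0 : Math.round((done / total) * 100);
  return { percent, fullyCompleted: total > 0 && done === total };
}
```

The `total > 0` guard is the interesting part: without it, an empty dataset would vacuously report itself fully completed.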
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Context - PoC limitations prevent monorepo-scale operation | The document generator must produce correct, deduplicated output and surface all errors explicitly before operating at monorepo scale. | Silent failures and duplicated content in the PoC corrupt generated docs across all 210 target files, making bugs invisible until downstream consumers encounter broken documentation. |
| Decision - Robustness requires four coordinated improvements | Robustness improvements must be implemented as four distinct, coordinated modules (deduplication, validation, warning collection, file validation) rather than ad-hoc fixes. | Scattering reliability fixes across existing code creates coupling and makes individual concerns untestable; isolated modules enable independent verification and replacement. |
| Duplicate content must be detected and merged | No two sections in a generated document may have identical content fingerprints; duplicates must be merged into a single section with source attribution. | Duplicate sections confuse readers and inflate document size, undermining trust in generated documentation as a reliable replacement for manually maintained docs. Content fingerprinting identifies duplicate sections extracted from multiple sources. When duplicates are found, the system merges them intelligently based on source priority. |
| Invalid source mappings must fail fast with clear errors | Every source mapping must pass pre-flight validation (file existence, method validity, readability) before any extraction is attempted. | Without pre-flight validation, invalid mappings produce silent failures or cryptic runtime errors, making it impossible to diagnose configuration problems at monorepo scale. Pre-flight validation catches configuration errors before extraction begins. This prevents silent failures and provides actionable error messages. |
| Warnings must be collected and reported consistently | All non-fatal issues during extraction must be captured in a structured warning collector grouped by source, never emitted via console.warn. | Scattered console.warn calls are lost in CI output and lack source context, making it impossible to trace warnings back to the configuration entry that caused them. The warning collector replaces scattered console.warn calls with a structured system that aggregates warnings and reports them consistently. |
| Consequence - Improved reliability at cost of stricter validation | Existing source mappings that previously succeeded silently may now fail validation and must be updated to conform to the stricter checks. | Allowing invalid mappings to bypass validation would preserve the silent-failure behavior the robustness work was designed to eliminate. |
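Fingerprint-based deduplication with source attribution can be sketched as follows. The whitespace-normalizing fingerprint and the `Section` shape are assumptions; the real implementation might hash content or weigh source priority differently:

```typescript
// Sketch of fingerprint deduplication: normalize content to a key,
// keep one section per key, and append source attribution on merge.
interface Section {
  heading: string;
  content: string;
  source: string;
}

function fingerprint(content: string): string {
  return content.toLowerCase().replace(/\s+/g, " ").trim();
}

function dedupeSections(sections: Section[]): Section[] {
  const byPrint = new Map<string, Section>();
  for (const s of sections) {
    const key = fingerprint(s.content);
    const existing = byPrint.get(key);
    if (existing) {
      // Merge duplicates into one section with combined attribution.
      existing.source = `${existing.source}, ${s.source}`;
    } else {
      byPrint.set(key, { ...s });
    }
  }
  return [...byPrint.values()];
}
```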
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Document metadata is correctly set | The validation rules document must have the title “Validation Rules”, a purpose describing Process Guard, and a detail level reflecting the generateDetailFiles option. | Accurate metadata ensures the validation rules document is correctly indexed in the generated documentation site. |
| All validation rules are documented in a table | All 6 Process Guard validation rules must appear in the rules table with their correct severity levels (error or warning). | The rules table is the primary reference for understanding what Process Guard enforces — missing rules would leave developers surprised by undocumented validation failures. |
| FSM state diagram is generated from transitions | When includeFSMDiagram is enabled, a Mermaid state diagram showing all 4 FSM states and their transitions must be generated; when disabled, the diagram section must be omitted. | The state diagram is the most intuitive representation of allowed transitions — it answers “where can I go from here?” faster than a text table. |
| Protection level matrix shows status protections | When includeProtectionMatrix is enabled, a matrix showing all 4 statuses with their protection levels must be generated; when disabled, the section must be omitted. | The protection matrix explains why certain edits are blocked — without it, developers encounter cryptic “scope-creep” or “completed-protection” errors without understanding the underlying model. |
| CLI usage is documented with options and exit codes | When includeCLIUsage is enabled, the document must include CLI example code, all 6 options, and exit code documentation; when disabled, the section must be omitted. | CLI documentation in the validation rules doc provides a single reference for both the rules and how to run them — separate docs would fragment the developer experience. |
| Escape hatches are documented for special cases | When includeEscapeHatches is enabled, all 3 escape hatch mechanisms must be documented; when disabled, the section must be omitted. | Escape hatches prevent the validation system from becoming a blocker — developers need to know how to safely bypass rules for legitimate exceptions. |
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Warnings are captured with source context | Each captured warning must include the source file path, optional line number, and category for precise identification. | Context-free warnings are impossible to act on — developers need to know which file and line produced the warning to fix the underlying issue. |
| Warnings are categorized for filtering and grouping | Warnings must support multiple categories and be filterable by both category and source file. | Large codebases produce many warnings — filtering by category or file lets developers focus on one concern at a time instead of triaging an overwhelming flat list. |
| Warnings are aggregated across the pipeline | Warnings from multiple pipeline stages must be collected into a single aggregated view, groupable by source file and summarizable by category counts. | Pipeline stages run independently — without aggregation, warnings would be scattered across stage outputs, making it impossible to see the full picture. |
| Warnings integrate with the Result pattern | Warnings must propagate through the Result monad, being preserved in both successful and failed results and across pipeline stages. | The Result pattern is the standard error-handling mechanism — warnings that don’t propagate through Results would be silently lost when functions compose. |
| Warnings can be formatted for different outputs | Warnings must be formattable as colored console output, machine-readable JSON, and markdown for documentation, each with appropriate structure. | Different consumers need different formats — CI pipelines parse JSON, developers read console output, and generated docs embed markdown. |
| Existing console.warn calls are migrated to collector | Pipeline components (source mapper, shape extractor) must use the warning collector instead of direct console.warn calls. | Direct console.warn calls bypass aggregation and filtering — migrating to the collector ensures all warnings are captured, categorized, and available for programmatic consumption. |
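A minimal collector satisfying the capture, filter, group, and summarize invariants might look like this. The API surface (`add`, `byCategory`, `bySource`, `summary`) is an assumption, not the project's actual interface:

```typescript
// Hypothetical warning collector: each warning carries source context
// and a category; queries filter by category, group by source, and
// summarize category counts.
interface Warning {
  source: string;
  line?: number;
  category: string;
  message: string;
}

class WarningCollector {
  private warnings: Warning[] = [];

  add(w: Warning): void {
    this.warnings.push(w);
  }

  byCategory(category: string): Warning[] {
    return this.warnings.filter((w) => w.category === category);
  }

  bySource(): Map<string, Warning[]> {
    const groups = new Map<string, Warning[]>();
    for (const w of this.warnings) {
      const list = groups.get(w.source) ?? [];
      list.push(w);
      groups.set(w.source, list);
    }
    return groups;
  }

  summary(): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const w of this.warnings) {
      counts[w.category] = (counts[w.category] ?? 0) + 1;
    }
    return counts;
  }
}
```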
| Rule | Invariant | Rationale |
| --- | --- | --- |
| Input codec parses and validates JSON in a single step | Every JSON string parsed through the input codec is both syntactically valid JSON and schema-conformant before returning a typed value. | Separating parse from validate allows invalid data to leak past the boundary — a single-step codec ensures callers never hold an unvalidated value. |
| Output codec validates before serialization | Every object serialized through the output codec is schema-validated before JSON.stringify, preventing invalid data from reaching consumers. | Serializing without validation can produce JSON that downstream consumers cannot parse, causing failures far from the source of the invalid data. |
| LintOutputSchema validates CLI lint output structure | Lint output JSON always conforms to the LintOutputSchema, ensuring consistent structure for downstream tooling. | Non-conformant lint output breaks CI pipeline parsers and IDE integrations that depend on a stable JSON contract. |
| ValidationSummaryOutputSchema validates cross-source analysis output | Validation summary JSON always conforms to the ValidationSummaryOutputSchema, ensuring consistent reporting of cross-source pattern analysis. | Inconsistent validation summaries cause miscounted pattern coverage, leading to false confidence or missed gaps in cross-source analysis. |
| RegistryMetadataOutputSchema accepts arbitrary nested structures | Registry metadata codec accepts any valid JSON-serializable object without schema constraints on nested structure. | Registry consumers attach domain-specific metadata whose shape varies per preset — constraining the nested structure would break extensibility across presets. |
| formatCodecError produces human-readable error output | Formatted codec errors always include the operation context and all validation error details for debugging. | Omitting the operation context or individual field errors forces developers to reproduce failures manually instead of diagnosing from the error message alone. |
| safeParse returns typed values or undefined without throwing | safeParse never throws exceptions; it returns the typed value on success or undefined on any failure. | Throwing on invalid input forces every call site to wrap in try/catch — returning undefined lets callers use simple conditional checks and avoids unhandled exception crashes. |
| createFileLoader handles filesystem operations with typed errors | File loader converts all filesystem errors (ENOENT, EACCES, generic) into structured CodecError values with appropriate messages and source paths. | Propagating raw filesystem exceptions leaks Node.js error internals to consumers and prevents consistent error formatting across parse, validate, and I/O failures. |
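The single-step parse-and-validate boundary and the non-throwing safeParse can be sketched together. The `Validator` type-guard shape is an assumption — the real code presumably validates against the schemas named above rather than hand-written guards:

```typescript
// Sketch of safeParse: parse and validate in one step, never throw.
// Syntax errors and schema failures both surface as undefined.
type Validator<T> = (value: unknown) => value is T;

function safeParse<T>(json: string, isValid: Validator<T>): T | undefined {
  try {
    const parsed: unknown = JSON.parse(json);
    return isValid(parsed) ? parsed : undefined;
  } catch {
    return undefined; // JSON.parse threw: treat like any other failure
  }
}
```

Callers then use a plain conditional instead of try/catch, and can never hold a value that passed parsing but failed validation.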