Living Specs for AI Agents

April 15, 2026

Every morning when I logged back in, I had to re-educate the AI agent. What are we building, where are we in it, what did we decide yesterday, what’s the convention for this thing and the constraint on that thing and the reason we went with option B instead of A. Ten minutes sometimes, twenty other times. Every single day.

Multiply that across a week and across a team and it adds up fast. AI agents don’t remember. Every session is day one and that’s not changing any time soon, so the question is what you do about it.

The answer I landed on is simple. Write it down once, in a place the agent reads on startup, and stop re-typing it. For Claude Code that file is CLAUDE.md. For Kiro it’s AGENTS.md or something similar. The name doesn’t matter as long as the agent reads it before it does anything else.

This isn’t a new idea. .cursorrules, .aider.conf, and similar files have been doing something like this for years. What changed is that modern agents are better at actually acting on them, and the cost of keeping them updated is low enough to be worth it.

So that’s where I put the things I was repeating every morning. The stack, the branches, the conventions, rules like “never push from an agent” and “only commit when I ask.” The stuff a new teammate would want on day one, because that’s exactly what the agent is, a new teammate every single morning.

A trimmed version of mine:

# project-name

Web app built with Astro, deployed on Vercel. This file is the
single source of truth for any agent working in this repo. Read
it completely before doing anything else.

## Rules

- DO NOT push code. Ever. Only the human pushes.
- DO NOT commit unless explicitly told to.
- DO NOT merge to `main` without explicit approval. `main` is
  production. Merging is deploying.
- DO NOT start work without a spec. No spec, no code.
- DO NOT invent outside scope. If it's not in a spec and wasn't
  explicitly requested, don't do it.
- DO NOT deviate from specs. If code contradicts a spec, stop
  and flag it.
- DO NOT generate filler. No AI slop, no hedging, no padding.
  If it reads like AI wrote it, rewrite it or flag it.
- DO NOT guess when you can ask.
- Update this file when you learn something durable about the
  project. The next session depends on it.

## Spec-driven workflow

All work is driven by specifications. No exceptions.

- `docs/system-specs/` describes the current state of the project.
  These are living documents. If the system changes, the spec
  changes. If they disagree, one of them is wrong and you fix it.
- `docs/task-specs/` holds one file per piece of work, prefixed
  with the date: `2026-04-15_some-task.md`. Once the task ships
  the spec is frozen. It becomes a historical record of what was
  built and why. Never rewrite a shipped task spec.
- Before starting any work, check for a task spec. If none exists,
  create one first.

## Content workflow

1. Human shares an idea. Agent categorizes it and routes it to
   the content bank or a task spec.
2. When ready, agent creates a dated task spec. Ask probing
   questions to get the real story before writing begins. Don't
   fill gaps, surface them.
3. Draft on a branch. Never pad. If a section is thin, ask for
   more rather than generating generic content.
4. Review via PR. Every sentence earns its place.
5. Merge only with explicit approval.

## Branches

- `main` is production. What's in main is live.
- `develop` is the integration branch for ongoing work.

I put one rule at the top of the file. It’s a living document. When the agent learns something durable about the project, it updates the file itself. The next morning that context is already there and I don’t re-type it.

It only works if you keep the file honest. Skip an update after something changes and the next session works from a stale picture, which is worse than starting from nothing.

Two kinds of specs

Once the entry point was sorted I had to deal with the rest of the docs. I used to have a flat docs/ folder with a pile of specs. Some described the current state of the system and others described a specific piece of work I’d done six months ago. They lived next to each other and drifted apart. That confused me and it definitely confused the agent.

So I split them. docs/system-specs/ is living documentation that describes what the system is right now. If the system changes, the spec changes. If the spec and the code disagree, one of them is wrong and you fix it. docs/task-specs/ is frozen, one file per piece of work, prefixed with the date: 2026-04-15_some-task.md. Once the task ships the spec becomes a historical record. You don’t rewrite it, it’s the timeline of what got built and why.

The agent knows which is which because they live in different folders. When I start a new task the agent creates a task spec with today’s date, we iterate on it, and once I approve it implements against the spec. When it’s done the spec stays and a month from now if I want to know why I did something the dated file is right there.

Beyond code

None of this is specific to code, which I didn’t expect when I started. This blog you’re reading right now is set up the same way. There’s a CLAUDE.md that tells the agent how I write and what my content pillars are, a system-specs folder with the content strategy and post conventions, and a task-specs folder with a dated spec for every post including this one. When I sit down to write, the agent already knows my voice and structure, and it asks me for the story before filling in blanks. I don’t have to explain any of that.

Any domain where you work with an AI agent over time can use this. Research, consulting, writing a book, running a small business. Anywhere you’d otherwise be explaining the same context over and over.

I set this up for myself to stop repeating context every morning. But the real payoff showed up when someone else joined the work. They didn’t have to ask me how anything was organized. They opened the repo, their agent read the file, and they were in the work. That’s what happens when you write things down once. The file doesn’t care who opens it, whether that’s a human, an agent, or future-you who forgot.

This is persistent memory, file-based. The CLAUDE.md and the system specs change with the project. The task specs just pile up. Some of it stays current and some of it becomes historical context. It’s memory that lives in the repo instead of inside the model.

Some tools are starting to build this in natively. Claude has a memory feature in the app and in Claude Code. It remembers things about you across conversations. But from my experience it’s still thin. It captures high-level facts about you and misses the depth of how you work on a specific project day to day.

The spec structure fills that gap for project-level memory. But there’s still a missing layer above it. The stuff about you that should follow you everywhere regardless of which repo you’re in. How you like to work, how you communicate, your preferences and patterns that aren’t tied to one codebase. That’s personal memory and it’s the piece that’s still not solved well. I’ll write about that next.

For now, if you’re still re-educating your agent every morning, write it down once and let the repo brief the next session for you.