Living Specs for AI Agents

Every morning when I logged back in, I had to re-educate the AI agent. What are we building, where are we in it, what did we decide yesterday, what’s the convention for this thing and the constraint on that thing and the reason we went with option B instead of A. Ten minutes sometimes, twenty other times. Every single day.

Multiply that across a week and across a team and it adds up fast. AI agents don’t remember. Every session is day one and that’s not changing any time soon, so the question is what you do about it.

The answer I landed on is simple. Write it down once, in a place the agent reads on startup, and stop re-typing it. For Claude Code that file is CLAUDE.md. For Kiro it’s AGENTS.md or something similar. The name doesn’t matter, what matters is that the agent reads it before it does anything else.

This isn’t a new idea — .cursorrules, .aider.conf, and similar files have been doing something like this for years. What changed is that modern agents are better at actually acting on them, and the cost of keeping them updated is low enough to be worth it.

So that’s where I put the things I was repeating every morning. The stack, the branches, the conventions, rules like “never push from an agent” and “only commit when I ask.” The stuff a new teammate would want on day one, because that’s exactly what the agent is, a new teammate every single morning.

Here’s a trimmed example of what one of these files looks like:

# project-name

Web app built with Astro, deployed on Vercel. This file is the
single source of truth for any agent working in this repo. Read
it completely before doing anything else.

## Rules

- DO NOT push code. Ever. Only the human pushes.
- DO NOT commit unless explicitly told to.
- DO NOT merge to `main` without explicit approval. `main` is
  production. Merging is deploying.
- DO NOT start work without a spec. No spec, no code.
- DO NOT invent outside scope. If it's not in a spec and wasn't
  explicitly requested, don't do it.
- DO NOT deviate from specs. If code contradicts a spec, stop
  and flag it.
- DO NOT generate filler. No AI slop, no hedging, no padding.
  If it reads like AI wrote it, rewrite it or flag it.
- DO NOT guess when you can ask.
- Update this file when you learn something durable about the
  project. The next session depends on it.

## Spec-driven workflow

All work is driven by specifications. No exceptions.

- `docs/system-specs/` describes the current state of the project.
  These are living documents. If the system changes, the spec
  changes. If they disagree, one of them is wrong and you fix it.
- `docs/task-specs/` holds one file per piece of work, prefixed
  with the date: `2026-04-15_some-task.md`. Once the task ships
  the spec is frozen. It becomes a historical record of what was
  built and why. Never rewrite a shipped task spec.
- Before starting any work, check for a task spec. If none exists,
  create one first.

## Content workflow

1. Human shares an idea. Agent categorizes it and routes it to
   the content bank or a task spec.
2. When ready, agent creates a dated task spec. Ask probing
   questions to get the real story before writing begins. Don't
   fill gaps, surface them.
3. Draft on a branch. Never pad. If a section is thin, ask for
   more rather than generating generic content.
4. Review via PR. Every sentence earns its place.
5. Merge only with explicit approval.

## Branches

- `main` is production. What's in main is live.
- `develop` is the integration branch for ongoing work.

I put one rule at the top of the file: this is a living document. When the agent learns something durable about the project, it updates the file. Next session inherits the knowledge and the tax goes to zero.

The one discipline this requires: keep it honest. When you skip an update after the project changes, the next session pays for it with confusion. The file is only as useful as the last time someone kept it current.

Two kinds of specs

Once the entry point was sorted I had to deal with the rest of the docs. I used to have a flat docs/ folder with a pile of specs. Some described the current state of the system and others described a specific piece of work I’d done six months ago. They lived next to each other and drifted apart. That confused me and it definitely confused the agent.

So I split them. docs/system-specs/ is living documentation that describes what the system is right now. If the system changes, the spec changes. If the spec and the code disagree, one of them is wrong and you fix it. docs/task-specs/ is frozen, one file per piece of work, prefixed with the date: 2026-04-15_some-task.md. Once the task ships the spec becomes a historical record. You don’t rewrite it, it’s the timeline of what got built and why.

The agent knows which is which because they live in different folders. When I start a new task the agent creates a task spec with today’s date, we iterate on it, and once I approve it implements against the spec. When it’s done the spec stays and a month from now if I want to know why I did something the dated file is right there.

It’s not just for code

Here’s what I think is the interesting part. None of this is specific to code. This blog you’re reading right now is set up the same way. There’s a CLAUDE.md that tells the agent how I write and what my content pillars are, a system-specs folder with the content strategy and post conventions, and a task-specs folder with a dated spec for every post including this one. When I sit down to write the agent already knows the voice, it knows the structure, and it knows to ask me for the story before it fills in blanks. I don’t have to explain any of that.

Any domain where you work with an AI agent over time can use this. Research, consulting, writing a book, running a small business. Anywhere you’d otherwise be explaining the same context over and over.

I set this up for myself to stop repeating context every morning. But the real payoff showed up when someone else joined the work. They didn’t have to ask me how anything was organized. They opened the repo, their agent read the file, and they were in the work — no orientation, no “how do you have this set up” questions. That’s what happens when you write things down once. The file doesn’t care who opens it, whether that’s a human, an agent, or future-you who forgot.

If you think about it, this is persistent memory. It’s just file-based. The CLAUDE.md mutates as the project evolves, the system specs update when the system changes, and the task specs accumulate over time. Some of it stays current and some of it becomes historical context. That’s how memory actually works, it’s just living in your repo instead of in the model.

Some tools are starting to build this in natively. Claude has a memory feature in the app and in Claude Code. It remembers things about you across conversations. But from my experience it’s still pretty thin, more like sticky notes than real documentation. It captures high-level facts but not the depth of how you actually work on a project day to day.

The spec structure fills that gap for project-level memory. But there’s still a missing layer above it. The stuff about you that should follow you everywhere regardless of which repo you’re in. How you like to work, how you communicate, your preferences and patterns that aren’t tied to one codebase. That’s personal memory and it’s the piece that’s still not solved well. I’ll write about that next.

For now, if you’re still re-educating your agent every morning, stop. Write it down once. Let the project do the teaching.