30 Days of AI Collaboration: Recovering 55,000 Turns of Professional Capital

In the spring of 2026 I spent 30 days living in the “jagged frontier” of AI-assisted software engineering.

If you’ve worked with Claude Code, Gemini CLI, Cursor, or Codex you know the feeling. A high-intensity session ends, the terminal clears, and months of architectural decisions, hard-won debugging insights, and technical pivots vanish into the ephemeral void of chat history. The next morning you remember the outcome but not the reasoning that got you there.

I decided to stop losing that knowledge. This is the data-driven retrospective of 466 sessions, 55,224 turns, and 1,772 technical decisions recovered from a single month of work, and what I found when I went back through the corpus systematically.

55,224 turns · 466 sessions · 30 days
April 2026 · Claude Code + Gemini CLI + Cursor + Codex · high-fidelity cohort

A note on timing

I built the pipeline I’m about to describe in early April. On May 6th (the day before this post) Anthropic announced a “dreaming” feature for Managed Agents: cloud-managed agents that schedule reflection time between sessions to review past interactions, spot patterns, correct errors, and refine persistent memory.

That is, structurally, the same idea I’d been building. Mine runs locally, ingests every provider I touch (not just Claude), and stores the extracted memory as files I own. Theirs is shipped as a feature inside one vendor’s stack.

Both versions of the idea are interesting. The post below is about the version where the data is yours.

The infrastructure: the Librarian

To analyze a month of conversation I built a pipeline called the Librarian. Its job is to ingest raw session logs from multiple AI providers, normalize them into a universal Markdown schema (the 5Ws: Who, What, When, Where, Why), and extract structured metadata:

  • Decisions made: the architectural pivots that actually made it into code.
  • Corrections issued: where I had to steer the AI back on track.
  • Learning velocity: how quickly a topic moves from “open question” to “matured.”
  • Tech mentions: what the conversation actually talks about, regardless of what file is open.

The pipeline is unapologetically file-based. Each session gets a Markdown record with YAML frontmatter; each extracted artifact is a tagged block inside it. Everything is grep-able, version-controlled, and survives any vendor lockout.
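
To make that concrete, here is a minimal sketch of the normalize step, assuming the raw provider log is JSONL with role/content turns. Every field name, path, and tag below is illustrative; the real schema is whatever your own pipeline settles on.

```python
# Minimal sketch of the Librarian's normalize step. Assumes the raw
# provider log is JSONL with {"role": ..., "content": ...} turns; all
# field names and paths here are illustrative, not the real schema.
import json
from datetime import date
from pathlib import Path

def normalize_session(log_path: Path, vault: Path, project: str) -> Path:
    turns = [json.loads(line) for line in log_path.read_text().splitlines() if line.strip()]
    frontmatter = "\n".join([
        "---",
        f"who: user + {log_path.stem}",        # provider/session id from the filename
        f"what: {project}",
        f"when: {date.today().isoformat()}",
        f"where: {project}",
        "why: TBD  # filled in by the extraction pass",
        f"turns: {len(turns)}",
        "decisions_made: []",                  # populated by the extraction pass
        "corrections_issued: []",
        "---",
    ])
    body = "\n\n".join(f"**{t['role']}**: {t['content']}" for t in turns)
    record = vault / f"{date.today().isoformat()}-{log_path.stem}.md"
    record.write_text(frontmatter + "\n\n" + body + "\n")
    return record
```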

The scale of collaboration

Between April 1 and May 1, 2026, the volume was higher than I expected:

Decisions per project · April 2026

  • Silo: 591
  • Command Center: 377
  • PKM Vault: 349
  • Lens: 168

1,485 decisions across the four primary projects (84% of the 1,772 total). Smaller projects (thedetech, Tractor finance, homeschool, side experiments) make up the long tail.

This isn’t “chatting.” It’s a high-bandwidth integration where the model is functioning as a junior engineer, a peer reviewer, and an architectural critic, sometimes in the same session. The Silo project alone (a Rails web app plus a SwiftUI iOS companion plus a marketing surface) generated 591 distinct technical decisions in 30 days. Many of those decisions never made it into a commit message or a PR description. They lived in chat. Now they live in the Vault.

Finding 1: the decision graph

Most engineers’ documentation is a graveyard of outdated READMEs. Extracting decisions_made directly from chat transcripts makes the documentation live.

Examples from the corpus:

  • “Rename ‘Structure with AI’ to ‘Summarize with AI’ for UX clarity.”
  • “Switch the iOS provider client from polling to webhook-first.”
  • “Use clean paragraph breaks instead of horizontal rules in AI output.”
  • “Adopt MLX-Swift over Ollama for the macOS embedded inference path.”

These aren’t just logs. They’re a strategic pivot map. If I need to remember why I switched to MLX-Swift on April 22nd (what the constraint was, what the alternative was, why the trade landed where it did), the answer isn’t in a PR description. It’s in the recovered turn-by-turn reasoning of the session that produced the change.
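
That lookup is just a structured grep. Here is a sketch of it, assuming decisions are stored as tagged blocks inside each record (the `> [!decision]` syntax is my assumption for illustration, not the Librarian’s actual tag):

```python
# Find every tagged decision block in the Vault that mentions a term.
# The "> [!decision]" tag syntax is an assumption for illustration.
from pathlib import Path

def find_decisions(vault: Path, term: str):
    for record in sorted(vault.glob("*.md")):
        for line in record.read_text().splitlines():
            if line.startswith("> [!decision]") and term.lower() in line.lower():
                yield record.name, line.removeprefix("> [!decision]").strip()

for name, decision in find_decisions(Path("vault"), "MLX-Swift"):
    print(f"{name}: {decision}")
```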

Finding 2: learning velocity

How fast can you learn a new domain when you have a high-fidelity AI partner? I tracked the maturity arc of 4,400+ technical topics across the corpus, classifying each by whether it ever reached a recorded decision.

Topic maturity distribution · 4,447 tracked topics

  • Matured: 2,359 (53.0% reached a decision)
  • In progress: 2,055 (46.2% still in flight)
  • Discovery/investigation: 0.8%

A topic “matures” when a session produces a recorded decision against it. “In progress” topics have at least one mention but no decision yet. They’re queued, not dead.
  • High velocity: topics like “local-first AI strategy”, “PR triage”, “alternative-to landing pages”, and “competitive graveyard” moved from first mention to recorded decision in a single session.
  • The almost-even split: the matured-vs-in-progress balance is the part that surprised me. Roughly half the things I touched in April closed the same day they opened. The other half are still open, which means there’s a backlog of good questions that the Vault has captured for me to pick up next.
  • The compounding effect: as the Vault grew, the AI’s ability to “remember” prior architectural patterns increased, because I could feed back relevant prior sessions as context. We weren’t repeating ourselves; we were stacking.

The compounding is the deeper finding. The Vault isn’t just a record of what happened. It becomes input to the next session, and over weeks, that input compresses what would otherwise be re-derivation.
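
The feedback loop itself is simple enough to sketch: before a new session, score prior records against the task and prepend the best matches as context. Naive keyword overlap stands in here for whatever relevance scoring you actually use.

```python
# Sketch of the compounding loop: pull the k most relevant prior records
# from the Vault and feed them back as context for the next session.
from pathlib import Path

def prior_context(vault: Path, task: str, k: int = 3) -> str:
    words = set(task.lower().split())
    scored = []
    for record in vault.glob("*.md"):
        text = record.read_text()
        scored.append((sum(w in text.lower() for w in words), record))
    best = [r for score, r in sorted(scored, reverse=True)[:k] if score > 0]
    return "\n\n---\n\n".join(r.read_text() for r in best)
```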

What April was actually about

The Vault answers a question that was harder to answer before: what did I spend the month thinking about? Here are eight of the topics that recurred most across April’s sessions. None of these were planned in a doc anywhere. They emerged from the work, were extracted by the Librarian, and then became their own searchable wiki entries.

Selected topics from April · sessions touched

  • Intelligence Briefing Triage: 21
  • Agentic Virtual Office: 17
  • Launch Readiness: 10
  • LM Studio Integration: 9
  • Editorial Triage: 9
  • Command Center Dashboard Redesign: 7
  • Agentic Dispatch Workflow: 7
  • 5Ws Framework: 6

Curated from the 4,447 topic registries by editorial relevance, not the raw top-N: auto-tagging produces noise (project labels, session-lifecycle markers) that doesn’t tell a useful story. These eight, with the session counts the Vault recorded, do.

The shape of April reads cleanly off this list: ship readiness for two products, an agentic dispatch system to absorb the work, a portfolio dashboard to keep visibility, local-inference plumbing, and the 5Ws schema that powers the Vault itself. That’s the month, in eight rows.

Finding 3: the friction audit (the jagged edge)

AI is not a silver bullet. The friction audit flagged 45.9% of April sessions as friction-heavy: sessions where high turn counts, recorded errors, or explicit user pushback indicated the AI had stalled and needed manual intervention.

The two recurring failure modes:

  • The Loop pattern. In high-complexity repos like Silo iOS, the assistant occasionally fell into circular reasoning or dependency blindness, proposing the same failing fix three times in a row before I broke the loop manually.
  • Vision hallucinations. During vision bake-offs (comparing OCR and transcription models on real letters and screenshots), vision models often hallucinated binary data or formatting markers that simply weren’t in the input. Useful as a benchmark, dangerous if you trust the output blind.

Knowing where the AI fails is as valuable as knowing where it succeeds. Auditing the corrections_issued field surfaced a clear pattern: sessions that run past 100 turns correlate strongly with a collapse in architectural integrity. If you’re 100 turns deep in a single session and the answers are still wrong, the session is the problem. Start a new one.
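
As a heuristic, the friction flag is easy to state in code. The >100-turn rule comes from the audit above; the error and correction thresholds are illustrative stand-ins, not the pipeline’s actual cutoffs.

```python
from dataclasses import dataclass

@dataclass
class SessionStats:
    turns: int
    errors: int       # recorded tool/build failures
    corrections: int  # explicit user pushback (corrections_issued)

def friction_heavy(s: SessionStats) -> bool:
    # >100 turns is the rule from the audit; the other two thresholds
    # are assumed for illustration.
    return s.turns > 100 or s.errors >= 3 or s.corrections >= 5
```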

The full picture

The charts above pull from a larger dataset than any one section can hold. Below is the full April 2026 dashboard: every section we just walked through plus the ones that didn’t make the editorial cut (token economics where the data exists, the open-questions atlas, the conversion-rate funnel, the top-12 technology mentions, and the highest-friction sessions of the month). It’s a single self-contained HTML page; nothing leaves your browser.

From ephemeral to durable

The thing this experiment changed for me isn’t about the specific tools.

It’s about the shift from treating AI interactions as ephemeral chat to treating them as professional capital. When the conversations are durable, searchable, and structured, you stop building in a vacuum. You start building a Vault: a knowledge graph of your own technical evolution that grows more valuable every time you hit Enter.

Anthropic’s “dreaming” announcement is a vendor-side bet on the same idea: agents that consolidate what they learned between sessions, instead of starting cold every time. I think they’re right that this is the direction. I also think it’s worth building the version where the consolidation happens on hardware you own and the resulting memory is yours, not theirs.

The question isn’t whether you’re using AI. It’s whether you’re keeping the capital you’re creating with it.

In the next post I’ll dig into what was actually in those decisions: the architectural gravity of Ruby on Rails, AI as Git coach, and the open-question clusters that mark the boundary of what AI can answer.