How One Side Project Teaches You The 2026 Engineering Job

Pick the project. Direct the agents. Ship the weekend.

Alessandro Magionami and Manuel Salvatore Martone

May 22, 2026

On April 30 at 7:34am, we posted a Note.

Eight short lines on Substack. Less than a minute to type. Here's the whole thing:

"AI won't eliminate engineering jobs, but it's rewriting the job description. The new skill isn't writing code. It's knowing WHAT to build, WHY it matters, and HOW to direct agents that write it."

By that afternoon, Cursor's research team had published the long-form version. Same date, same thesis, different scale.

Theirs is titled Continually Improving Our Agent Harness. It runs eight thousand words. It defines a "harness" as the system around the model, including system prompts, tool descriptions, context management, and model-specific customizations. They wrote:

"Much of the work is vision-driven, where we start with an opinion about what the ideal agent experience should look like."

Two pieces, same day, same idea. Our Note: "direct agents." Cursor's post: "vision-driven."

Neither of us said the obvious next thing.

The side project is where you learn this skill, before your day job rewrites itself around you.

(No fluff. No theory. The harness scaffold you run tonight.)

Your weekend isn't for typing. Your weekend is for steering one agent through one rep, with the harness you keep refining.

This is the post we should have written instead of the Note.

What just got cheaper, and what just got expensive

Three shifts in the last 60 days. The job market shifted with them.

1. Typing got cheap

Code generation is no longer the bottleneck. A solo developer with an agent ships in one Saturday what a small team ships in a sprint. We see it in our own weekends — autom8n's last three integration points were one prompt each.

What used to be the rate-limiting step is now five minutes of an agent's compute.

2. Directing got expensive

Open any frontier lab's job board. You'll find titles like Forward Deployed Engineer and AI Agent Engineer. Keyboard time is shrinking. Eval-design time, prompt-iteration time, agent-supervision time: all expanding.

Engineers still type. Typing is just no longer what gets them paid.

3. The loop has a name now

On April 2, Martin Fowler's site published Harness engineering for coding agent users, by Birgitta Böckeler. The verbatim opening of the steering section reads:

"The human's job in this is to steer the agent by iterating on the harness."

The harness, per Böckeler: everything in an AI agent except the model itself. Guides (feedforward: your CLAUDE.md, your system prompt, your project rules). Sensors (feedback: your evals, your tests, your output checks). The human's job is to iterate both.

Three sources from three camps converged in 30 days on the same verb. Fowler's site is the principle. Cursor's research is the practice. Solo devs are the practice ground.

Steering, not typing.

The steering loop

The skill is one loop, repeated.

✦ Read

Read what the agent wrote in the last run. Not skim — read. This is the rule Simon Willison named on May 6, in his post on the line between vibe coding and agentic engineering. The verbatim:

"agentic engineering where you are a professional software engineer. You understand security and maintainability and operations and performance."

If you can't read the code your agent writes, you're not directing it. You're hoping.

Code-literacy is the floor under WHAT/WHY/HOW. It is not what WHAT/WHY/HOW replaces.

✦ Guide

Open harness/guides/CLAUDE.md (rename per agent: AGENTS.md, .cursorrules, whatever your stack expects). Add the one rule that would have prevented the worst thing the agent did last run. One rule per rep. Not a refactor of the file. One rule.

The guide compounds. After two reps, it's generic. After ten, it's stack-shaped. After thirty, it knows things about your repo you'd already forgotten.
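A minimal sketch of the one-rule-per-rep habit, in Python. The function name add_rule, the bullet format, and the "(added ...)" stamp are illustrative assumptions, not something your agent or any skill requires. It takes the guide's current text and returns the new text, and it refuses to add a rule the guide already holds.

```python
def add_rule(guide_text: str, rule: str, stamp: str) -> str:
    """Return guide_text with one new bullet appended, or unchanged if the rule exists."""
    rule = " ".join(rule.split())  # collapse stray whitespace and newlines
    if not rule:
        raise ValueError("a rep adds exactly one non-empty rule")
    if rule in guide_text:
        return guide_text  # already in the guide: nothing to learn twice
    # Start the bullet on its own line even if the file lacks a trailing newline.
    separator = "" if guide_text == "" or guide_text.endswith("\n") else "\n"
    return f"{guide_text}{separator}- {rule} (added {stamp})\n"
```

To use it on a real file, read harness/guides/CLAUDE.md into a string, pass it through add_rule, and write the result back. Keeping the function pure makes it easy to test without touching the repo.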

✦ Run

Hand the next ticket to the agent. Don't watch keystrokes. Watch the output.

✦ Sense

Pass the output through your sensors. The minimum sensor for a side project: did the agent ship code that does what the spec said? The richer sensor: does it pass the /pre-flight 4-question check we shipped on May 15 — URL works, user flow completes, promises resolve, idea hasn't drifted?

If the sensors flag something, the rule from this rep gets added to the guide. If the sensors are silent, the rep was useful — but the guide learned nothing. Append a note to the steering log anyway.
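The flag-or-silent decision above could look like this in Python. It is a sketch under assumptions: run_sensors and next_step are hypothetical names, and each sensor is any zero-argument callable you write yourself that returns True on a pass.

```python
from typing import Callable, Dict, List


def run_sensors(sensors: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every sensor and return the names that failed or raised."""
    flagged = []
    for name, check in sensors.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False  # a sensor that crashes counts as a flag
        if not passed:
            flagged.append(name)
    return flagged


def next_step(flagged: List[str]) -> str:
    """Map sensor output to what this rep does to the harness."""
    if flagged:
        return "add one rule to the guide, then log it"
    return "guide unchanged; append a note to the steering log"
```

A crashing sensor is treated as a flag on purpose: a check you cannot run is a check you cannot trust.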

✦ Log

Open harness/steering-log.md. One line per rep: date, what changed in the guide, what the sensor caught. Three months of these lines is the diff between a generic AI workflow and one that knows your repo.
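One way to keep each log line uniform, sketched in Python. The Markdown table row and the name log_line are assumptions for illustration. The three cells follow the sentence above: date, what changed in the guide, what the sensor caught.

```python
from datetime import date


def log_line(day: date, guide_change: str, sensor_caught: str = "nothing") -> str:
    """Format one rep as a single Markdown table row."""
    cells = [day.isoformat(), guide_change, sensor_caught]
    # Collapse newlines and swap pipes so one rep always stays on one row.
    cleaned = [" ".join(cell.split()).replace("|", "/") for cell in cells]
    return "| " + " | ".join(cleaned) + " |"
```

Append the returned string plus a newline to harness/steering-log.md at the end of each rep.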

A weekend rep, walked through

Saturday morning. Fresh idea. Open the repo.

You run /steering-loop in the empty directory. The skill asks four questions: which agent, which framework, your one named first user, and your kill condition. You answer in 90 seconds.

The skill scaffolds harness/. Inside: guides/CLAUDE.md (a starter rules file derived from your stack), sensors/eval.md (a checklist primed for /pre-flight), steering-log.md (empty, two columns).

First rep at 10am. The agent ships a Next.js scaffold with the wrong auth provider — you wanted magic-link, it picked OAuth.

You read the code. You update harness/guides/CLAUDE.md with one rule: "Default to magic-link auth via Resend. Use OAuth only when the user explicitly asks." You append one line to steering-log.md.

Second rep before noon. Same project, next ticket. The agent reads the updated guide. Magic-link auth ships correctly the first time. The wasted hour from the first rep just bought you the eighth Saturday. And the 24th. And the 60th.

That's the leverage. The first rep paid for the next sixty.

Yesterday: dragging n8n nodes, debugging connections, JSON typos at midnight. Counted the Saturday hours you've burned on it?

That's the real bill. It doesn't show up on your cloud invoice. It shows up as Sundays you didn't ship.

Autom8n skips it. Type the workflow in plain English. Get the n8n JSON in 5 minutes — with a guide so you actually understand what's running. The first one needs a tweak. The fifth saves you a Saturday.


The `/steering-loop` skill

We packaged the Read → Guide → Run → Sense → Log loop into a skill that scaffolds the discipline the moment you run it.

Install once. Pick whichever fits your stack:

# Cross-agent (Claude Code, Codex, Cursor, Gemini CLI, Copilot, +35 more)
npx skills add Ship-With-AI/skills --skill steering-loop

# Or Claude Code native (plugin marketplace, two commands)
claude plugin marketplace add Ship-With-AI/skills
/plugin install ship-with-ai@ship-with-ai-skills

Then, in your project directory:

/steering-loop

Four questions: which agent + which framework + the one named first user + your kill condition.

Then five things in one pass.

What the skill produces

✦ Artifact 1: the harness/ directory

A new top-level folder in your project root. Three children: guides/, sensors/, and steering-log.md. Each file ships with a one-paragraph header explaining what it's for, so you don't have to remember.

✦ Artifact 2: a starter guides/CLAUDE.md

Pre-populated with rules derived from your stack: your package.json, your README, your existing .cursorrules if any. Not generic. Tailored to what's already in your repo.

✦ Artifact 3: sensors/eval.md

A 5-checkbox checklist primed for the next /pre-flight run. Four checkboxes map to the four pre-flight questions. The fifth is the steering-log check: did anything change in the guide this rep?

✦ Artifact 4: steering-log.md

Two columns: date / what the guide learned. Empty when you start. After ten reps, this file is the only doc that knows what your harness has learned about your repo.

✦ Artifact 5: a RUN_REP.md cheat sheet

The Read → Guide → Run → Sense → Log loop, in one file at the root of your project. Five steps. One screen. The harness is the curriculum; the cheat sheet is the lesson plan.

The harness is the curriculum. Every rep is one lesson. The steering log is the transcript.
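If you want the same layout without the skill, a rough Python sketch follows. It is not the skill's implementation. The file paths come from the artifacts above; the header wording, the checkbox labels, and the names starter_files and scaffold are placeholders we made up.

```python
from pathlib import Path

# Four pre-flight questions plus the steering-log check (labels are placeholders).
CHECKS = (
    "URL works",
    "User flow completes",
    "Promises resolve",
    "Idea hasn't drifted",
    "Steering log records what changed in the guide",
)


def starter_files() -> dict:
    """Map each relative path to its starter content."""
    checklist = "".join(f"- [ ] {check}\n" for check in CHECKS)
    return {
        "harness/guides/CLAUDE.md": "Guides: rules the agent reads before it runs. Add one per rep.\n",
        "harness/sensors/eval.md": "Sensors: tick each box after a run.\n\n" + checklist,
        "harness/steering-log.md": "Steering log: one line per rep, newest last.\n",
        "RUN_REP.md": "Read → Guide → Run → Sense → Log\n",
    }


def scaffold(project_root: Path) -> list:
    """Create any missing starter files under project_root; never overwrite."""
    created = []
    for relative, content in starter_files().items():
        target = project_root / relative
        if target.exists():
            continue  # keep what earlier reps already taught the harness
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        created.append(relative)
    return created
```

Call scaffold(Path(".")) from the project root. Running it twice is safe: the second call creates nothing and leaves your edited files alone.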

Three mistakes we made before writing this

1. We built the harness per-project instead of portfolio-root. First side project: a scrappy CLAUDE.md evolved over eight weekends. Second side project: started from scratch. Half the rules from the first one were portable — none of them transferred because the guide lived inside the repo, not above it.

(Yes, six weeks of guide-iteration sat unused. Yes, we noticed too late. Yes, the harness/ directory now lives at our portfolio root, with per-project overrides.)

2. We trusted the agent's output without reading the diff. On a Saturday in late April, our agent shipped a database migration. The migration ran cleanly. The test suite passed. Sensors green. We never read the SQL. The agent had silently dropped a foreign key constraint to avoid a circular dependency. Three weekends later, an orphan row took down the dashboard.

(Yes, code-literacy is the floor. Yes, our sensor didn't check for dropped constraints. Yes, Willison's rule is now pinned in our guides.)

3. We skipped the steering log for "just one weekend." The log felt like overhead during a sprint. We figured we'd remember which rules came from which rep. By rep eleven, the CLAUDE.md had six contradictory entries, and nobody knew which had failed and which had succeeded.

(Yes, the log is two columns. Yes, six contradictory rules are what happen when you skip it. Yes, the log now gets one line before the agent runs, not after.)
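The sensor we were missing in mistake 2 can be very small. Here is a heuristic sketch in Python, assuming the migration is plain SQL text. The name dropped_constraints is hypothetical, the split on semicolons is naive, and a match inside a SQL comment will also be flagged. It does not replace reading the diff. It only tells you when you must.

```python
import re

# Statements that remove a constraint in common SQL dialects.
DROP_PATTERNS = (
    re.compile(r"\bDROP\s+CONSTRAINT\b", re.IGNORECASE),
    re.compile(r"\bDROP\s+FOREIGN\s+KEY\b", re.IGNORECASE),
)


def dropped_constraints(sql: str) -> list:
    """Return each statement in sql that drops a constraint, flattened to one line."""
    hits = []
    for statement in sql.split(";"):  # naive: ignores semicolons inside strings
        flat = " ".join(statement.split())
        if any(pattern.search(flat) for pattern in DROP_PATTERNS):
            hits.append(flat)
    return hits
```

Run it over every migration the agent writes. A non-empty result is a flag: read the SQL before you let the rep count as green.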

The reframe

The Note was right. The new skill is knowing WHAT, WHY, HOW. It just didn't name the loop.

The loop is steering. The loop has five moves. The loop ships in 30 minutes on a side project that nobody but you cares about.

That's the rep.

You don't learn this skill by reading thinkpieces. You don't learn it by watching a coding-with-Cursor screencast. You learn it the same way you learned every other engineering skill — by opening one repo, attempting one rep, writing down what changed, and trying again.

The day job will rewrite itself around this soon enough. The companies hiring "Forward Deployed Engineers" today have already started pricing it in. Your side project, the one you've been building on Saturdays with the AI agent that keeps almost shipping, is where you've already been practicing. Make it deliberate now.

If you're a developer reading this and you still think the job is typing, you're reading the moment wrong. Typing was the job for forty years. Steering will be the job for the next ten. The side project is where you learn it for free, on your own time, before someone pays for it.

Typing just got cheap. Steering just got expensive.

Pick the project.

Direct the agents.

Ship the weekend.

— Ale & Manuel

PS. Reply with your first steering-log.md after one weekend. Best three logs get featured in a follow-up edition on what readers learned about their own stacks in one rep.

When you're ready, two ways we can help:

1. AI Side-Project Clarity Scorecard. Find what's blocking you from shipping your first side project.

2. NoIdea. Pick a ready-to-start idea born from real user problems.

Ship With AI
