03 Operate

Stay shipping.

Production AI doesn't stay sharp on its own. Models drift, costs creep, edge cases pile up, and your team has a roadmap for the thing they actually own. We embed as a fractional engineering partner (new features, monitoring, evals, cost optimization) at a fraction of the cost of a senior in-house hire. 30-day exit clause. No retainers without value.

30 days
Exit clause
$8K/mo+
Starting retainer
Slack
First cadence
Weekly
Working demos
— What you get

A senior team, fractionally embedded.

Operate isn't "we'll fix it if it breaks." It's a working partnership with a continuous shipping cadence: your AI gets sharper every month, not just maintained. The same engineers who built it stay at the wheel.

Continuous feature shipping

New features land every 1–2 weeks, not every quarter. Your AI improves visibly month over month — that's the whole point of staying with us.

Weekly demos

Monitoring + on-call

Model latency, cost, error rate, drift signals — wired to alerts. We respond to incidents during business hours; out-of-hours on-call available on Heavy retainers.

Datadog · Langfuse · PagerDuty

Eval suite maintenance

The eval suite that gates every prompt change, model swap, and feature ship. We grow it as the system grows. New edge cases get tests before they get fixes.

Braintrust + golden sets
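
As a rough illustration of what "gating" a change on a golden set can mean, here is a minimal sketch in plain Python. It is not our production tooling and does not use the Braintrust API; `GoldenCase`, `pass_rate`, `gate`, and the substring check are all hypothetical names and simplifications chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GoldenCase:
    """One hand-checked example: a prompt and text the answer must contain."""
    prompt: str
    expected_substring: str


def pass_rate(model_fn: Callable[[str], str], cases: List[GoldenCase]) -> float:
    """Fraction of golden cases whose answer contains the expected text."""
    if not cases:
        raise ValueError("golden set is empty")
    passed = sum(
        1 for c in cases
        if c.expected_substring.lower() in model_fn(c.prompt).lower()
    )
    return passed / len(cases)


def gate(
    candidate_fn: Callable[[str], str],
    baseline_fn: Callable[[str], str],
    cases: List[GoldenCase],
    max_regression: float = 0.0,
) -> Tuple[bool, float, float]:
    """Allow a change only if the candidate scores no worse than the
    baseline minus an allowed regression margin (assumed to be 0 here)."""
    base = pass_rate(baseline_fn, cases)
    cand = pass_rate(candidate_fn, cases)
    return cand >= base - max_regression, base, cand
```

In this sketch, a new edge case becomes a new `GoldenCase` first, so the fix that follows has something to be measured against.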

Cost & latency optimization

LLM bills sneak up. We tune prompts, route to cheaper models when quality holds, cache aggressively, and report savings monthly. Most clients see 30–60% cost reductions in the first quarter.

Monthly savings reports
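
For readers who want a picture of "route to cheaper models when quality holds, cache aggressively," the sketch below shows one simple shape that idea can take. It is an assumption-laden toy, not a description of any client system: `CachedRouter`, the `cheap` and `strong` callables, and the `is_good_enough` check are hypothetical stand-ins for real model calls and a real quality check.

```python
from typing import Callable, Dict


class CachedRouter:
    """Try the cache, then a cheap model, and escalate to a stronger
    model only when the cheap answer fails a quality check."""

    def __init__(
        self,
        cheap: Callable[[str], str],
        strong: Callable[[str], str],
        is_good_enough: Callable[[str, str], bool],
    ) -> None:
        self.cheap = cheap
        self.strong = strong
        self.is_good_enough = is_good_enough
        self.cache: Dict[str, str] = {}
        # Counters a savings report could be built from.
        self.stats: Dict[str, int] = {"cache": 0, "cheap": 0, "strong": 0}

    def answer(self, prompt: str) -> str:
        # Exact-match cache: repeated prompts cost nothing.
        if prompt in self.cache:
            self.stats["cache"] += 1
            return self.cache[prompt]

        draft = self.cheap(prompt)
        if self.is_good_enough(prompt, draft):
            self.stats["cheap"] += 1
            result = draft
        else:
            # Quality did not hold, so pay for the stronger model.
            self.stats["strong"] += 1
            result = self.strong(prompt)

        self.cache[prompt] = result
        return result
```

The counters in `stats` are the kind of raw numbers a monthly savings report would start from: how many requests were served from cache, by the cheap model, or by the expensive one.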

Slack-first cadence

We live in your Slack. No weekly status calls. No Jira ceremony. Async demos, async standups, async reviews. Synchronous time is reserved for the things that actually need it.

Async-first

Eventual handoff prep

If the goal is in-housing this in 6–12 months, we work toward that. Documentation, runbooks, and pair programming with your hires. We exit cleanly, not gradually.

Optional path
— How a typical week runs

A repeatable rhythm. Demos every Friday.

No surprise sprints. No "where's the team this week?" The same beat every week so you always know what we're shipping and why.

Mon
Roadmap
15-min async planning thread. We post the week's priorities in Slack with rationale; you react with thumbs or push back.
Tue
Build
Heads-down build. PRs land continuously. Comment on Slack threads or PRs whenever you want — you're not blocking us; we're not blocking you.
Wed
Build + review
Mid-week pulse: brief status note in Slack with what shipped and what's at risk. You can steer before Friday's demo.
Thu
Polish + evals
Edge case grinding. Eval runs. Cost reports. Whatever's about to ship gets pressure-tested against the regression suite.
Fri
Demo + ship
Async Loom demo of the week's work. Production deploys for everything green. Monthly retro on the last Friday of each month.
— What it costs

Three retainer sizes. Month-to-month.

Pick the capacity that matches your shipping pace. Scale up, scale down, or pause month-to-month with 30 days' notice. We'd rather lose a retainer than keep one we're not earning.

Light
from $8K/mo

~20 hrs/wk · monitoring & small features. Best for a stable system that needs a senior pair of hands a few days a week.

Heavy
from $35K/mo

~80 hrs/wk · dedicated 2-person team · out-of-hours on-call · architecture leadership. Best for mission-critical AI you can't afford to let drift.

30-day exit clause. No questions.
Scale capacity up or down between months
Pause for 1 month/year without losing the slot
— Recent Operate engagements

What "stay shipping" looks like.

Two recent retainers, anonymized. Both began as Build engagements with us and rolled into Operate at handoff.

Legal · 24-attorney firm · Standard retainer

Cut LLM costs 47% in 90 days while shipping 4 new features.

Post-launch optimization routed 80% of routine queries to a smaller, cheaper model with no quality loss, added prompt caching, and shipped 4 new attorney-requested features (clause comparison, jurisdiction filter, exhibit linker, deal-archive search).

−47%
Monthly LLM cost
4 new
Features in 90 days
Insurance · Commercial broker · Heavy retainer

3 new carriers integrated, 2× more quote types, zero downtime.

Quote engine expanded from 1 line of business to 3, integrated 3 additional carrier APIs, added underwriter-assist for high-touch deals. 99.97% uptime over 6 months on the retainer. On-call rotation handled 14 incidents — zero customer-visible.

99.97%
Uptime · 6 months
Quote-type coverage
— Common questions

Things people actually ask.

Do we have to do Build with you first?
Most Operate clients come from a Build engagement, but no — we'll take over an AI system someone else built if it's in good shape. We start with a 1–2 week audit (priced separately) so we know exactly what we're inheriting before signing a retainer. About 15% of audits end with "you'd be better off rebuilding" — we say so when that's true.
How fast can we cancel if it stops being valuable?
30 days' notice. No penalty, no termination fee, no clawback. We hand back everything (code, runbooks, eval suites, prompts) in clean shape on the way out. We'd rather you leave when it stops being valuable than stay because you're locked in.
What if we want to in-house this eventually?
We'll help. Roughly 25% of Operate engagements end with handoff to an in-house hire. We pair-program with your engineer, write transition docs, run knowledge sessions, and stay available for questions for 30 days after the formal end. We'd rather you have a working in-house team than a forever-retainer that's lost its purpose.
Can capacity scale up or down month-to-month?
Yes. Move between Light → Standard → Heavy with 30 days' notice. Need a temporary surge for a launch? We can spin up extra capacity for a single month. Need to pause during a quiet quarter? Once a year, you can pause for one month without losing your slot or paying onboarding costs again when you resume.
Do you provide an on-call SLA?
Light and Standard cover business-hours response (9–6 CT, <2 hr response on incidents). Heavy includes an out-of-hours on-call rotation with PagerDuty integration and a 30-min response SLA on Sev-1 incidents. We don't pretend to offer 24/7 white-glove support on an $8K retainer; pricing reflects coverage.
Do you take on AI systems other agencies built?
Sometimes. Always after a paid 1–2 week audit so we know what we're signing up for. If the codebase is healthy and the architecture is salvageable, we'll happily take it on. If it's not, we'll tell you what it'd cost to rebuild — and recommend who could do it (us or someone else).

Want to stay shipping?

Book a 30-min call. We'll talk about what's live, what's drifting, and what'd actually improve if you had a senior team on it part-time.

No pitch deck. No sales pressure. Just a real conversation.