Role, Operations

Claude for
operations

Runbooks, SOPs, incident retrospectives, vendor evaluations, and the thankless documentation that keeps the business running.

Operations is where Claude has the quietest but possibly largest impact across Abingdon. Almost every ops task is either writing a document that explains something, writing a document that captures something, or turning a pile of data into a recommendation. All three are Claude's happy place. The pages never win awards, but the time saved is enormous.


Three places to start

Three workflows that collectively cover most of what an ops function does in a given week.

1. Drafting an SOP from how someone actually does the work

The best SOPs are written from the expert's real mental model, not from a template. Interview the expert, paste your notes, paste an example SOP as a structure reference, and let Claude draft. It will use the expert's language where possible and flag the steps where it needs clarification. What you get back is 80% of the way there.

SOP drafting prompt, annotated

Four elements, applied to turning an expert's head into something the rest of the team can follow.

YouI interviewed Priya about how she handles the monthly close reconciliation. Here are my notes from the interview, pasted below. They are quite rambling because I was trying to capture her exact words. 1 Here is an example SOP from our team for a similar process, as a structure and voice reference, pasted below. 2 Draft a proper SOP matching the reference structure. Purpose, who does this and when, prerequisites, numbered steps with a pass/fail check for each, common things that go wrong, and escalation paths. Use Priya's exact phrasing where possible, especially for the tricky steps. 3 For any step where her notes are unclear or she uses shorthand I cannot interpret, leave a specific question in place of the step rather than guessing. I will go back to her with your list of questions. 4
1
Context. Who the expert is, what the process is, and that the notes are raw.
2
Example. An existing SOP as the voice and structure reference.
3
The ask. Specific sections and an instruction to preserve the expert's actual language.
4
Escape hatch. Claude returns a list of questions rather than inventing steps. This is the whole trick.

Where this falls over

If the expert's notes are incomplete, Claude will fill in plausible-sounding steps that are subtly wrong. Always review every step with the original expert before the SOP goes into circulation.

2. Incident retrospectives without blame

Paste the timeline, paste the chat and ticket history, and ask Claude for a retro covering impact, narrative, what went well, what went badly, root causes, and three specific actions. The key instruction is "no blame, stick to what the evidence supports, flag anything you're inferring". The output is a retro you can share within an hour of the incident closing.

Two ways to ask for an incident retro

The weak version lands at "someone should have caught this". The strong version lands at three specific actions with owners.

Youwrite a retro for yesterday's incident
What Claude sends back

A plausible-sounding retro that invents a timeline Claude doesn't actually have, picks one root cause (probably "alert fatigue"), and recommends generic actions like "improve monitoring". Names one person as having missed a warning. Unusable in the actual retro meeting.

YouHere is the timeline of yesterday's payment processing incident, pasted below. Here is the #incidents Slack history from 14:00 to 18:00, pasted below. Here is the on-call ticket log, pasted below. Produce a retro covering: impact and duration in one line, what happened in a factual narrative, what went well, what went badly, contributing causes (plural, do not stop at the first plausible one), and three specific actions with suggested owners. No blame. Stick to what the evidence supports. Name roles, not individuals, even when a specific person is named in the source. If you are inferring something rather than reading it directly from the evidence, flag it.
What Claude sends back
Payment processing incident, Wed 14:06 to 17:42
Impact: 38 minutes of failed checkouts, 412 affected transactions, all recovered.

A factual narrative built only from the timestamps in the timeline, with "what went well" (monitoring alerted within 90 seconds, rollback was tested and ready) and "what went badly" (the alert went to the wrong rotation for 12 minutes, the runbook linked to a stale dashboard).

Contributing causes (I found three, not one):

  1. A recent change moved the on-call rotation but the alert routing did not follow.
  2. The runbook's dashboard link was not updated after last quarter's Grafana migration.
  3. The retry logic for the downstream payment provider swallows 429s silently, which delayed the first human looking at the issue.
Two things I am inferring rather than reading directly: cause 3 about the retry logic is based on the ticket mentioning a delayed 429 response, but I did not see the actual code. Worth verifying. And the 12-minute alert delay is calculated from timestamps in the Slack history, not from a routing log.

Where this falls over

Claude will default to single-root-cause thinking if you let it. Incidents almost always have multiple contributing causes. Ask it explicitly to avoid stopping at the first plausible cause.

3. Vendor evaluation matrices

You're comparing three vendors. You have notes on each, you have weighted criteria. Claude will score each vendor against each criterion, mark anything where the data is thin as "need more info" rather than scoring it, produce a recommendation with the reasoning, and list the remaining questions you still need to answer. This is exactly the kind of structured thinking that takes humans half a day and Claude ninety seconds.

Scoring three vendors, as a conversation

Notice Claude pushing back when it does not have enough to score.

YouI am evaluating three vendors for our new endpoint monitoring tool. Notes on each pasted below. Here are the weighted criteria: price (20%), integration effort (25%), alert quality (30%), vendor stability (15%), reporting (10%). Score each vendor against each criterion and recommend one.
ClaudeBefore I score, a quick check: for "vendor stability", my notes on Vendor B are thin. Two sentences about company age and one about funding. I can guess, but the guess would be doing most of the work. Do you want me to score it with a low-confidence note, or move vendor stability to a "need more info" column and ask you to supply what you have?
YouMove it to "need more info". Same for anything else where the data is similarly thin.
ClaudeHere is the matrix. Vendor A scored on all five criteria. Vendor B scored on four, with vendor stability flagged. Vendor C scored on four, with reporting flagged because your notes only mention the dashboard, not the export options. Recommendation at the bottom with the reasoning, and a list of the five specific questions you would need to answer before the matrix is complete enough to defend in a meeting.

Where this falls over

Claude will score criteria it has no data on if you don't tell it not to. The "need more info" category must be in the prompt.


Worked examples

Before and after, on tasks from operations work.


Prompts to steal

Copy any of these, fill in the square brackets, run.


Try this week

Pick one. All of these are under an hour.


Operations documentation tends to be the first place in a business that gets out of date. Claude lowers the cost of keeping it fresh enough that "we should really update that runbook" stops being a todo and becomes a Tuesday afternoon task.

If you want to go broader, the Beginner guide covers the habits that apply everywhere. The prompt library has the full set of role-tagged prompts in one place.