WORKFLOW AUTOMATION · LOCAL-RUNNABLE PROTOTYPES

Human-in-the-Loop Automation

Five runnable n8n workflows for operations teams — ticket triage, lead scoring, review response, customer onboarding, and an internal SOP bot. Each runs end to end on a local model and routes only the high-stakes cases to a person.

Role: Sole builder (design, prompts, n8n workflows, HIL gates)
Domain: Operations automation (hospitality-leaning)
Surface: 5 kits on one shared, runnable 5-node spine
Stack: n8n · local Ollama (phi4:14b) · JSON mode · rule-based gate
Principle: AI drafts; a deterministic gate escalates the risky cases
Status: Runnable + dogfood-verified on sample runs · production channels pending

[Figure: the shared workflow architecture. A trigger, input loading, a local LLM, and a rule-based human-in-the-loop gate that splits routine cases (auto-resolve) from high-stakes cases (human approval) before a structured result. Each of the five kits is shown with a verified sample input, the model's verdict, and which way the gate routed it.]
01

The Operational Problem

Operations teams run on inbound volume — support tickets, sales leads, public reviews, new-account onboarding, staff policy questions. Most of it is routine and could be handled in seconds. A small slice is genuinely high-stakes: a refund dispute, a safety incident, an enterprise contract, a one-star review naming a guest.

The naive fix — let AI answer everything — is unsafe, because a confident wrong answer lands exactly on that high-stakes slice. The opposite — make a person read everything — spends the scarce resource on cases that never needed it. These kits are built around that tension: absorb the routine volume automatically, and guarantee the high-stakes cases reach a human.

02

The Product Principle

One rule holds across all five: AI drafts, a human approves anything risky — never the reverse. And the escalation decision is not left to the model's discretion alone. Each workflow runs a two-layer gate. First the model returns its own requires_human_approval flag inside strict JSON. Then deterministic code re-checks the case against kit-specific rules and can force escalation regardless of what the model said.

The model is allowed to be cautious; it is never the only thing standing between an automated action and a guest, a dollar, or a safety event.
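
The two-layer gate can be sketched in a few lines. This is a minimal illustration, not the kits' actual Code-node source: the `Rule` signature, the function names, and the fail-closed handling of a missing flag are all assumptions; only the `requires_human_approval` field comes from the description above.

```python
from typing import Callable

# Hypothetical signature: (case text, parsed model JSON) -> should escalate?
Rule = Callable[[str, dict], bool]


def requires_human(case_text: str, model_json: dict, rule: Rule) -> bool:
    # Layer 1: the model's own flag. Anything other than an explicit False
    # (missing key, null, a string) counts as "escalate". That fail-closed
    # choice is an assumption of this sketch.
    model_flag = model_json.get("requires_human_approval") is not False
    # Layer 2: the deterministic rule can force escalation, never cancel it.
    return model_flag or rule(case_text, model_json)


def refund_rule(case_text: str, model_json: dict) -> bool:
    # Illustrative kit rule: escalate whenever the text mentions a refund.
    return "refund" in case_text.lower()


relaxed = {"requires_human_approval": False}
print(requires_human("Guest asks for a refund", relaxed, refund_rule))
print(requires_human("Where is the pool?", relaxed, refund_rule))
```

Because the two layers are combined with a logical OR, the rule can override a model that said "no approval needed", while a cautious model can still escalate a case the rule would have let through.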

03

How One Workflow Runs

Every kit is the same runnable spine — five n8n nodes, no cloud API keys, executable on a laptop:

  • Trigger — a sample case enters the workflow.
  • Load Inputs — the case is paired with the kit's system prompt.
  • Local LLM — Ollama (phi4:14b) returns strict JSON (JSON mode, temperature 0.2).
  • Parse + HIL Gate — the JSON is parsed, then the kit-specific rule decides auto-resolve vs. human approval.
  • Result — a structured object plus the human-approval flag.
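
For readers who want to reproduce the Local LLM and Parse steps outside n8n, the sketch below builds a request for Ollama's `/api/generate` endpoint with JSON mode and temperature 0.2, then parses the reply. The default local URL and the request fields are Ollama's; the function names and the fail-closed fallback on unparseable output are assumptions of this sketch, not the kits' actual node code.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(system_prompt: str, case_text: str) -> dict:
    """Request body for a single non-streamed, JSON-mode completion."""
    return {
        "model": "phi4:14b",
        "system": system_prompt,
        "prompt": case_text,
        "format": "json",              # ask Ollama to constrain output to JSON
        "stream": False,               # one response object instead of chunks
        "options": {"temperature": 0.2},
    }


def parse_model_reply(ollama_body: dict) -> dict:
    """Parse the JSON string in Ollama's "response" field."""
    try:
        parsed = json.loads(ollama_body["response"])
    except (KeyError, TypeError, json.JSONDecodeError):
        parsed = None
    if not isinstance(parsed, dict):
        # Assumption: unusable model output is escalated rather than dropped.
        return {"requires_human_approval": True, "parse_error": True}
    return parsed


def call_local_llm(system_prompt: str, case_text: str, timeout: int = 120) -> dict:
    """Send one case to the local model (needs a running Ollama instance)."""
    data = json.dumps(build_request(system_prompt, case_text)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_model_reply(json.load(resp))
```

Only `call_local_llm` touches the network, so the payload builder and the parser can be checked without a model loaded.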

Each of the five kits also ships with a design-exact production graph that swaps the local model for real channels and models; it is covered under From Prototype to Production.

04

The Five Kits

Five kits, one spine. Each classifies, scores, or drafts in its domain and carries its own escalation rule. Every example below is a verified sample run from the local build.

Hospitality Ticket Triage

Route: Human
Decides: category, priority (low → urgent), department, guest-facing draft reply
Escalates when: priority is high/urgent, or the text mentions refund, chargeback, passport, safety, injury, or legal
Verified run: “$2,400 chargeback” → priority: urgent → Human
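
As an illustration, the triage escalation rule could be written as a small deterministic check. The term list and the high/urgent condition come from the card above; the `priority` field name, the function name, and the matching strategy (case-insensitive, anchored at the start of a word) are assumptions.

```python
import re

ESCALATION_TERMS = ("refund", "chargeback", "passport", "safety", "injury", "legal")
# Match a term at the start of a word, so "refunds" matches but "illegal" does not.
TERM_PATTERN = re.compile(r"\b(" + "|".join(ESCALATION_TERMS) + r")", re.IGNORECASE)


def ticket_needs_human(ticket_text: str, model_json: dict) -> bool:
    # "priority" is a hypothetical field name for the model's priority verdict.
    priority = str(model_json.get("priority", "")).lower()
    return priority in {"high", "urgent"} or bool(TERM_PATTERN.search(ticket_text))


print(ticket_needs_human("I am disputing a $2,400 chargeback", {"priority": "low"}))
print(ticket_needs_human("What time is breakfast?", {"priority": "low"}))
```

The first call escalates on the keyword alone, even though the hypothetical model verdict was only "low".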

Lead Qualification + Scoring

Route: Human
Decides: ICP fit score (0–100), tier A/B/C, intent, next-best action, reply
Escalates when: score ≥ 80, tier A, enterprise/security/legal terms, or a borderline lead with hot intent
Verified run: “VP, multi-property, budget approved” → fit score 85 → Human
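
The lead rule mixes a numeric threshold with categorical checks, which a short sketch makes concrete. The score ≥ 80, tier A, and term conditions are taken from the card; the field names (`fit_score`, `tier`, `intent`), the 60 to 79 "borderline" band, and the fail-closed handling of an unusable score are assumptions, since the source does not define them.

```python
RISK_TERMS = ("enterprise", "security", "legal")
BORDERLINE_MIN = 60  # assumption: the source does not say where "borderline" starts


def lead_needs_human(lead_text: str, model_json: dict) -> bool:
    score = model_json.get("fit_score")  # hypothetical field name
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return True  # assumption: an unusable score is escalated, not guessed
    tier = str(model_json.get("tier", "")).upper()
    intent = str(model_json.get("intent", "")).lower()
    text = lead_text.lower()
    return (
        score >= 80
        or tier == "A"
        or any(term in text for term in RISK_TERMS)
        or (BORDERLINE_MIN <= score < 80 and intent == "hot")
    )


print(lead_needs_human("VP, multi-property, budget approved",
                       {"fit_score": 85, "tier": "A", "intent": "hot"}))
print(lead_needs_human("Small cafe, just browsing",
                       {"fit_score": 40, "tier": "C", "intent": "cold"}))
```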

Review Response Agent

Route: Auto
Decides: sentiment, category, severity, brand-voice public response
Escalates when: negative sentiment or high severity
Verified run: “5-star review” → positive → Auto

Customer Onboarding Engine

Route: Auto
Decides: segment, personalization plan, first steps, risk flags, welcome
Escalates when: enterprise/high-value, SSO, security/legal, money-touching, or missing consent
Verified run: “SMB, non-technical” → routine → Auto

Internal Ops SOP Bot

Route: Human
Decides: answer, cited SOP references, confidence, reply (grounded RAG)
Escalates when: low confidence, or the question touches a spill, safety, HR, refund, or waiver
Verified run: “Refund policy?” → cites P1 → Human
05

From Prototype to Production

Each runnable kit comes with a design-exact production scaffold: an importable n8n graph that has not yet been run with live data. The scaffolds share a common shape:

  • Multi-channel intake — webhook, email/IMAP, schedule, or Telegram.
  • Normalize — dedup, canonical schema, and consent / lawful-basis checks.
  • Cheap classifier → premium drafter — cost-tiered models, with citation-enforced retrieval (RAG) where there is a knowledge base.
  • Force-HIL validation — the deterministic gate, in code.
  • Mandatory approval — a send-and-wait step in Slack or email, with an approve / redraft / escalate router; execution only fires after approval.
  • Audit + recovery — an immutable log with cost tracking, a dedicated error sub-workflow, and feedback capture for continuous improvement.
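
The approve / redraft / escalate router in the mandatory-approval step can be pictured as a small mapping in which only an explicit approval reaches execution. This is a sketch under assumptions: the step names on the right-hand side are hypothetical labels, and treating an unrecognized reply as an escalation is a design choice of this example, not something the scaffolds are confirmed to do.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REDRAFT = "redraft"
    ESCALATE = "escalate"


# Hypothetical names for the branch each decision leads to.
NEXT_STEP = {
    Decision.APPROVE: "execute",
    Decision.REDRAFT: "return_to_drafter",
    Decision.ESCALATE: "notify_senior_reviewer",
}


def route(reviewer_reply) -> str:
    """Map a reviewer's reply to the next step; only 'approve' executes."""
    try:
        decision = Decision(reviewer_reply.strip().lower())
    except (ValueError, AttributeError):
        # Assumption: anything unrecognized (typo, empty, None) is escalated.
        decision = Decision.ESCALATE
    return NEXT_STEP[decision]


print(route("Approve"))
print(route("redraft"))
print(route("looks fine I guess"))
```

Keeping "execute" reachable from exactly one branch mirrors the requirement that execution only fires after approval.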

Productionizing a kit means swapping the local Ollama node for the kit's real channels and model, adding credentials, and feeding it sanitized data.

Scope & Honesty

  • The five runnable kits execute end to end on local Ollama with no cloud API keys, and were verified on built-in sample inputs (2026-06-02, gate retest 2026-06-19).
  • The production graphs are design-exact and importable, but have not yet been run with real channels or data.
  • No client metrics or ROI claims — this is sample-set proof of the classify/score/draft plus human-gate logic.
  • Sample inputs are synthetic and the embedded SOP policies are illustrative.