Inngest Idempotency Patterns for the Support-Ticket-to-PR Pipeline

In Watari's pipeline, idempotency is not a defensive nice-to-have — it is the structural property that makes the whole system safe to retry. A single Inngest event replay touching the five-stage flow from a Zendesk or Intercom ticket to a GitHub draft PR can re-fire the extraction model, double-charge the billing meter, open a duplicate pull request, and publish two copies of the same RCA unless every step.run body is hardened against non-determinism and every side effect is guarded by a conditional write.

Why Inngest Replays Are Dangerous in an AI Billing Pipeline

Inngest retries a failed step by re-executing its body from scratch. In a typical CRUD service that risk is low, because an UPDATE ... WHERE id = $1 is naturally idempotent. In Watari's pipeline, three properties make retries structurally dangerous:

Non-deterministic steps. Calling an extraction model or a reranking step twice on the same ticket does not guarantee the same output. A second extraction might produce a slightly different confidence score — enough to cross the 0.7 threshold when the first attempt did not, or vice versa.
A billing meter with real money attached. The Mapped Bug meter fires when bugs.ai_confidence ≥ 0.7 AND any bugs.code_locations[].confidence ≥ 0.7. An unguarded retry that re-qualifies the same bug increments the usage meter twice and sends the customer a bill for work Watari already did.
External API calls. The GitHub API, Zendesk webhook write-back, and Slack routing calls are not idempotent by nature. Calling them twice creates two PRs, two Slack messages, two RCA comments.

The engineering contract Watari follows is explicit: a step.run body must never call Date.now(), Math.random(), or crypto.randomUUID(). Any value that must be unique or time-stamped is either derived from stable inputs (ticket ID, bug ID, repository commit SHA) or read from the database, where a prior, completed step wrote it.
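One way to derive a retry-stable identifier from stable inputs is to hash them. The sketch below is illustrative only: deriveStableKey, the StableInputs shape, and the sample IDs are hypothetical names, not part of Watari's codebase or Inngest's API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: the stable inputs the text names (ticket ID, bug ID, commit SHA).
interface StableInputs {
  ticketId: string;
  bugId: string;
  commitSha: string;
}

// Derive a deterministic key: the same inputs always give the same key,
// so a retried step body computes the identifier it computed the first time.
function deriveStableKey(inputs: StableInputs): string {
  // JSON-encode an array so field boundaries stay unambiguous ("a"+"bc" vs "ab"+"c").
  const canonical = JSON.stringify([inputs.ticketId, inputs.bugId, inputs.commitSha]);
  return createHash("sha256").update(canonical).digest("hex").slice(0, 32);
}

const first = deriveStableKey({ ticketId: "zd-1042", bugId: "bug-77", commitSha: "9f2c1ab" });
const retry = deriveStableKey({ ticketId: "zd-1042", bugId: "bug-77", commitSha: "9f2c1ab" });
console.log(first === retry); // true: a retry reproduces the same key
```

Unlike crypto.randomUUID(), the key depends on nothing that changes between attempts, so it can safely serve as the conflict key for a guarded write.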

The Five Idempotency Gates, in Pipeline Order

The Watari pipeline has five stages: inbound webhook → ticket-to-bug extraction → bug-to-code mapping → code-to-PR generation → bug-to-RCA publish. Each stage has a distinct idempotency primitive. Getting one wrong lets a replay propagate unchecked into every stage downstream.

Gate 1 — Webhook Deduplication Before Inngest Sees Anything

Webhook deduplication happens before any Inngest event is emitted. When a provider event arrives — a Zendesk ticket created, an Intercom conversation updated — the handler first attempts an INSERT into a webhook_events table keyed on (provider, event_id):

INSERT INTO public.webhook_events (provider, event_id, received_at)
VALUES ($1, $2, now())
ON CONFLICT (provider, event_id) DO NOTHING;

The handler checks the affected row count. If zero rows were inserted, this is a replay or a provider retry of an already-processed delivery. The handler returns 200 OK immediately, which tells the provider the delivery was received, and sends nothing to Inngest. No Inngest event means no downstream work to deduplicate.

This is the correct layer for webhook dedup. By the time an Inngest step runs, Inngest has already accepted the event, so the step would have to be idempotent itself. You would also still risk a race: two concurrent deliveries of the same event could both pass the step's guard before either has written to the dedup table.

Zendesk and Intercom both include a stable event identifier in their webhook payloads. Watari verifies every inbound webhook's HMAC signature before touching the dedup table, so the webhook_events record is only written for legitimately signed deliveries.
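The ordering described above (verify the signature, claim the dedup row, then emit) can be sketched as follows. This is a simplified model, not Watari's handler: the Set stands in for the webhook_events table, the emitted array stands in for sending an Inngest event, and the hex-encoded HMAC-SHA256 signature is an assumption, since each provider defines its own signing scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// In-memory stand-in for webhook_events and its (provider, event_id) unique key.
const webhookEvents = new Set<string>();
// Stand-in for the events that would be sent to Inngest.
const emitted: string[] = [];

// Returns true only when the row was newly inserted, mirroring
// INSERT ... ON CONFLICT DO NOTHING reporting one affected row.
function insertWebhookEvent(provider: string, eventId: string): boolean {
  const key = JSON.stringify([provider, eventId]);
  if (webhookEvents.has(key)) return false;
  webhookEvents.add(key);
  return true;
}

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
function signatureIsValid(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

function handleWebhook(
  provider: string,
  eventId: string,
  rawBody: string,
  signatureHex: string,
  secret: string,
): number {
  if (!signatureIsValid(secret, rawBody, signatureHex)) return 401; // dedup table untouched
  if (!insertWebhookEvent(provider, eventId)) return 200; // replay: acknowledge, emit nothing
  emitted.push(eventId); // first delivery: hand off to Inngest
  return 200;
}

const secret = "test-secret";
const body = JSON.stringify({ ticket: 1042 });
const signature = createHmac("sha256", secret).update(body).digest("hex");
handleWebhook("zendesk", "evt-1", body, signature, secret);
handleWebhook("zendesk", "evt-1", body, signature, secret); // provider retry
console.log(emitted.length); // 1
```

Both deliveries receive 200, but only the first produces an event, which is the behavior the row-count check is meant to guarantee.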

Gate 2 — Extraction Step: Check Before You Call

The extraction step turns a free-form ticket into a structured bug record with severity, repro steps, expected vs. actual behavior, and a confidence score. See ticket to bug for the full field set.

A retry of the Inngest function that runs extraction must not call the extraction model a second time if a qualifying record already exists. The guard pattern:

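const existing = await step.run('check-existing-extraction', async () => {
  // Assumes a Supabase-style client, whose queries resolve to { data, error }.
  const { data, error } = await db
    .from('bugs')
    .select('id, ai_confidence, extraction_status')
    .eq('ticket_id', ticketId)
    .maybeSingle();
  if (error) throw error; // surface the failure so Inngest retries the step
  return data; // null when no bug row exists for this ticket yet
});

if (existing?.extraction_status === 'complete' && existing.ai_confidence >= 0.7) {
  return existing; // short-circuit: no model call
}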

The critical discipline is that the guard is itself a step.run call. If you put the guard in the outer function body (outside any step), Inngest re-executes it on every retry and never memoizes the result. Wrapping it in a step means Inngest caches the output after the first successful execution and returns the cached value on all subsequent retries, making the check free after the first run.
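The difference between code inside and outside a step can be seen in a toy model of memoization. ToyStepRunner below is a deliberately simplified stand-in written for this illustration; it is not Inngest's implementation or API, and the step names and counters are hypothetical.

```typescript
// Toy model of step memoization, NOT Inngest's implementation.
class ToyStepRunner {
  private memo = new Map<string, unknown>();

  run<T>(stepId: string, body: () => T): T {
    // A step that completed on an earlier attempt replays its cached output.
    if (this.memo.has(stepId)) return this.memo.get(stepId) as T;
    const result = body(); // a throw here leaves no memo entry, so the step re-runs
    this.memo.set(stepId, result);
    return result;
  }
}

let guardQueries = 0;
let outsideQueries = 0;
let attempts = 0;

function extractionFunction(step: ToyStepRunner): string {
  attempts += 1;
  outsideQueries += 1; // code outside any step runs on every attempt
  const existing = step.run("check-existing-extraction", () => {
    guardQueries += 1; // stands in for the database lookup
    return null as string | null;
  });
  if (existing !== null) return existing;
  return step.run("call-extraction-model", () => {
    if (attempts === 1) throw new Error("transient model failure");
    return "bug-77";
  });
}

const runner = new ToyStepRunner();
try {
  extractionFunction(runner); // attempt 1 fails in the model step
} catch {
  // the platform would schedule a retry here
}
const result = extractionFunction(runner); // attempt 2 reuses the memoized guard
console.log(result, guardQueries, outsideQueries); // bug-77 1 2
```

The guard's lookup runs once across both attempts, while the counter outside any step increments on each attempt.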

Gate 3 — The Single-Source Billing Meter

The Mapped Bug billing meter fires from exactly one place in the codebase: the qualify-mapped-bug Inngest function. No other function touches the meter-increment RPC. This is enforced by code review and by an internal auditor pattern — any call site outside qualify-mapped-bug is a build-blocking violation.

But single-source routing is not enough on its own. The qualify-mapped-bug function itself can be retried, which means the meter guard must live at the database layer, not in application code.

The implementation uses two interlocked mechanisms:

is_billable_mapped_bug flag. The bugs table carries a boolean is_billable_mapped_bug column. The qualify-mapped-bug step attempts a conditional UPDATE:

UPDATE public.bugs
SET is_billable_mapped_bug = true, billed_at = now()
WHERE id = $1
  AND is_billable_mapped_bug IS NOT TRUE
RETURNING id;

If the UPDATE returns zero rows — the flag is already set — the step skips the meter-increment RPC entirely. A retry cannot increment the meter twice because the first successful run already set the flag, and the conditional UPDATE on the flag acts as a distributed lock.

The increment itself is a single atomic statement. The meter-increment RPC is a SECURITY DEFINER function whose body is one UPDATE on the organization's usage counter — atomic at the database level. The idempotency lock is the conditional flag UPDATE above, not the RPC: the increment step runs only after the flag claim succeeds, and a replay that finds the flag already set never reaches the increment at all.
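The claim-then-increment order can be modeled in a few lines. This sketch is an in-memory approximation under stated assumptions: the Map stands in for the bugs table, usageCounter for the organization's usage counter, and claimBillableFlag and qualifyMappedBug are hypothetical helper names. It also collapses the flag claim and the increment into one function for brevity, whereas the text describes them as separate steps.

```typescript
interface BugRow {
  id: string;
  isBillableMappedBug: boolean;
}

// In-memory stand-ins for the bugs table and the usage counter.
const bugs = new Map<string, BugRow>([
  ["bug-77", { id: "bug-77", isBillableMappedBug: false }],
]);
let usageCounter = 0;

// Models UPDATE ... WHERE is_billable_mapped_bug IS NOT TRUE RETURNING id:
// returns the id when this call flipped the flag, otherwise null (zero rows).
function claimBillableFlag(bugId: string): string | null {
  const row = bugs.get(bugId);
  if (!row || row.isBillableMappedBug) return null;
  row.isBillableMappedBug = true;
  return row.id;
}

function qualifyMappedBug(bugId: string): "billed" | "skipped" {
  if (claimBillableFlag(bugId) === null) return "skipped"; // flag already set: no increment
  usageCounter += 1; // stands in for the meter-increment RPC
  return "billed";
}

console.log(qualifyMappedBug("bug-77"), qualifyMappedBug("bug-77"), usageCounter);
// billed skipped 1
```

In Postgres the row lock taken by the UPDATE is what makes the claim safe under concurrency; the single-threaded model here only shows the control flow.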

Review the Mapped Bug meter and billing meter docs for the full qualification criteria. Draft PR generation, RCA publish, and issue-tracker sync are bundled and never metered — so the billing gate is the only place that matters.

Gate 4 — Fan-Out PR Orchestration

Once a bug qualifies and code locations are confirmed, the code-to-PR stage fans out: a parent orchestrator function dispatches one generate-PR child function per qualifying bug. This is where idempotency gets structurally harder, because the parent's idempotency does not confer idempotency on its children.

If the orchestrator step that fans out is retried after the children have already been dispatched, Inngest may send duplicate events to the child functions. Each child must therefore carry its own idempotency gate:

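const existingPr = await step.run('check-existing-pr', async () => {
  // Assumes a Supabase-style client, whose queries resolve to { data, error }.
  const { data, error } = await db
    .from('bugs')
    .select('draft_pr_url, draft_pr_status')
    .eq('id', bugId)
    .maybeSingle();
  if (error) throw error; // surface the failure so Inngest retries the step
  return data;
});

if (existingPr?.draft_pr_url) {
  // PR already opened: skip the GitHub API call
  return { prUrl: existingPr.draft_pr_url, skipped: true };
}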

The child can also use Inngest's eventId as the PR's internal reference, instead of crypto.randomUUID(). The eventId is stable across retries (Inngest assigns it at event-send time and it does not change on retry), which means the PR reference is deterministic. That matters in the one window the draft_pr_url guard cannot cover: if the GitHub API call completes but the step crashes before writing draft_pr_url back to the database, the retry finds no URL and would otherwise open a second PR. Because the reference is deterministic, the retry can look for a PR that already carries it, for example one opened from a branch named after the reference, and record that PR's URL instead of creating another.

A trickier case: what if the GitHub API call and the database write both complete, but the Inngest step's return value is lost before Inngest marks the step done? In that scenario the step body re-runs. The check-existing-pr guard at the top of the child catches the existing draft_pr_url and returns early. The GitHub API is never called twice for the same bug.
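A deterministic reference gives the child a second line of defense when the PR exists on GitHub but its URL never reached the database. The sketch below models that with in-memory stand-ins: fakeGitHub, bugRow, the branch naming scheme, and openDraftPr are all hypothetical, and a real implementation would query the GitHub API for an open PR on the derived branch.

```typescript
// In-memory stand-ins; fakeGitHub and bugRow are hypothetical, not real clients.
const fakeGitHub = new Map<string, string>(); // head branch -> PR URL
const bugRow: { draftPrUrl: string | null } = { draftPrUrl: null };
let createCalls = 0;

function openDraftPr(eventId: string, crashBeforeWriteBack: boolean): string {
  // Guard 1: the URL is already recorded, so nothing is left to do.
  if (bugRow.draftPrUrl !== null) return bugRow.draftPrUrl;

  // Deterministic reference: the same event always maps to the same branch.
  const branch = `watari/fix-${eventId}`;

  // Guard 2: recover a PR that a crashed attempt opened but never recorded.
  let prUrl = fakeGitHub.get(branch);
  if (prUrl === undefined) {
    createCalls += 1; // stands in for the create-PR API call
    prUrl = `https://example.invalid/pr/${createCalls}`;
    fakeGitHub.set(branch, prUrl);
  }

  if (crashBeforeWriteBack) throw new Error("crashed before writing draft_pr_url");
  bugRow.draftPrUrl = prUrl; // write-back that guard 1 reads on later runs
  return prUrl;
}

try {
  openDraftPr("evt-1", true); // PR is created, then the step crashes
} catch {
  // the platform would schedule a retry here
}
const recovered = openDraftPr("evt-1", false); // retry finds the PR by branch
console.log(createCalls, recovered === bugRow.draftPrUrl); // 1 true
```

With a random reference the second guard would have nothing to look up, and the retry would create a second PR.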

For a deeper look at the fix loop and CI integration, see the code-to-PR pipeline docs.

Gate 5 — RCA Write-Back Deduplication

The final stage publishes a customer-facing root-cause analysis back to the original support ticket. The write-back performs two writes: a row inserted into rca_documents and a comment posted to the Zendesk or Intercom ticket through the support tool's API.

The database insert is guarded with ON CONFLICT DO NOTHING keyed on bug_id:

INSERT INTO public.rca_documents (bug_id, body, publish_state, created_at)
VALUES ($1, $2, 'draft', now())
ON CONFLICT (bug_id) DO NOTHING;

The step checks the affected row count. If it is zero, an RCA row already exists for this bug, so the step does not insert a second one and decides whether to call the support-tool API from that row's publish_state; once the write-back is recorded as complete, no second comment is posted to the ticket. This is the same INSERT ON CONFLICT DO NOTHING pattern used across Watari's notification system: it is a reusable idempotency primitive that works correctly under concurrent retries because Postgres's conflict detection is atomic.

The publish_state column tracks whether the write-back to Zendesk or Intercom completed. If the database insert succeeds but the support-tool API call fails, the retry finds the rca_documents row already present (so it skips the insert) but reads publish_state = 'draft' (so it still attempts the API call). The two-column check — existence plus state — handles the partial-failure case without opening a duplicate RCA row.
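The existence-plus-state check can be sketched as a small state machine. This is an in-memory model under stated assumptions: the Map stands in for rca_documents, postedComments for the support-tool API, publishRca is a hypothetical helper, and "published" is an assumed name for the completed state, since the text only names 'draft'.

```typescript
// "published" is an assumed name for the completed state; the source only names 'draft'.
type PublishState = "draft" | "published";

// In-memory stand-ins for rca_documents (keyed on bug_id) and the ticket comments.
const rcaDocuments = new Map<string, { body: string; publishState: PublishState }>();
const postedComments: string[] = [];

function publishRca(bugId: string, body: string, apiFails: boolean): PublishState {
  // Models INSERT ... ON CONFLICT (bug_id) DO NOTHING: an existing row is left alone.
  if (!rcaDocuments.has(bugId)) {
    rcaDocuments.set(bugId, { body, publishState: "draft" });
  }
  const row = rcaDocuments.get(bugId)!;

  // State check: a completed write-back means there is nothing left to post.
  if (row.publishState === "published") return row.publishState;

  if (apiFails) throw new Error("support-tool API unavailable");
  postedComments.push(row.body); // stands in for the Zendesk or Intercom comment
  row.publishState = "published";
  return row.publishState;
}

try {
  publishRca("bug-77", "Root cause: stale cache key", true); // insert lands, API call fails
} catch {
  // the platform would schedule a retry here
}
publishRca("bug-77", "Root cause: stale cache key", false); // retry: row exists, still draft
publishRca("bug-77", "Root cause: stale cache key", false); // replay: already published
console.log(rcaDocuments.size, postedComments.length); // 1 1
```

The sketch does not cover a crash between posting the comment and updating the state; closing that window would need an idempotency key the support tool itself honors.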

The Invariant That Ties All Five Gates Together

Every gate above shares a single structural property: the side effect is guarded by a stable key that exists before the step runs. Ticket IDs come from the provider. Bug IDs are written before extraction runs. PR URLs are written before the step that publishes them completes. RCA bug_id foreign keys are known at fan-out time.

The failure mode that breaks idempotency is always the same: generating a new identifier — with crypto.randomUUID(), Date.now(), or an unguarded model call — inside a step body, then using that identifier as the key for a write that hasn't landed yet. If the step crashes after the identifier is generated but before the write completes, the retry generates a new identifier, bypasses every guard, and creates a duplicate.

The fix is always the same: derive the identifier from stable inputs, or read it from a prior completed step that wrote it to the database. Inngest's step memoization makes the second option cheap — a completed step's output is returned from the Inngest cache on every retry, not re-executed.

For teams building their own support-automation pipelines on Inngest — or evaluating Watari's approach — the five gates above map directly onto any pipeline that ingests webhooks, calls an AI model, fires a billing event, fans out to child jobs, and writes back to an external system. The primitives are generic: webhook dedup tables, conditional UPDATEs with RETURNING, atomic RPCs, fan-out child guards, and INSERT ON CONFLICT DO NOTHING. The discipline is keeping non-determinism out of step bodies entirely.
