When Your App Is the Platform
There's a moment in the life of many SaaS products where the tables turn. You've been consuming webhooks from Stripe and GitHub for years. Now your own customers — other developers — are asking how they can react to events in your system.
Maybe you're building an e-commerce platform and merchants want to be notified when orders arrive. Maybe you run a data pipeline tool and users want to trigger downstream jobs when a sync completes. Maybe you're building a CRM and agencies want to push new contacts into their own systems in real time. In all these cases, the answer is the same: you need to send webhooks.
Building a webhook system is more involved than it first seems. It's not just "make an HTTP POST" — you need to manage subscriptions, sign requests, handle failures gracefully, and give your users a decent developer experience. Done well, it's one of the highest-leverage integrations you can offer. Done poorly, it erodes trust every time an event silently fails to deliver.
This guide walks through every layer of building a production-ready webhook system, from the database schema to the retry queue to the developer dashboard.
The Core Architecture
At a high level, every webhook system follows the same flow. An event happens in your system, you find all the subscriptions that care about that event type, you attempt delivery to each one, and you record the outcome.
Event fires in your system
A user places an order, a sync job completes, a record is created — whatever domain event matters to your product. This is the trigger.
Query subscriptions for that event type
Look up which of your users have registered a webhook endpoint that subscribes to this event type. A single event might fan out to multiple subscribers.
Enqueue delivery jobs
Don't make HTTP requests synchronously from your main request handler. Push each delivery attempt onto a job queue. This decouples the event trigger from the potentially slow/flaky HTTP POST.
Attempt delivery, record the result
Make the HTTP POST. Record whether it succeeded (2xx response within timeout) or failed. If it failed, schedule a retry with backoff.
Why the queue matters
If you make webhook HTTP requests synchronously — waiting for a response before continuing — a slow or unavailable subscriber endpoint will slow down or block your main application. The queue lets you fire and continue, with the delivery worker handling retries in the background independently of your app's response time.
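The four steps above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `subscriptions` array stands in for a database table, `jobQueue` stands in for a real job queue, and the names `findSubscribers` and `dispatchEvent` are made up for this example.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory stand-ins for a database table and a job queue.
const subscriptions = [
  { id: 'sub_1', url: 'https://a.example/hooks', events: ['order.created'], active: true },
  { id: 'sub_2', url: 'https://b.example/hooks', events: ['order.created', 'sync.completed'], active: true },
  { id: 'sub_3', url: 'https://c.example/hooks', events: ['order.created'], active: false },
];
const jobQueue = [];

// Step 2: find active subscriptions that care about this event type.
function findSubscribers(eventType) {
  return subscriptions.filter((s) => s.active && s.events.includes(eventType));
}

// Steps 1-3: create the event once, then enqueue one delivery job per subscriber.
// No HTTP request happens here; a separate worker drains the queue (step 4).
function dispatchEvent(eventType, data) {
  const event = { id: crypto.randomUUID(), type: eventType, data };
  for (const sub of findSubscribers(eventType)) {
    jobQueue.push({ subscriptionId: sub.id, event, attemptNumber: 1 });
  }
  return event;
}
```

Calling `dispatchEvent('order.created', { orderId: 42 })` enqueues one job each for `sub_1` and `sub_2` and skips the inactive `sub_3`. Every job carries the same event ID, so retries and fan-out copies can be traced back to one event.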
Storing Webhook Subscriptions
A webhook subscription is just a record in your database: a URL to deliver to, which event types to deliver, a secret for signing, and a flag for whether it's active. Here's a minimal schema that covers the essentials:
-- PostgreSQL example
CREATE TABLE webhook_subscriptions (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id       UUID NOT NULL REFERENCES users(id),
  url           TEXT NOT NULL,
  events        TEXT[] NOT NULL,       -- e.g. ARRAY['order.created', 'order.fulfilled']
  secret        TEXT NOT NULL,         -- per-subscription HMAC secret
  active        BOOLEAN NOT NULL DEFAULT TRUE,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  last_error    TEXT,                  -- most recent delivery failure message
  failure_count INT NOT NULL DEFAULT 0
);

CREATE INDEX idx_webhook_subs_user ON webhook_subscriptions(user_id);
CREATE INDEX idx_webhook_subs_events ON webhook_subscriptions USING GIN(events);
A few things worth calling out here. The secret column holds a per-subscription signing secret — not a global secret shared across all subscribers. Each subscriber gets their own secret, which means if one is compromised, it doesn't affect anyone else. Generate it with a cryptographically random source: crypto.randomBytes(32).toString('hex').
The failure_count column tracks consecutive delivery failures, so reset it to 0 whenever a delivery succeeds. When it hits your threshold (say, 10 consecutive failures), automatically set active = false and email the subscriber. You don't want to keep hammering a dead endpoint forever.
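One way to handle the secret and the failure counter is sketched below. It assumes a threshold of 10 and an in-memory `subscription` object. `generateSecret`, `recordDeliveryOutcome`, and `DISABLE_THRESHOLD` are names invented for this example, and a real implementation would persist the change and send the email.

```javascript
const crypto = require('crypto');

const DISABLE_THRESHOLD = 10; // assumed value; tune for your product

// Generate a per-subscription signing secret (64 hex characters).
function generateSecret() {
  return crypto.randomBytes(32).toString('hex');
}

// Update the consecutive-failure counter after each delivery attempt.
// Returns true if this outcome caused the subscription to be disabled.
function recordDeliveryOutcome(subscription, succeeded, errorMessage) {
  if (succeeded) {
    subscription.failureCount = 0; // a success breaks the streak
    subscription.lastError = null;
    return false;
  }
  subscription.failureCount += 1;
  subscription.lastError = errorMessage;
  if (subscription.active && subscription.failureCount >= DISABLE_THRESHOLD) {
    subscription.active = false; // caller should also notify the subscriber
    return true;
  }
  return false;
}
```

Resetting on success is what makes the counter measure consecutive failures. Without the reset, an endpoint that fails once a month would eventually be disabled.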
Constructing and Sending the Request
The actual HTTP POST is straightforward. You send a JSON body with the event data, a few standard headers, and a signature header so the recipient can verify it came from you.
The Content-Type header should be application/json. Always include it — some servers will reject requests without it, or try to parse the body as a form.
const crypto = require('crypto');

// eventId defaults to a fresh UUID; pass the stored ID on retries so the ID
// stays stable and subscribers can deduplicate on it.
async function sendWebhook(subscription, eventType, payload, eventId = crypto.randomUUID()) {
  const timestamp = Math.floor(Date.now() / 1000);
  const body = JSON.stringify({
    id: eventId, // unique event ID for idempotency
    type: eventType,
    timestamp,
    data: payload,
  });
  const signature = computeSignature(body, timestamp, subscription.secret);

  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 10_000); // 10s timeout

  try {
    const response = await fetch(subscription.url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'X-Webhook-Timestamp': String(timestamp),
        'X-Webhook-Signature': signature,
        'User-Agent': 'YourApp-Webhooks/1.0',
      },
      body,
      signal: controller.signal,
    });
    if (!response.ok) {
      // Keep the status code so the caller can tell 4xx from 5xx
      return {
        success: false,
        statusCode: response.status,
        error: `HTTP ${response.status}: ${response.statusText}`,
      };
    }
    return { success: true, statusCode: response.status };
  } catch (err) {
    // Network failure or timeout (an abort makes fetch reject)
    return { success: false, error: err.message };
  } finally {
    clearTimeout(timeout);
  }
}

The Fetch API is available natively in Node 18+ and handles the HTTP POST cleanly. Always set a timeout: a subscriber endpoint that hangs indefinitely will tie up your delivery worker. Ten seconds is a reasonable default; you can let subscribers configure this if needed.
Generating Signatures with HMAC-SHA256
Signing your webhook requests is non-negotiable. Without a signature, any malicious actor who discovers one of your users' webhook endpoints can send fake events to it. The HMAC-SHA256 pattern is the industry standard — it's what Stripe uses, what GitHub uses, and what you should use too.
The approach: hash the request body (combined with a timestamp) using the subscriber's secret as the key, and include the result in a request header. The subscriber recomputes the same hash and compares. If they match, the request is genuine and unmodified.
Including a timestamp in the signed payload prevents replay attacks — someone capturing one of your legitimate webhook deliveries and re-sending it to the same endpoint hours later. The subscriber should reject requests where the timestamp is more than a few minutes old.
const crypto = require('crypto');

// Compute HMAC-SHA256 signature
// Uses Node.js crypto module: https://nodejs.org/api/crypto.html
function computeSignature(body, timestamp, secret) {
  // The signed string includes a timestamp prefix to prevent replays
  const signedPayload = `${timestamp}.${body}`;
  return crypto
    .createHmac('sha256', secret)
    .update(signedPayload, 'utf8')
    .digest('hex');
}

// Verification code your subscribers should run (Node.js example)
function verifySignature(rawBody, timestampHeader, signatureHeader, secret) {
  const now = Math.floor(Date.now() / 1000);
  const timestamp = parseInt(timestampHeader, 10);
  // Reject missing or malformed timestamps, and any timestamp more than
  // 5 minutes away from the current time
  if (Number.isNaN(timestamp) || Math.abs(now - timestamp) > 300) {
    throw new Error('Webhook timestamp missing or outside the 5-minute window (possible replay attack)');
  }
  const expected = computeSignature(rawBody, timestamp, secret);
  // Constant-time comparison to prevent timing attacks
  const expectedBuf = Buffer.from(expected, 'utf8');
  const actualBuf = Buffer.from(String(signatureHeader), 'utf8');
  // timingSafeEqual throws on unequal lengths, so check that first
  if (expectedBuf.length !== actualBuf.length) {
    throw new Error('Signature invalid');
  }
  if (!crypto.timingSafeEqual(expectedBuf, actualBuf)) {
    throw new Error('Signature invalid');
  }
  return true;
}

Note the use of crypto.timingSafeEqual() for the comparison. A regular string comparison (===) is vulnerable to timing attacks: an attacker can measure tiny differences in response time to guess the correct signature one byte at a time. Constant-time comparison eliminates that attack vector. Always use it when comparing secrets.
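One practical consequence of signing the exact request body: the subscriber has to verify against the raw bytes it received, not against JSON that has been parsed and re-serialized. The sketch below illustrates why, using a made-up secret, timestamp, and payload. `sign` mirrors the timestamp-plus-body scheme described above, and `sameSignature` is a helper invented for this example.

```javascript
const crypto = require('crypto');

// Same scheme as in the text: HMAC-SHA256 over "<timestamp>.<body>".
function sign(body, timestamp, secret) {
  return crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${body}`, 'utf8')
    .digest('hex');
}

// Constant-time comparison of two hex-encoded SHA-256 signatures.
function sameSignature(a, b) {
  return crypto.timingSafeEqual(Buffer.from(a, 'hex'), Buffer.from(b, 'hex'));
}

const secret = 'example-secret'; // made-up value for illustration
const timestamp = 1700000000;    // made-up Unix timestamp

// The sender's bytes on the wire (note the spacing and the "10.0").
const rawBody = '{"type": "order.created", "data": {"total": 10.0}}';
const sent = sign(rawBody, timestamp, secret);

// A framework that parses JSON and hands over an object loses the original bytes.
const reserialized = JSON.stringify(JSON.parse(rawBody));

const rawMatches = sameSignature(sign(rawBody, timestamp, secret), sent);
const reserializedMatches = sameSignature(sign(reserialized, timestamp, secret), sent);
```

Here `rawMatches` is true and `reserializedMatches` is false, because re-serializing drops the whitespace and turns `10.0` into `10`. Subscribers should capture the raw body before any JSON middleware touches it.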
Retrying Failed Deliveries with Exponential Backoff
Delivery will fail. Subscriber endpoints go down for deploys, get rate-limited, return 500 errors, or simply time out. Your webhook system needs to retry automatically so that transient failures don't result in permanently lost events.
Exponential backoff is the right pattern: wait longer between each retry so that you're not hammering an already-struggling endpoint. A common schedule is 1 minute, 5 minutes, 30 minutes, 2 hours, 8 hours, with each wait roughly 4 to 6 times the previous one. After all retries are exhausted, disable the subscription and notify the subscriber.
// Retry schedule in seconds
const RETRY_DELAYS = [60, 300, 1800, 7200, 28800]; // 1m, 5m, 30m, 2h, 8h

// db, queue, and emailService are placeholders for your own persistence,
// job queue, and email layers.
async function handleDeliveryFailure(deliveryAttempt, subscription, error) {
  const attemptNumber = deliveryAttempt.attemptNumber; // 1-indexed; 1 is the initial delivery

  await db.deliveryAttempts.update(deliveryAttempt.id, {
    status: 'failed',
    error: error.message,
    respondedAt: new Date(),
  });

  // Attempt N is followed by RETRY_DELAYS[N - 1], so use <= to reach the final 8h delay
  if (attemptNumber <= RETRY_DELAYS.length) {
    // Schedule the next retry
    const delaySeconds = RETRY_DELAYS[attemptNumber - 1];
    const nextAttemptAt = new Date(Date.now() + delaySeconds * 1000);
    await queue.schedule('deliver-webhook', {
      subscriptionId: subscription.id,
      eventId: deliveryAttempt.eventId,
      attemptNumber: attemptNumber + 1,
    }, { delay: delaySeconds * 1000 });
    console.log(
      `Scheduled retry #${attemptNumber + 1} for subscription ${subscription.id}`,
      `in ${delaySeconds}s (at ${nextAttemptAt.toISOString()})`
    );
  } else {
    // Max retries exhausted: disable the subscription
    await db.webhookSubscriptions.update(subscription.id, {
      active: false,
      lastError: error.message,
    });
    // Notify the subscriber that their endpoint has been disabled
    await emailService.sendWebhookDisabledEmail(subscription.userId, {
      url: subscription.url,
      lastError: error.message,
    });
  }

  // Track consecutive failures for dashboard display
  await db.webhookSubscriptions.increment(subscription.id, 'failureCount');
}

Don't retry on 4xx responses
A 4xx response usually means the subscriber's endpoint rejected the request deliberately: wrong URL, authentication issue, malformed payload. Retrying won't fix it. Only retry on 5xx errors, timeouts, and network failures, plus the two 4xx codes that explicitly signal a transient condition: 408 (Request Timeout) and 429 (Too Many Requests). A 400 or 401 should be flagged as a configuration problem and surfaced to the subscriber immediately, not silently retried five times over more than ten hours.
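That decision fits in a single function. The sketch below assumes the delivery result carries a `statusCode` whenever a response was received and omits it on network errors and timeouts. Treating 408 and 429 as retryable is a choice made for this example, and `classifyFailure` and its return values are hypothetical names.

```javascript
// Decide what to do with a delivery result.
// `result` is assumed to look like { success, statusCode?, error? }.
function classifyFailure(result) {
  if (result.success) return 'ok';
  // No status code: the request never got a response (timeout, DNS, connection reset).
  if (result.statusCode === undefined) return 'retry';
  // Server-side errors are usually transient.
  if (result.statusCode >= 500) return 'retry';
  // Assumption of this sketch: 408 and 429 mean "try again later".
  if (result.statusCode === 408 || result.statusCode === 429) return 'retry';
  // Everything else points at a configuration problem on the subscriber's side.
  return 'surface-to-subscriber';
}
```

A delivery worker could call this before scheduling a retry, and route `'surface-to-subscriber'` results to the dashboard or an email instead of the retry queue.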
Giving Developers a Good Experience
A webhook system is only as good as its developer experience. If your subscribers can't see what's being delivered, can't debug failures, and can't trigger test events without going through a full production flow — they'll hate integrating with you.
The most valuable thing you can add is a delivery log in your dashboard. Show the last N delivery attempts for each subscription: timestamp, event type, HTTP status, response body, and a "Redeliver" button. When something breaks, your subscriber should be able to look at your dashboard, see exactly what was sent and what came back, and figure out the problem without emailing your support team.
-- Delivery attempts log table
CREATE TABLE webhook_delivery_attempts (
  id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  subscription_id UUID NOT NULL REFERENCES webhook_subscriptions(id),
  event_id        UUID NOT NULL,
  event_type      TEXT NOT NULL,
  payload         JSONB NOT NULL,      -- the full body we sent
  attempt_number  INT NOT NULL DEFAULT 1,
  status          TEXT NOT NULL,       -- 'pending', 'success', 'failed'
  http_status     INT,                 -- response code we received
  response_body   TEXT,                -- first 1000 chars of response
  error           TEXT,                -- error message if network/timeout failure
  attempted_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  responded_at    TIMESTAMPTZ
);

CREATE INDEX idx_delivery_sub_id ON webhook_delivery_attempts(subscription_id);
CREATE INDEX idx_delivery_event_id ON webhook_delivery_attempts(event_id);
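A row for this table can be assembled from a finished attempt along the following lines. This is a sketch: the output keys follow the column names above, while the input shape and the names `buildAttemptRecord` and `MAX_RESPONSE_CHARS` are assumptions made for illustration.

```javascript
const MAX_RESPONSE_CHARS = 1000; // matches the "first 1000 chars" note in the schema

// Build a delivery-log row from a finished attempt.
// httpStatus is undefined when no response arrived (timeout, network error).
function buildAttemptRecord({ subscriptionId, event, attemptNumber, httpStatus, responseBody, error }) {
  const succeeded = httpStatus !== undefined && httpStatus >= 200 && httpStatus < 300;
  return {
    subscription_id: subscriptionId,
    event_id: event.id,
    event_type: event.type,
    payload: event, // the full body that was sent
    attempt_number: attemptNumber,
    status: succeeded ? 'success' : 'failed',
    http_status: httpStatus ?? null,
    // Truncate so a subscriber returning a huge error page cannot bloat the table.
    response_body: responseBody ? responseBody.slice(0, MAX_RESPONSE_CHARS) : null,
    error: error ?? null,
    responded_at: new Date(),
  };
}
```

Storing the truncated response body is what lets a subscriber see what came back from their own endpoint without contacting support.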
Add a "Send test event" button to your dashboard. When a developer registers a new endpoint, they shouldn't have to trigger a real event in your system to verify the connection works. Send a synthetic webhook.test event with a fake payload so they can confirm delivery and check their signature verification code right away.
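A synthetic event might be built like this. The sketch assumes the same envelope as real deliveries (id, type, timestamp, data); the `data` contents and the name `buildTestEvent` are invented for this example.

```javascript
const crypto = require('crypto');

// Build a synthetic event for the "Send test event" button.
function buildTestEvent(subscriptionId) {
  return {
    id: crypto.randomUUID(),
    type: 'webhook.test',
    timestamp: Math.floor(Date.now() / 1000),
    data: {
      message: 'This is a test event. No action is required.',
      subscription_id: subscriptionId,
    },
  };
}
```

Sending the test event through the same signing and delivery path as real events matters more than the payload itself, since that is what lets the subscriber exercise their verification code.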
Before any of that is useful, though, you need to make it easy for subscribers to understand what they're going to receive. Include the full JSON schema for each event type in your documentation, along with realistic example payloads. Developers should be able to paste an example payload into a tool like JsonFormatter.ai to inspect and explore the structure before writing a single line of handler code. The easier you make this, the faster developers will successfully integrate.
Writing Webhook Documentation That Actually Helps
Most webhook documentation is terrible. It says something like "we send a POST when events happen" and leaves the developer to figure out the rest by trial and error. Don't do that.
Look at how GitHub documents their webhooks as a benchmark — they document every event type with a complete example payload, list every field and its type, explain when each event fires, and provide code samples for verifying signatures in multiple languages. That's the bar.
At a minimum, your webhook documentation should cover:
The event catalog
Every event type you send, what triggers it, and a full example payload. Don't truncate the example — show the real structure including nested objects. Developers copy-paste from documentation; make it worth copying.
How to verify signatures
Show the exact algorithm and header names you use. Provide working code samples in at least Node.js, Python, and Ruby. Include the "use the raw body, not the parsed JSON" warning — it's the #1 gotcha and you should save your users the debugging session.
Your retry policy
Tell developers exactly how many times you'll retry, on what schedule, and what happens when you give up. They need to know this to design their own failure-handling logic — especially if they're using your events to trigger time-sensitive operations.
Idempotency guidance
Explicitly tell your subscribers that they may receive duplicate events and should handle that. Point them to the event ID field they can use for deduplication. Most developers won't think of this themselves until they see double-fulfillment in production.
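The idempotency guidance is easiest to follow when the docs include a small handler that deduplicates on the event ID. The following is a subscriber-side sketch; a `Set` is used for brevity, and `handleEvent` and `applyEvent` are hypothetical names. A real handler would keep processed IDs in a database table or a cache with expiry so they survive restarts.

```javascript
// Subscriber-side deduplication keyed on the event ID.
const processedEventIds = new Set();

// applyEvent is the subscriber's own business logic (for example, fulfilling an order).
function handleEvent(event, applyEvent) {
  if (processedEventIds.has(event.id)) {
    return 'duplicate'; // already handled: acknowledge and do nothing
  }
  applyEvent(event);
  processedEventIds.add(event.id); // mark only after processing succeeded
  return 'processed';
}
```

Because the ID is recorded only after `applyEvent` returns, an exception during processing leaves the event unmarked and a later retry gets another chance.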
Good documentation is part of the product. A webhook system that no one can figure out how to integrate is a webhook system that doesn't get used. The engineering time you spend on docs pays back many times over in reduced support load and faster customer integrations.