Ensuring your webhook handler processes events exactly once

Webhooks get delivered more than once. Network timeouts, server restarts, and retry logic all conspire to send you the same event multiple times. If your handler charges a customer, sends an email, or updates inventory, processing a duplicate can cause real problems.

Idempotency is the solution. An idempotent operation produces the same result whether you run it once or ten times. Building idempotent webhook handlers means duplicates become harmless rather than catastrophic.

This article covers why duplicates happen, how to detect them, and patterns for building handlers that safely ignore repeated deliveries.

Why webhooks arrive more than once

Most webhook providers guarantee at-least-once delivery, which means they will retry until they receive a successful response. This is the right tradeoff because losing events is usually worse than receiving duplicates. But it creates duplicates in several scenarios.

Your server might process an event successfully but crash before returning a 200 response. The provider sees no response, assumes failure, and retries. You now have two deliveries of the same event, and your server processed both.

Network issues can cause similar problems. A response might get lost in transit, or a load balancer might time out while your server is still working. The provider retries, and again you receive the same event twice.

Some providers also emit duplicates from their own retry machinery. If a delivery fails and a later retry succeeds, some systems still send additional retries that were already queued, and race conditions in distributed retry systems can produce the same effect.

The bottom line is that any webhook handler in production will eventually receive duplicates. The question is not whether to handle them, but how.

Detecting duplicates with event IDs

Every well-designed webhook includes a unique event ID. This identifier stays the same across retries, which is what distinguishes a retry from a genuinely new event. Your first line of defense is to track which event IDs you have already processed.

The simplest approach is to store event IDs in a database before processing. When a webhook arrives, check if its ID exists in your processed events table. If it does, return a 200 immediately without doing any work. If it does not, insert the ID and then process the event.

def handle_webhook(event):
    event_id = event["id"]

    # Check if already processed
    if db.query("SELECT 1 FROM processed_events WHERE event_id = %s", (event_id,)):
        return {"status": "already processed"}, 200

    # Mark as processing before doing work
    db.execute(
        "INSERT INTO processed_events (event_id, created_at) VALUES (%s, NOW())",
        (event_id,),
    )

    # Now safe to process
    process_event(event)
    return {"status": "ok"}, 200

This approach has a race condition. Two duplicate requests arriving simultaneously might both pass the existence check before either inserts. Use a unique constraint on the event ID column and handle the constraint violation as a duplicate.

# Assuming PostgreSQL via the psycopg2 driver
from psycopg2.errors import UniqueViolation

def handle_webhook(event):
    event_id = event["id"]

    try:
        # The insert doubles as the duplicate check: a unique constraint
        # on event_id rejects every attempt after the first.
        db.execute("INSERT INTO processed_events (event_id) VALUES (%s)", (event_id,))
    except UniqueViolation:
        # Another delivery of this event already won the race
        return {"status": "duplicate"}, 200

    process_event(event)
    return {"status": "ok"}, 200

The insert acts as both a check and a claim. Whoever inserts first wins, and everyone else sees the constraint violation.
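
For reference, a minimal schema sketch for that table, assuming PostgreSQL (the types are illustrative; any database with unique constraints works):

db.execute("""
    CREATE TABLE processed_events (
        event_id   TEXT PRIMARY KEY,  -- the primary key is the unique constraint
        created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()  -- used later for cleanup
    )
""")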

Making your business logic idempotent

Tracking event IDs catches exact duplicates, but it does not protect against all problems. What if you process an event, fail partway through, and then receive a retry? The event ID is already recorded, but the work is incomplete.

The deeper solution is to make your business logic itself idempotent. Each operation should check whether it has already been applied before applying it.

For creating resources, use the event ID as part of the primary key or as a unique constraint. If a payment event creates an order, make the event ID the order's idempotency key. Trying to create the order again fails harmlessly.
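
As a sketch, assuming PostgreSQL-style upserts and a hypothetical orders table with a unique idempotency_key column (the event payload fields are illustrative):

def create_order(event):
    # ON CONFLICT DO NOTHING makes the insert idempotent: the first
    # delivery creates the order and duplicates insert zero rows.
    db.execute(
        """
        INSERT INTO orders (idempotency_key, customer_id, amount)
        VALUES (%s, %s, %s)
        ON CONFLICT (idempotency_key) DO NOTHING
        """,
        (event["id"], event["data"]["customer_id"], event["data"]["amount"]),
    )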

For updating resources, use conditional updates. Instead of "set balance to 50", write "set balance to 50 if current balance is 100". The first update succeeds, subsequent duplicates find the condition no longer matches and do nothing.
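
A sketch of the conditional form, assuming a hypothetical accounts table and a db.execute that returns the affected row count:

def apply_debit(account_id, expected_balance, new_balance):
    # The extra WHERE condition turns duplicates into no-ops: only the
    # first application still finds the expected balance.
    rows = db.execute(
        "UPDATE accounts SET balance = %s WHERE id = %s AND balance = %s",
        (new_balance, account_id, expected_balance),
    )
    return rows == 1  # True on first application, False on duplicates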

For external side effects like sending emails, record that you sent the email as part of the same transaction that marks the event processed. If sending fails, the transaction rolls back and retries can try again. If sending succeeds but recording fails, you might send twice, which is why critical emails should also have their own idempotency layer.
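
A sketch of that ordering, assuming a db.transaction() context manager plus a hypothetical sent_emails table and send_receipt helper:

def handle_payment_succeeded(event):
    with db.transaction():
        db.execute(
            "INSERT INTO processed_events (event_id) VALUES (%s)",
            (event["id"],),
        )
        db.execute(
            "INSERT INTO sent_emails (event_id, recipient) VALUES (%s, %s)",
            (event["id"], event["data"]["email"]),
        )
        # If the send raises, both inserts roll back and a retry starts
        # clean. If the send succeeds but the commit fails, the email can
        # go out twice; that gap is what the email's own idempotency
        # layer covers.
        send_receipt(event["data"]["email"])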

Cleaning up old event IDs

Your processed events table will grow indefinitely if you never delete from it. Most webhook providers stop retrying after a few days, so you can safely delete event IDs older than that window.

A daily job that deletes records older than seven days is usually sufficient. If you need to be more conservative, keep them for thirty days. The storage cost is minimal compared to the protection they provide.
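
The job itself can be a single statement; a sketch assuming the created_at column from the schema above:

def purge_old_event_ids():
    # Anything older than the provider's retry window can never be a
    # legitimate retry, so it is safe to forget.
    db.execute(
        "DELETE FROM processed_events WHERE created_at < NOW() - INTERVAL '7 days'"
    )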

For high-volume systems, consider a store with built-in expiration. Redis keys with a TTL give exact deduplication that expires on its own, with no cleanup job to run. A Bloom filter goes further, trading a small chance of false positives (a new event wrongly treated as a duplicate) for much lower memory use.
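
A sketch of the Redis variant with the redis-py client, using one key per event ID with NX and a TTL (the webhook: prefix is illustrative):

import redis

r = redis.Redis()

def is_duplicate(event_id, ttl_seconds=7 * 86400):
    # SET with nx=True and ex=... atomically claims the ID: the call
    # returns a truthy value on first sight and None if the key already
    # exists. The TTL expires old IDs on its own, so there is no cleanup
    # job to schedule.
    claimed = r.set(f"webhook:{event_id}", 1, nx=True, ex=ttl_seconds)
    return claimed is None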

When deduplication is not enough

Some events are not safe to deduplicate by ID alone. Consider an event stream where order of processing matters. If you receive events 1, 2, and 3, then receive a duplicate of event 1, simply ignoring the duplicate is correct. But what if you receive events 1, 3, 2 due to network reordering? Processing them in arrival order might violate business rules.

For these cases, you need to track not just which events you have seen but also sequence information. Many providers include a sequence number or timestamp that indicates the canonical order. Your handler should process events in sequence order, buffering out-of-order arrivals until earlier events appear.
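
A minimal in-memory sketch, assuming each event carries a numeric sequence field (in production the counter and buffer would live in durable storage):

class SequencedHandler:
    def __init__(self, start_seq=1):
        self.next_seq = start_seq  # next sequence number safe to process
        self.pending = {}          # out-of-order arrivals, keyed by sequence

    def receive(self, event):
        seq = event["sequence"]
        if seq < self.next_seq:
            return  # duplicate of an already-processed event; ignore it
        self.pending[seq] = event  # re-buffering a duplicate is harmless
        # Drain every consecutive event that is now ready
        while self.next_seq in self.pending:
            process_event(self.pending.pop(self.next_seq))
            self.next_seq += 1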

This adds significant complexity. Before implementing it, consider whether your business logic truly requires strict ordering or whether eventual consistency is acceptable.