From Celery to Inngest: Why We Migrated Our Task Execution Layer
The task execution layer quietly makes or breaks your product. Here's how we moved from Celery to Inngest, what we gained, and the tradeoffs we accepted along the way. A deep-dive companion to Part 1 of The Stack We Actually Ship On.
This is the layer that quietly makes or breaks your product. The user uploads a document. What happens next? If “what happens next” involves five services, two queues, and careful coordination, you’re going to face reliability challenges at scale.
For us, this layer is the backbone of the entire product. A sales call ends, a transcript lands in GCS, and then a cascade of processing kicks off: transcription, signal extraction, causal analysis, coaching generation, CRM sync. If any step in that chain is fragile, the Post-Meeting Agent doesn’t have context when it calls the rep back for a debrief, or the Pre-Meeting Agent goes into the next conversation blind. These are agents that talk to people, and people notice immediately when an agent doesn’t know what it should know.
Why We Started With Celery
When we started building AmpUp, the choice of background task queue was a non-decision. Celery is battle-tested, has a massive community, and was made even more popular by Apache Airflow supporting it as a core execution backend. For a Python shop building FastAPI services, it’s just what you reach for. We wired up Redis as our broker and backend, decorated a few functions as background tasks, and shipped. For the first year, it worked fine.
Then our workflows got complicated. The first symptom was debuggability, or the complete lack of it. A task would get dispatched into Redis and then… something would happen. Maybe it completed. Maybe it failed. Maybe the worker died mid-execution from an OOM kill on Kubernetes and the task evaporated silently. Flower, Celery’s monitoring UI, is technically functional but practically useless for anything beyond “how many tasks are queued right now.” When an engineer got paged at 2am because a coaching report never generated, the investigation was: grep through logs, cross-reference task IDs, piece together what happened from fragmented output. There was no single place to see a specific execution’s full lifecycle.
The deeper problem was Celery’s chain and chord patterns. We leaned on these for multi-step pipelines: transcribe audio, then extract topics, then score against a rubric, then generate a summary, then notify the rep. In theory, Celery chains compose elegantly. In practice, one timeout in step three produces an error that surfaces at step five with a cryptic message. Celery has no concept of durable execution: if a worker process dies mid-task, that unit of work is gone unless you built your own idempotency layer. We did build some of that infrastructure, and maintaining it was its own ongoing tax.
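That idempotency layer is worth sketching, because it’s the kind of infrastructure you end up owning forever. A minimal version keys each task by a hash of its payload, so a re-dispatched task becomes a no-op instead of duplicate work. This is an illustrative sketch, not our production code; the in-memory dict stands in for Redis or a database.

```python
# Hypothetical sketch of a hand-rolled idempotency layer around a task queue.
# Results are keyed by (task_name, payload hash); a second dispatch of the
# same work returns the cached result instead of re-executing.
import hashlib
import json

_completed: dict[str, str] = {}  # key -> serialized result (stand-in for Redis)

def idempotency_key(task_name: str, payload: dict) -> str:
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    return f"{task_name}:{digest}"

def run_once(task_name: str, payload: dict, fn):
    """Execute fn(payload) at most once per (task_name, payload)."""
    key = idempotency_key(task_name, payload)
    if key in _completed:
        return json.loads(_completed[key])
    result = fn(payload)
    _completed[key] = json.dumps(result)
    return result
```

Even this toy version hints at the real maintenance burden: key design, result serialization, expiry, and cleanup all become your problem.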
Why We Moved to Inngest
We migrated gradually over about three months, starting with new workflows rather than ripping out existing ones. The biggest immediate win was the dashboard. Every function execution has a full audit trail: the triggering event payload, each step’s input and output, timing, retry history, and failure reason. When a coaching pipeline fails now, the on-call engineer opens the Inngest UI, finds the execution, and sees exactly which step failed. That alone was worth the migration cost.
Another underappreciated win: the Inngest Dev Server keeps both Inngest Cloud and self-deployment options open. In production, we run against Inngest Cloud. In local development, the dev server runs inside each developer’s namespace. And for testing, we spin up the dev server inside a pytest fixture, which means our integration tests exercise the full Inngest event-driven pipeline, including retries and step execution, entirely within pytest. No mocks, no stubs, just the real execution engine running locally.
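A fixture along those lines is short to write. This is a hedged sketch: it assumes the dev server is launched via `npx inngest-cli dev` on its default port 8288, and the helper names are ours; adapt the command to however you install the CLI.

```python
# Sketch of a session-scoped pytest fixture that boots the Inngest dev server
# for integration tests. Command, port, and helper names are illustrative.
import socket
import subprocess
import time

import pytest

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until something is listening on (host, port), or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.2)
    return False

@pytest.fixture(scope="session")
def inngest_dev_server():
    proc = subprocess.Popen(["npx", "inngest-cli@latest", "dev", "--port", "8288"])
    try:
        assert wait_for_port("127.0.0.1", 8288), "dev server never came up"
        yield "http://127.0.0.1:8288"
    finally:
        proc.terminate()
        proc.wait()
```

Tests that depend on the fixture then exercise real event dispatch, retries, and step execution against the local engine rather than mocks.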
Inngest also gives us fine-grained throttling out of the box, which directly addresses the noisy-neighbor problem in a multi-tenant system. We can set concurrency limits and rate caps per function, per tenant, or per any custom key. When one enterprise customer uploads 500 recordings at once, their processing is throttled so it doesn’t starve other tenants’ pipelines. With Celery, we would have had to build this prioritization and isolation layer ourselves.
Fewer things to operate always wins. Your ops burden compounds faster than your headcount.
Step Functions and Durable Execution
The step function model is where Inngest earns its keep. A meeting upload triggers a pipeline: transcribe audio (AssemblyAI) -> diarize speakers -> extract sales signals -> generate per-rep coaching -> sync to CRM. Each of these is a step.run() block, independently persisted and independently retryable. If the CRM sync fails because Salesforce is having a moment, Inngest retries just that step with exponential backoff. You’re not re-running a fifteen-minute pipeline from scratch because the last ten seconds failed.
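The mechanics are easy to model in plain Python: persist each step’s result under a stable ID, skip anything already completed on a re-run, and retry only the step that failed. This is a toy model of the idea, not Inngest’s SDK; names and defaults are illustrative.

```python
# Toy model of durable step execution: memoized steps + per-step retries
# with exponential backoff. A real engine persists `memo` durably.
import time

class StepRunner:
    def __init__(self, max_retries: int = 3, base_delay: float = 1.0):
        self.memo: dict[str, object] = {}  # step_id -> result
        self.max_retries = max_retries
        self.base_delay = base_delay

    def run(self, step_id: str, fn):
        if step_id in self.memo:          # already completed on a prior run
            return self.memo[step_id]
        for attempt in range(self.max_retries + 1):
            try:
                result = fn()
                self.memo[step_id] = result
                return result
            except Exception:
                if attempt == self.max_retries:
                    raise
                time.sleep(self.base_delay * (2 ** attempt))  # backoff
```

A re-run of the whole pipeline costs nothing for the fourteen minutes of completed work; only the failed step executes again.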
The Fan-Out Pattern
A single meeting/uploaded event doesn’t just kick off one pipeline; it triggers a constellation of independent functions in parallel.
Transcription starts immediately. A metadata extraction function pulls context from the CRM. A notification function pings the rep’s Slack. A billing function logs usage. None of these functions know about each other; they all just listen for meeting/uploaded and do their thing. Adding a new behavior on upload means writing one new Inngest function and pointing it at the existing event. No modifications to the upload handler.
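Stripped to its essence, fan-out is just an event registry: handlers subscribe to an event name, and the emitter knows nothing about who is listening. A minimal sketch, with illustrative function names:

```python
# Minimal fan-out: one event name, many independent subscribers.
from collections import defaultdict

_handlers = defaultdict(list)  # event name -> registered functions

def on_event(name: str):
    def register(fn):
        _handlers[name].append(fn)
        return fn
    return register

def emit(name: str, payload: dict):
    """Invoke every subscriber; none of them know about each other."""
    return [fn(payload) for fn in _handlers[name]]

@on_event("meeting/uploaded")
def start_transcription(evt):
    return ("transcribe", evt["meeting_id"])

@on_event("meeting/uploaded")
def notify_rep(evt):
    return ("slack", evt["meeting_id"])
```

Adding a new on-upload behavior is one more decorated function; the emitter never changes.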
Real-Time UI from Event Streams
Inngest also wires naturally into real-time UI. When a sales rep uploads a meeting recording, they don’t want to stare at a spinner for three minutes. They want to see something happening: “Transcribing…”, “Extracting signals…”, “Generating coaching feedback…” Each step in the pipeline fires an Inngest event when it completes (meeting/transcription.completed, meeting/signals.extracted), and the frontend subscribes to a lightweight SSE endpoint that forwards those events. The UI reflects exactly what the durable execution engine knows, because they’re reading from the same event stream.
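The forwarding side is mostly string formatting: an SSE frame is `event:` and `data:` lines terminated by a blank line. A framework-independent sketch of the serialization (the event names match the pipeline above; the helper names are ours):

```python
# Serialize pipeline events as Server-Sent Events frames.
import json

def sse_frame(event_name: str, data: dict) -> str:
    """One SSE frame: event name, JSON payload, blank-line terminator."""
    return f"event: {event_name}\ndata: {json.dumps(data)}\n\n"

def event_stream(events):
    """Forward (name, payload) pairs to the browser as an SSE body."""
    for name, payload in events:
        yield sse_frame(name, payload)
```

In a FastAPI app, a generator like `event_stream` would back a streaming response, and the browser consumes it with a standard `EventSource`.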
Retry and Failure Strategy
Inngest functions get 3 retries with exponential backoff by default. We handle idempotency at two layers: Inngest’s built-in event deduplication, and application-level instance filtering that prevents cross-environment duplicate processing. Throttling uses the Generic Cell Rate Algorithm (GCRA); for example, transcript processing runs at 2 concurrent / 5 per minute, while analysis runs at 5 concurrent / 10 per minute.
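GCRA itself is compact enough to sketch: track a “theoretical arrival time” and reject any request that arrives more than one burst ahead of it. Inngest implements this for you; the single-key limiter below just shows the algorithm, with our own field names.

```python
# Generic Cell Rate Algorithm (GCRA): virtual-scheduling formulation.
class GCRA:
    def __init__(self, rate: int, period: float, burst: int = 1):
        self.interval = period / rate            # emission interval T
        self.tau = self.interval * (burst - 1)   # burst tolerance
        self.tat = 0.0                           # theoretical arrival time

    def allow(self, now: float) -> bool:
        if self.tat - now > self.tau:
            return False                         # too early: reject or queue
        self.tat = max(now, self.tat) + self.interval
        return True
```

With `rate=5, period=60`, requests are admitted at most one per 12 seconds past the allowed burst, which is the shape of our per-function caps above.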
We don’t use a traditional dead-letter queue. Failed tasks land in a database table with full error details, status is marked FAILED, and a three-layer middleware stack logs the error with context, records the status transition, and forwards everything to Sentry. Simple, debuggable, and good enough until you need something fancier.
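A minimal sketch of that failure path, with an illustrative schema in sqlite3 (our real tables and middleware differ):

```python
# Instead of a dead-letter queue: persist the failure with full context and
# mark the task FAILED. Schema and column names are illustrative.
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE task_failures (task_id TEXT, step TEXT, error TEXT, failed_at REAL)"
)
conn.execute("INSERT INTO tasks VALUES ('t1', 'RUNNING')")

def record_failure(conn, task_id: str, step: str, error: dict) -> None:
    """Persist the error details and flip the task's status to FAILED."""
    conn.execute(
        "INSERT INTO task_failures (task_id, step, error, failed_at) "
        "VALUES (?, ?, ?, ?)",
        (task_id, step, json.dumps(error), time.time()),
    )
    conn.execute("UPDATE tasks SET status = 'FAILED' WHERE id = ?", (task_id,))
    conn.commit()
```

The row is queryable, re-runnable, and visible in the same database the rest of the app uses, which is most of what a DLQ buys you anyway.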
The Honest Tradeoffs
Celery is genuinely better for high-volume, simple fire-and-forget work where you need raw throughput and minimal overhead. We kept Celery for a handful of high-throughput, low-complexity jobs. Inngest adds HTTP round-trip overhead per step: invisible at human timescales, but material for tight loops over large datasets. For the complex, multi-step, stateful workflows that are the core of our AI agent infrastructure, though, Inngest removed an entire category of reliability problems we had accepted as part of the job.
← Back to Part 1: The Infrastructure
Deep dive from The Stack We Actually Ship On. Written by Rahul Balakavi, for founders who’ve been there.
Build With AmpUp
Interested in our engineering approach? Explore careers at AmpUp or book a technical demo to see how our stack powers AI sales coaching at scale.
Frequently Asked Questions
Q: When should you migrate from Celery to a durable execution platform like Inngest?
Migrate when your workflows become multi-step, failure modes become costly, or observability becomes critical. Single-task fire-and-forget work (high-volume, low-complexity) stays on Celery; we kept it for that. But when reps depend on a five-step pipeline (transcribe → extract signals → generate coaching → sync CRM) and a failure mid-pipeline blocks features, you need durable execution. Inngest’s step-level persistence and full audit trail saved us from the Celery problem: a worker dying mid-task, the pipeline evaporating, and engineers grepping logs at 2am to piece together what happened.
Q: How much overhead does Inngest add per function execution compared to Celery?
Inngest adds HTTP round-trip latency per step (typically 50-200ms depending on network and LLM provider latency), which is invisible for human-scale workflows. If you’re processing millions of tasks per minute with sub-second requirements, Celery is faster. For sales workflows where steps naturally involve I/O (API calls, LLM inference), the latency is negligible compared to the actual work. The real win is that you lose an entire category of operational problems: no Redis management, no worker process crashes, no lost tasks, and full observability in the Inngest dashboard instead of grepping logs.
Q: Can you run Inngest in a self-hosted environment or are you locked into their cloud?
Inngest offers both managed cloud and self-deployment options. In production we use Inngest Cloud; in local development we use their dev server. That flexibility keeps you from lock-in while giving you the benefits of a managed service. If Inngest ever changes pricing or direction, we have a backup path. For most teams, the managed offering is worth the dependency trade-off because the alternative—maintaining a durable task execution layer yourself—is a bigger engineering tax than staying with a vendor.
Rahul Balakavi is the co-founder of AmpUp. He leads engineering and product, bringing deep expertise in building AI-powered platforms that turn sales data into actionable intelligence.