
Cron & Background Jobs

How scheduled work runs in Eziseller. Audience: new dev with Node experience, no Eziseller context.

1. Overview

Eziseller runs a single long-lived Express process that serves HTTP and hosts all scheduled work in-process via node-cron. There is no separate worker, no queue, and no Redis. When backend/server.ts boots, it registers one cron job (subscription expiry, every 5 minutes) and one setInterval loop (notification queue, every 30 seconds). Jobs are plain async functions in backend/jobs/ that read/write Postgres directly. This is intentionally simple — it works for a single-instance Azure App Service deployment but does not scale horizontally without changes (see Gotchas).

2. Architecture

The Express process, cron scheduler, and interval loop all share one Node event loop and one Prisma client.

3. Active schedules

| Name | Cadence | Trigger | Handler |
| --- | --- | --- | --- |
| Subscription expiry | */5 * * * * (every 5 min) | node-cron | updateExpiredSubscriptions() |
| Notification queue drain | 30 s | setInterval | notificationService.processNotifications(10) |

Only the subscription cron is a "real" cron (via node-cron). The notification processor is a naive setInterval — see server.ts:L207-L216.
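The drain loop's shape matters because a plain setInterval happily fires the next tick while the previous batch is still running. A minimal sketch of a drain loop with an overlap guard — the `busy` flag is an addition for illustration, not what server.ts does today, and `Drain` is a stand-in type for the notification service call:

```typescript
// Sketch of a 30 s notification drain loop. server.ts uses a bare
// setInterval; the `busy` overlap guard here is an assumed improvement.
type Drain = (batchSize: number) => Promise<number>; // returns rows processed

function startDrainLoop(drain: Drain, intervalMs = 30_000) {
  let busy = false;
  const tick = async (): Promise<number | 'skipped'> => {
    if (busy) return 'skipped'; // previous batch still in flight
    busy = true;
    try {
      return await drain(10); // mirrors processNotifications(10)
    } finally {
      busy = false;
    }
  };
  const timer = setInterval(tick, intervalMs);
  return { tick, stop: () => clearInterval(timer) };
}
```

Without the guard, a batch that takes longer than 30 s stacks concurrent drains on the same rows.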

4. Jobs

| Job | File | Wired in? | Callers |
| --- | --- | --- | --- |
| updateExpiredSubscriptions | update-expired-trials.ts | Yes, via cron | Cron tick + admin manual trigger |
| sendTrialReminders | send-trial-reminders.ts | No — orphan | Only require.main === module (CLI) |

sendTrialReminders exists and works when invoked directly (node send-trial-reminders.js) but is not registered with the scheduler. Nothing runs it in production. See Gotchas.
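The CLI-only behaviour comes from the standard CommonJS entry-point guard. A sketch of the pattern — the function body and return shape here are hypothetical stand-ins, not the real job logic:

```typescript
// Pattern used by send-trial-reminders.ts: the job is a plain async
// function, and a require.main guard makes the file runnable directly
// with `node send-trial-reminders.js`. Body is a stand-in; the real job
// queries Prisma for trials expiring soon and sends reminder emails.
async function sendTrialReminders(): Promise<{ processed: number }> {
  // hypothetical stand-in for the real Prisma query + email send
  return { processed: 0 };
}

// Runs only when the file is executed directly, never when imported --
// which is exactly why nothing triggers this job in production.
if (typeof require !== 'undefined' && require.main === module) {
  sendTrialReminders()
    .then((r) => console.log(`done: ${r.processed} processed`))
    .catch((e) => { console.error(e); process.exit(1); });
}
```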

5. Subscription cron tick

Both expiry queries run in parallel. Each matching row is updated in its own prisma.update call (no batching, no transaction). Errors on individual rows are collected and returned in JobResult.errors; they do not stop subsequent rows. See update-expired-trials.ts:L229-L270.
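A sketch of that per-row strategy, with the update function injected so the shape is visible without Prisma — the row type and `update` signature are simplified stand-ins for prisma.subscription.update, and the JobResult field names are illustrative:

```typescript
// Sketch of the tick's per-row update loop: one UPDATE per row, no
// transaction, errors collected rather than thrown.
interface JobResult {
  updated: number;
  errors: string[];
}

async function expireRows(
  rows: { id: string }[],
  update: (id: string) => Promise<void>, // stand-in for prisma.subscription.update
): Promise<JobResult> {
  const result: JobResult = { updated: 0, errors: [] };
  for (const row of rows) {
    try {
      await update(row.id); // sequential, one round-trip per row
      result.updated += 1;
    } catch (e) {
      // Recorded and logged by the caller; never retried (see Gotchas).
      result.errors.push(`row ${row.id}: ${(e as Error).message}`);
    }
  }
  return result;
}
```

A failure on one row leaves it `< now`, so the next 5-minute tick picks it up again.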

6. Key files

7. Env vars & config

| Var | Required | Purpose | What breaks without it |
| --- | --- | --- | --- |
| TZ | no | IANA timezone passed to node-cron; defaults to UTC | Cron still runs, but logs and any timezone-sensitive logic use UTC. On Azure App Service the container default is UTC regardless of region. |
| DATABASE_URL | yes | Prisma connection for the job | Cron ticks throw on every run; the errors are swallowed by the outer try/catch in subscription-cron.ts:L68. |
| FRONTEND_URL | no | Used in (stubbed) expiration email templates | Links in emails would be broken; emails are currently TODO, so no real impact. |
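A minimal local setup, assuming standard dotenv-style loading — all values below are illustrative placeholders, not real credentials or defaults:

```shell
# .env -- values are placeholders for local development
DATABASE_URL=postgresql://user:pass@localhost:5432/eziseller
TZ=UTC                               # optional; node-cron falls back to UTC anyway
FRONTEND_URL=http://localhost:3000   # only referenced by the stubbed email templates
```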

8. Gotchas & troubleshooting

  • Crashed server stops all crons. One process hosts HTTP + cron + notification loop. If the Node process dies and Azure restarts it, any tick that was due during downtime is lost; node-cron does not replay missed schedules. The job is idempotent (it queries < now), so the next tick will catch up, but expiries can be delayed by roughly the length of the downtime.
  • sendTrialReminders is not wired up. send-trial-reminders.ts is fully written but nothing in schedulers/ registers it and server.ts never calls it. Trial users currently get no pre-expiry reminder emails in production. Fix: add a second cron.schedule (daily at e.g. 0 9 * * *) in a new scheduler or in subscription-cron.ts.
  • Long-running jobs block HTTP. Both the cron tick and the notification loop run on the main event loop. A slow Prisma query in updateExpiredSubscriptions will stall incoming API requests. Per-row update calls in a for loop (not a transaction) make this worse as the tenant count grows.
  • No retry on failure. If an individual prisma.update inside the job throws, the error is pushed into results.errors and logged — it is not retried. If the whole tick throws, it is caught and logged in subscription-cron.ts:L68-L70; the next tick fires 5 minutes later and picks up any rows that are still < now.
  • No distributed lock. If you ever run more than one backend instance (Azure App Service scale-out > 1), every instance will run the cron, and both will race to update the same expired rows. Prisma's update will succeed on both (last write wins) — no data corruption, but log noise and duplicated work. For catalog unpublishing the updateMany is idempotent, so this is currently safe but fragile.
  • getSubscriptionCronStatus().nextRun is a lie. It returns now + 5 minutes, not the actual next tick time from node-cron. Don't trust it for debugging timing issues — check the actual execution log timestamps instead. See subscription-cron.ts:L97-L106.
  • SIGINT/SIGTERM handlers call process.exit(0). They are registered inside the scheduler module. This is fine in practice, but because the cron's handler exits the process, any graceful-shutdown handlers registered after it will never run if it fires first.
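If you do scale out before moving to a queue, a Postgres advisory lock can keep the tick single-flight across instances. A sketch with the query function injected — in the real app the two SELECTs would go through prisma.$queryRaw, the `Sql` type is a simplified stand-in, and the lock key 42 is an arbitrary choice; none of this exists in the codebase today:

```typescript
// Sketch: single-flight cron tick across instances using a Postgres
// advisory lock. `sql` stands in for prisma.$queryRaw and is assumed to
// return the boolean result of the SELECT.
type Sql = (query: string) => Promise<boolean>;

async function withAdvisoryLock(
  sql: Sql,
  tick: () => Promise<void>,
): Promise<'ran' | 'skipped'> {
  // pg_try_advisory_lock returns false immediately if another instance
  // already holds the lock -- no blocking, the tick is simply skipped.
  const acquired = await sql('SELECT pg_try_advisory_lock(42)');
  if (!acquired) return 'skipped';
  try {
    await tick();
    return 'ran';
  } finally {
    await sql('SELECT pg_advisory_unlock(42)');
  }
}
```

The losing instance skips the tick entirely instead of racing the winner on the same rows.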

9. Extension points

  • Add a new cron: create backend/schedulers/<name>-cron.ts following the shape of subscription-cron.ts, then call its .start() from backend/server.ts after subscriptionCron.start(). Use the same timezone: process.env.TZ || 'UTC' option.
  • Wire up trial reminders: add cron.schedule('0 9 * * *', sendTrialReminders, { timezone: process.env.TZ || 'UTC' }) — once a day at 09:00 server time matches the 1-hour window the job uses internally.
  • Manual trigger from admin UI: mirror the pattern in routes/admin/index.ts:L36-L54 — expose POST /api/admin/jobs/<name> that imports and invokes the job function directly.
  • When to move to BullMQ/Redis: if any of (a) horizontal scale > 1 instance, (b) jobs > a few seconds, (c) you need retries/backoff, or (d) you need missed-tick catch-up. Replace node-cron + in-process jobs with a BullMQ queue backed by Redis and a separate worker process deployed alongside the API.

10. Related docs