6 min read

Type‑Safe Distributed Cron Jobs with Temporal.io & TypeScript: From Definition to Fault‑Tolerant Execution

Learn how to model, schedule, and monitor reliable distributed cron jobs using Temporal.io’s TypeScript SDK, with real‑world code and failure‑handling patterns.
Type‑Safe Distributed Cron Jobs with Temporal.io & TypeScript: From Definition to Fault‑Tolerant Execution

Introduction

Scheduled background work is a staple of modern back‑ends: nightly data imports, periodic email digests, cache warm‑ups, or recurring cleanup tasks.
When those jobs run on a single server they’re easy to reason about, but in a micro‑service world they become fragile:

  • Node crashes or container restarts can silently drop a pending job.
  • Network partitions may cause duplicate executions.
  • Complex dependency graphs (job A must finish before job B runs) are hard to keep consistent.

Temporal.io solves these problems by turning workflows into durable state machines that survive process failures, scale horizontally, and expose a clean TypeScript‑first API. This article walks through building a type‑safe distributed cron system with Temporal’s TypeScript SDK, focusing on:

  1. Defining a typed workflow that represents a recurring job.
  2. Registering the workflow as a cron activity in Temporal.
  3. Handling failures with retries, heartbeating, and compensation.
  4. Observability hooks (metrics & logs) that keep ops teams in the loop.

No marketing fluff—just practical patterns you can copy into a production codebase.


1. Temporal Basics Refresher

Concept What it is Why it matters for cron jobs
Workflow A deterministic state machine written in TypeScript. Guarantees exactly‑once execution semantics even across crashes.
Activity Imperative code that can perform I/O, call external services, etc. Encapsulates the actual “do the work” part of a cron job.
Task Queue Logical channel that workers poll for work. Lets you scale workers independently of the scheduler.
Cron Schedule A cron‑like expression attached to a workflow start request. Turns any workflow into a recurring job without external schedulers.

Temporal persists workflow state in its own replicated database, so the engine can replay a workflow after a failure without losing progress. This is the foundation for reliable cron execution.


2. Setting Up the Project

# Create a fresh Node project
npm init -y
npm i @temporalio/workflow @temporalio/worker @temporalio/client zod

# Optional: install a logger
npm i pino
Tip: All Temporal SDK packages are pure TypeScript, so you get full type inference out of the box.

Create the following folder layout:

src/
├─ activities/
│   └─ emailDigest.ts
├─ workflows/
│   └─ nightlyDigest.ts
├─ worker.ts
└─ client.ts

3. Defining a Typed Activity Interface

First, we model the input and output of our job with Zod schemas. This gives runtime validation and a corresponding TypeScript type.

// src/activities/emailDigest.ts
import { z } from "zod";

export const EmailDigestInput = z.object({
  userId: z.string().uuid(),
  date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/), // YYYY-MM-DD
});

export type EmailDigestInput = z.infer<typeof EmailDigestInput>;

export async function sendDigest(input: EmailDigestInput): Promise<void> {
  // Simulate I/O – in reality you’d query a DB, render a template, call SES, etc.
  console.log(`📧 Sending digest for ${input.userId} on ${input.date}`);
  // Throw to trigger retries
  if (Math.random() < 0.2) {
    throw new Error("Transient email service failure");
  }
}

The exported EmailDigestInput type is type‑safe across the whole codebase. Any attempt to call sendDigest with a malformed payload will be caught at compile‑time.


4. Writing the Workflow

A workflow cannot call arbitrary code directly; it must invoke activities. The SDK provides a proxy that records the call for deterministic replay.

// src/workflows/nightlyDigest.ts
import { defineSignal, defineQuery, setHandler } from "@temporalio/workflow";
import type * as activities from "../activities/emailDigest";
import { EmailDigestInput } from "../activities/emailDigest";

export const startSignal = defineSignal<void>("start");
export const statusQuery = defineQuery<string>("status");

export const nightlyDigest = defineWorkflow<
  // Input type for the workflow start
  { users: string[]; date: string }
>("nightlyDigest", async function (args) {
  const { users, date } = args;
  const { sendDigest } = new Proxy<typeof activities>(undefined, {
    get(_, prop) {
      return async (input: EmailDigestInput) => {
        // Temporal activity call – retries & timeout are declared here
        return await activities[prop as keyof typeof activities](
          input,
          {
            startToCloseTimeout: "1 minute",
            retry: {
              maximumAttempts: 5,
              backoffCoefficient: 2,
            },
          }
        );
      };
    },
  });

  // Expose a simple status query
  let completed = 0;
  setHandler(statusQuery, () => `Completed ${completed}/${users.length}`);

  // Optional: allow an external signal to abort early
  let aborted = false;
  setHandler(startSignal, () => {
    aborted = true;
  });

  for (const userId of users) {
    if (aborted) break;

    const input: EmailDigestInput = { userId, date };
    await sendDigest(input); // Activity call with built‑in retries
    completed++;
  }
});

Key points

  • The workflow is generic – it receives a list of user IDs and a date, making it reusable for any batch size.
  • Activity options (startToCloseTimeout, retry) are declared inline, giving compile‑time guarantees that every call respects the same SLA.
  • Signals and queries let you probe or control a running workflow without breaking determinism.

5. Registering the Cron Schedule

Temporal lets you start a workflow once with a cron expression; the engine automatically creates future runs.

// src/client.ts
import { Connection, WorkflowClient } from "@temporalio/client";
import { nightlyDigest } from "./workflows/nightlyDigest";

async function scheduleNightlyDigest() {
  const connection = await Connection.connect(); // defaults to localhost:7233
  const client = new WorkflowClient({ connection });

  // Register the cron schedule (runs at 02:00 UTC every day)
  await client.start(nightlyDigest, {
    taskQueue: "digest-queue",
    workflowId: "nightly-digest",
    args: {
      users: await fetchAllUserIds(), // your DB call
      date: new Date().toISOString().slice(0, 10),
    },
    cronSchedule: "0 2 * * *",
    // Optional: skip if a previous run is still pending
    memo: { lastRun: Date.now() },
    // Catch‑up policy – if the server was down for 3 days, run 3 missed executions
    cronScheduleOptions: { catchupWindow: "72h" },
  });

  console.log("✅ Nightly digest cron registered");
}

scheduleNightlyDigest().catch(console.error);

cronSchedule follows the standard POSIX syntax. Temporal guarantees at‑most‑once execution per schedule tick, even if workers restart.


6. Running Workers

Workers poll the task queue, execute activities, and replay workflows when needed.

// src/worker.ts
import { Worker } from "@temporalio/worker";
import * as activities from "./activities/emailDigest";
import { nightlyDigest } from "./workflows/nightlyDigest";

async function run() {
  const worker = await Worker.create({
    workflowsPath: require.resolve("./workflows/nightlyDigest"),
    activities,
    taskQueue: "digest-queue",
    // Concurrency control – prevent overloading the email provider
    maxConcurrentActivityTaskExecutions: 5,
    // Heartbeat timeout protects against hung activities
    activityHeartbeatTimeout: "30 seconds",
  });

  await worker.run();
}

run().catch((err) => {
  console.error(err);
  process.exit(1);
});

Why this matters

  • Deterministic replay ensures that if a worker crashes after sending half the emails, the workflow state is restored and the remaining users are processed without duplication.
  • Activity heartbeating lets Temporal detect hung email sends (e.g., network dead‑ends) and retry automatically.

7. Failure Handling Patterns

7.1 Retries & Backoff

The activity options we set (maximumAttempts, backoffCoefficient) give us exponential back‑off out of the box. For non‑idempotent steps, wrap them in a compensation activity.

// src/activities/emailDigest.ts
export async function compensateSendDigest(input: EmailDigestInput): Promise<void> {
  console.log(`🛑 Compensation: mark digest as unsent for ${input.userId}`);
  // e.g., write a flag to a DB so the next run can retry safely
}

In the workflow:

await sendDigest(input).catch(async (err) => {
  await compensateSendDigest(input);
  throw err; // re‑throw to let the workflow fail and be retried globally
});

7.2 Timeout‑Based Cancellation

If a run consistently exceeds its window (e.g., the email provider is down for an hour), you may want to cancel the workflow.

import { CancellationScope } from "@temporalio/workflow";

await CancellationScope.run(async () => {
  // 30‑minute hard deadline for the whole batch
  CancellationScope.current().cancelAfter(30 * 60 * 1000);
  // ...activity loop here
});

7.3 Monitoring & Alerts

Temporal emits visibility events (STARTED, COMPLETED, FAILED). You can forward them to Prometheus, Datadog, or any observability stack via the Telemetry SDK.

// worker.ts (partial)
import { OpenTelemetryInterceptor } from "@temporalio/interceptors-opentelemetry";

const worker = await Worker.create({
  // …
  interceptors: {
    workflowModules: [OpenTelemetryInterceptor],
    activityModules: [OpenTelemetryInterceptor],
  },
});

Now every activity start/end appears as a span, and failures surface as error events.


8. Testing the Cron Workflow Locally

Temporal provides an in‑memory test server that makes unit‑testing straightforward.

// src/__tests__/nightlyDigest.test.ts
import { TestWorkflowEnvironment } from "@temporalio/testing";
import { nightlyDigest } from "../workflows/nightlyDigest";

test("runs all users exactly once", async () => {
  const env = await TestWorkflowEnvironment.createLocal();
  const { client, worker } = env;

  await worker.runUntil(async () => {
    const handle = await client.start(nightlyDigest, {
      taskQueue: "test-queue",
      workflowId: "test-digest",
      args: { users: ["a","b","c"], date: "2024-01-01" },
    });

    await handle.result(); // wait for completion
    const status = await handle.query("status");
    expect(status).toBe("Completed 3/3");
  });

  await env?.teardown();
});

The test validates exact‑once semantics without spinning up a real Temporal cluster.


9. Scaling Considerations

Concern Temporal Feature Practical Tip
Throughput Multiple workers on the same task queue Deploy workers as a Kubernetes Deployment with replicas: N.
Rate limits Activity throttling via maxConcurrentActivityTaskExecutions Tune per‑activity based on third‑party API quotas.
Geographic isolation Namespaces & task queues per region Keep a separate namespace for EU‑GDPR‑bound jobs.
Version upgrades Workflow versioning (patch & continueAsNew) Use continueAsNew after processing >10k items to keep history size bounded.

10. Recap & Checklist

  • Define input schemas with Zod (or Yup) → compile‑time + runtime safety.
  • Write pure workflows that call activities via the generated proxy.
  • Attach a cron schedule when starting the workflow; Temporal guarantees at‑most‑once execution.
  • Configure retries, timeouts, and heartbeats on activities to survive transient failures.
  • Add compensation activities for non‑idempotent steps.
  • Instrument with OpenTelemetry for visibility.
  • Test with the in‑memory server to ensure deterministic behavior.

By embracing Temporal’s deterministic model and TypeScript’s type system, you get a distributed cron platform that is:

  1. Reliable – no missed runs, no duplicate executions.
  2. Observable – every step is traceable and metric‑driven.
  3. Scalable – add workers, adjust concurrency, and let Temporal handle the coordination.

Give it a spin in a sandbox project, then migrate your existing node‑cron or agenda jobs to Temporal—your future self will thank you when the next outage hits.