Skip to content

NATS JetStream Streams, Consumers, and Replay

NATS JetStream Streams, Consumers, and Replay

Section titled “NATS JetStream Streams, Consumers, and Replay”

This article focuses on how to use JetStream once it is deployed:

  1. Create streams with subjects and retention rules.
  2. Build consumers that acknowledge work explicitly.
  3. Replay old data for recovery or backfill.
  4. Write Go code that treats message delivery as stateful work.

If the deployment article is about keeping the broker healthy, this one is about using the durable message log correctly.

A JetStream stream is a named log of messages that match one or more subjects.

Typical use cases:

  1. Durable event fan-out.
  2. Queue-like job distribution.
  3. Audit and replay.
  4. Backfill after a consumer bug.

Create a stream that captures a subject hierarchy:

Terminal window
nats stream add EVENTS \
--subjects "events.>" \
--storage file \
--retention limits \
--max-msgs=-1 \
--max-age=72h

That configuration means:

  1. Store all events.* traffic.
  2. Keep data on disk.
  3. Retain messages for 72 hours.

Design your subjects so the stream can absorb a logical family of events:

  1. events.orders.created
  2. events.orders.paid
  3. events.orders.shipped

Avoid overloading a stream with unrelated subject trees.

JetStream consumers come in two broad forms:

  1. Push consumers deliver messages to a subscription.
  2. Pull consumers let the application fetch messages when ready.

Push consumers are convenient when you want the broker to drive delivery. Pull consumers are better when you want the application to control batching and backpressure.

Example durable push consumer:

Terminal window
nats consumer add EVENTS orders-worker \
--deliver all \
--ack explicit \
--replay instant \
--filter "events.orders.>"

Example pull consumer:

Terminal window
nats consumer add EVENTS orders-batch \
--deliver all \
--ack explicit \
--replay instant \
--pull

Use pull consumers when:

  1. You need batch processing.
  2. The downstream system has tight rate limits.
  3. You want the worker to control concurrency.

Consumer acknowledgements are the core reliability mechanism.

The workflow is simple:

  1. Consumer receives a message.
  2. Consumer processes it.
  3. Consumer acks only after durable side effects complete.
  4. If the ack never arrives, JetStream redelivers.

This makes consumers safe to restart, but it also means handlers must be idempotent.

Ack handling in Go:

package main
import (
"context"
"log"
"time"
"github.com/nats-io/nats.go"
)
func consumeOrders(ctx context.Context, nc *nats.Conn) error {
js, err := nc.JetStream()
if err != nil {
return err
}
msgs, err := js.PullSubscribe(
"events.orders.>",
"orders-batch",
nats.BindStream("EVENTS"),
)
if err != nil {
return err
}
for {
batch, err := msgs.Fetch(10, nats.Context(ctx), nats.MaxWait(5*time.Second))
if err != nil {
return err
}
for _, msg := range batch {
if err := processOrder(msg.Data); err != nil {
_ = msg.Nak()
continue
}
if err := msg.Ack(); err != nil {
return err
}
}
}
}
func processOrder(data []byte) error {
log.Printf("process order event: %s", string(data))
return nil
}

Notice the ordering:

  1. Process first.
  2. Ack second.
  3. Nak on recoverable failures.

If the worker crashes after processing but before acking, the message can come back. That is expected.

JetStream replay is what makes a stream useful after a defect or deployment mistake.

Common replay modes:

  1. Replay everything from the beginning.
  2. Replay only from a time window.
  3. Replay only for a specific subject filter.

That makes recovery workflows practical:

  1. Deploy a fix.
  2. Rewind the consumer or create a new one.
  3. Reprocess from the stream.
  4. Compare output counts and spot gaps.

Backfill is especially useful when:

  1. A consumer wrote bad data downstream.
  2. A new index or materialized view must be rebuilt.
  3. A service needs to rebuild a cache from historical events.

If you need a one-off backfill, create a separate consumer instead of rewinding the production consumer blindly.

Producers only need to know the subject and the payload.

package main
import (
"context"
"encoding/json"
"github.com/nats-io/nats.go"
)
type OrderCreated struct {
OrderID string `json:"order_id"`
UserID string `json:"user_id"`
Total int64 `json:"total_cents"`
Currency string `json:"currency"`
}
func publishOrderCreated(ctx context.Context, nc *nats.Conn, event OrderCreated) error {
_ = ctx
js, err := nc.JetStream()
if err != nil {
return err
}
payload, err := json.Marshal(event)
if err != nil {
return err
}
_, err = js.Publish("events.orders.created", payload)
return err
}

The producer should not decide how the message is consumed. It should only publish a clear event to a stable subject.

Pull consumers are a good default for batch-oriented processing.

package main
import (
"context"
"time"
"github.com/nats-io/nats.go"
)
func runBatchWorker(ctx context.Context, nc *nats.Conn) error {
js, err := nc.JetStream()
if err != nil {
return err
}
sub, err := js.PullSubscribe(
"events.orders.>",
"orders-batch",
nats.BindStream("EVENTS"),
nats.AckExplicit(),
nats.MaxAckPending(1000),
)
if err != nil {
return err
}
for {
msgs, err := sub.Fetch(50, nats.Context(ctx), nats.MaxWait(10*time.Second))
if err != nil {
return err
}
for _, msg := range msgs {
if err := handleOrderEvent(msg.Data); err != nil {
_ = msg.Nak()
continue
}
if err := msg.Ack(); err != nil {
return err
}
}
}
}
func handleOrderEvent(data []byte) error {
_ = data
return nil
}

Operationally, the important knobs are:

  1. Fetch batch size.
  2. MaxAckPending.
  3. Ack timeout.
  4. Retry and backoff behavior.

Tune those based on downstream latency, not just broker capacity.

JetStream gives you delivery guarantees, not business-level exactly-once semantics.

Design consumers so repeated delivery is harmless:

  1. Use event IDs.
  2. Track processed IDs in a durable store.
  3. Make writes idempotent.
  4. Use upserts when appropriate.

Ordering is also subtle:

  1. Stream order is preserved within the log.
  2. Parallel consumers can process out of order.
  3. A redelivery may arrive after newer messages are already handled.

That means consumers should rely on event timestamps or sequence numbers when strict ordering matters.

Useful patterns for production:

  1. Use one stream per domain or lifecycle boundary.
  2. Keep dead-letter subjects separate from mainline traffic.
  3. Create durable consumers for long-lived workers.
  4. Use ephemeral consumers for ad hoc inspection.
  5. Add replay scripts for recovery jobs.

Examples:

  1. orders.events.v1
  2. orders.audit.v1
  3. orders.deadletter.v1

Avoid:

  1. One catch-all stream for unrelated systems.
  2. Ad hoc consumers created by every pod startup.
  3. No retention policy.
  4. No accounting for duplicate deliveries.

JetStream is straightforward when you keep the model honest:

  1. Streams are durable logs.
  2. Consumers track acknowledgement state.
  3. Replay is a recovery tool.
  4. Idempotency is mandatory.

If you want durable messaging with operationally simple semantics, JetStream is a strong fit. If you want workflow orchestration or transactional processing, use something else.