Configuring Buffering and Flush Behavior in JS Client



What Buffering Means in a JS Client

When a JavaScript client records a metric, log entry, or analytics event, it rarely sends a network request right away. Instead, it places the item in an in-memory queue. That queue is the buffer.

The reason is straightforward. If you record 20 metrics during a single request handler, making 20 separate HTTP calls would be wasteful. The client holds them, groups them into a batch, and sends one request later. Segment’s Node library documentation describes this explicitly: every method call does not result in an HTTP request because messages are queued in memory and flushed in the background for faster operation.

This tradeoff, fewer network calls in exchange for delayed sending, is the core of buffering. It keeps instrumentation cheap and fast. But it creates a problem: data sitting in memory can be delayed or lost if the process, page, or serverless invocation ends before the buffer is sent.

What Flush Means

Flushing is the client’s attempt to send the buffered queue. It can happen automatically or manually.

Most clients flush on two automatic triggers. First, a time trigger: send whatever is queued every N milliseconds. Second, a size trigger: send when the queue reaches a certain number of items. LaunchDarkly’s SDK documentation explains that their SDKs automatically flush pending analytics events at intervals to avoid constant network requests.
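The two triggers can be sketched in a few lines. Everything here (BufferedClient, the send callback) is illustrative, not any particular SDK's API:

```javascript
// Minimal sketch of a buffered client with both automatic triggers.
// All names (BufferedClient, send) are illustrative, not a real SDK.
class BufferedClient {
  constructor({ flushAt = 15, flushInterval = 10_000, send }) {
    this.flushAt = flushAt;
    this.send = send; // function that performs the actual network call
    this.queue = [];
    // Time trigger: flush whatever is queued every flushInterval ms.
    this.timer = setInterval(() => this.flush(), flushInterval);
    this.timer.unref?.(); // don't keep a Node process alive just for this timer
  }

  enqueue(event) {
    this.queue.push(event);
    // Size trigger: flush as soon as the queue reaches flushAt items.
    if (this.queue.length >= this.flushAt) this.flush();
  }

  flush() {
    if (this.queue.length === 0) return;
    const batch = this.queue.splice(0); // drain the queue
    this.send(batch);                   // one request for the whole batch
  }
}
```

With flushAt: 3, the third enqueue drains the queue into a single batch; the interval timer covers the low-volume case where the count trigger never fires.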

Manual flushing means the developer calls flush() or forceFlush() directly. This is useful at lifecycle boundaries: before a CLI exits, at the end of a serverless handler, or during a graceful server shutdown.

One thing to be clear about: flush does not mean delivered. It means the client attempted to drain its queue. The network can be down, the runtime can terminate mid-send, the payload can exceed size limits, or the backend can reject the request. Sentry’s documentation makes this concrete: their flush(timeout) method waits up to a maximum time, and transports can drop events if sending fails because of connection loss.

The Configuration Knobs

Configuring buffering and flush behavior in a JS client usually comes down to a handful of settings. Here is what each one controls and when to adjust it.

Flush Interval

The flush interval is the time-based trigger. It determines the maximum normal delay before buffered data is sent, assuming the process stays alive.

Common names: flushInterval, scheduledDelayMillis, export interval.

OpenTelemetry’s BatchSpanProcessor defaults to a scheduledDelayMillis of 5,000 ms. Segment’s Node library uses a flushInterval of 10,000 ms. Lower intervals mean fresher data but more network calls.

Flush-At Count and Batch Size

This trigger fires when the queue reaches a certain number of items. It reduces per-event overhead but increases latency for low-volume applications.

Common names: flushAt, batchSize, maxExportBatchSize.

Segment’s Node library defaults to flushAt: 15, meaning 15 queued messages trigger a send. OpenTelemetry defaults to a maxExportBatchSize of 512. Segment Classic also documents a maximum of 500 KB per batch request and 32 KB per individual call.

Max Queue Size

Every buffer needs a cap. Without one, a network outage or event storm can turn telemetry into a memory problem.

OpenTelemetry’s spec drops spans once maxQueueSize (default 2,048) is reached. The AWS Powertools RFC for TypeScript proposes limiting buffers by entry count or byte size, with a sliding-window approach that drops oldest items after the buffer exceeds its threshold.
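A drop-oldest cap along the lines of the Powertools proposal might look like this sketch (the class and its fields are hypothetical):

```javascript
// Sketch of a bounded buffer with a drop-oldest (sliding-window) policy.
// Illustrative only: OpenTelemetry drops *new* spans when full; this shows
// the alternative policy described in the AWS Powertools RFC.
class BoundedBuffer {
  constructor(maxQueueSize) {
    this.maxQueueSize = maxQueueSize;
    this.items = [];
    this.dropped = 0; // count of discarded items, useful for diagnostics
  }

  push(item) {
    this.items.push(item);
    // Evict the oldest items once the cap is exceeded.
    while (this.items.length > this.maxQueueSize) {
      this.items.shift();
      this.dropped++;
    }
  }
}
```

Tracking a dropped counter matters either way: silently losing telemetry is acceptable, losing it without knowing is not.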

Timeout

Flushes should have timeouts. A flush during shutdown should not wait forever if the network is down. Sentry’s flush(timeout) waits up to a maximum time in milliseconds. OpenTelemetry’s spec requires that ForceFlush and Shutdown report success, failure, or timeout.
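One way to bound a flush is to race it against a deadline. This sketch assumes flushFn is any function returning a promise for the send attempt:

```javascript
// Sketch: bound a flush with a deadline using Promise.race.
// flushFn is any function returning a promise for the send attempt.
async function flushWithTimeout(flushFn, timeoutMs) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  try {
    // Whichever settles first wins: the flush or the deadline.
    return await Promise.race([
      flushFn().then(() => "flushed"),
      deadline,
    ]);
  } finally {
    clearTimeout(timer); // don't leave a stray timer behind
  }
}
```

Note that "timeout" here does not cancel the underlying send; it only stops the caller from waiting on it, which is exactly what a shutdown path needs.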

Retry Count

Some clients retry failed sends. Segment’s Node library supports maxRetries: 3. Retries improve resilience but can delay shutdown if the network is unreachable.
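A bounded retry loop is straightforward to sketch. This is illustrative, not Segment's implementation; here maxRetries means additional attempts after the first:

```javascript
// Sketch of a bounded retry loop for a failed send (illustrative).
// maxRetries = 3 allows up to four attempts in total.
async function sendWithRetries(sendFn, maxRetries) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await sendFn(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // all attempts exhausted
}
```

Real SDKs usually add backoff between attempts; the point here is only that the attempt count is capped, so a dead network delays shutdown by a bounded amount rather than forever.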

Here is a summary table:

| Option | What it controls | Common names | Tradeoff |
| --- | --- | --- | --- |
| Flush interval | Time between automatic sends | flushInterval, scheduledDelayMillis | Lower = fresher data, more requests |
| Flush-at count | Queue length that triggers send | flushAt, batchSize, maxExportBatchSize | Larger batch = fewer requests, more delay |
| Max queue size | Maximum buffered items | maxQueueSize, buffer limit | Prevents memory growth, may drop data |
| Timeout | Max time a flush can wait | timeout, exportTimeoutMillis | Prevents hanging, can lose data |
| Retry count | Resend attempts on failure | maxRetries, fetchRetryCount | More resilience, slower shutdown |

Flush vs Close vs Shutdown

This distinction trips up a lot of developers. A Reddit user working with the InfluxDB JavaScript client described their confusion: calling writeApi.close() in a loop prevented continuous sending. The community reply explained that close() is meant for when all work is done, because it flushes remaining data and cancels pending retries.

Here is how to think about these methods:

  • flush(): Attempt to send pending data now. The client usually remains usable afterward.
  • forceFlush(): Stronger wording used by observability SDKs like OpenTelemetry. Still timeout-bound.
  • close(): Flush pending data, release resources, stop accepting new work. Sentry’s docs say that after close(), the current client cannot be used anymore.
  • shutdown(): Final lifecycle call. Should normally be called once.

The rule is simple: use flush() at lifecycle boundaries when you want the client to keep working. Use close() or shutdown() when you are done with the client entirely.

Do not call close() after every request on a long-lived, reusable client. That is the most common mistake developers make when configuring buffering and flush behavior in a JS client for the first time.
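The contrast can be made concrete with a toy client (all names illustrative):

```javascript
// Toy client contrasting flush() and close() semantics (illustrative).
class Client {
  constructor(send) {
    this.send = send;
    this.queue = [];
    this.closed = false;
  }
  record(event) {
    if (this.closed) throw new Error("client is closed");
    this.queue.push(event);
  }
  flush() {
    // Drain the queue; the client stays usable afterward.
    if (this.queue.length) this.send(this.queue.splice(0));
  }
  close() {
    // Final flush, then stop accepting new work.
    this.flush();
    this.closed = true;
  }
}
```

After flush(), recording continues normally; after close(), further record() calls fail, which is exactly the InfluxDB loop bug described above.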

How to Choose Flush Behavior by Runtime

The right flush configuration depends entirely on how long your JavaScript process lives. A Node.js API server and a Cloudflare Worker have completely different lifecycle constraints. If you are new to why serverless changes application lifecycle, it is worth understanding the basics before choosing a flush strategy.

Long-Running Node.js Server

Use automatic batching. Set a reasonable flushInterval and flushAt. Flush on graceful shutdown when the process receives SIGTERM.

Do not await flush() on every request unless the data volume is tiny or you are debugging. Per-request flushing adds latency and defeats the purpose of batching.

const client = createMetricsClient({
  flushInterval: 10_000, // time trigger: send at most every 10 seconds
  flushAt: 50,           // size trigger: send once 50 items are queued
  maxQueueSize: 1_000,   // cap the buffer so an outage cannot grow memory
});

process.on("SIGTERM", async () => {
  server.close(); // stop accepting new requests first
  try {
    await client.flush({ timeout: 2_000 }); // bounded final flush
  } finally {
    process.exit(0);
  }
});

Serverless Functions and Edge Handlers

This is where configuring buffering and flush behavior in a JS client gets interesting, and where most data loss happens.

In a serverless function, the handler may return before background async work completes. AWS Lambda documentation shows that unfinished callbacks can resume in later invocations if the handler exits first, which means your metrics might arrive attached to the wrong request, or never arrive at all.

Practitioners on Reddit report this exact class of failure. One developer described Lambda functions under one second where an extension’s log queue did not flush before termination, and artificially adding delays improved reliability. That is a workaround, not a solution.

The real fix is to flush explicitly or use the platform’s lifecycle hook.

Cloudflare Workers

Cloudflare Workers provide ctx.waitUntil() to extend the Worker’s lifetime after the response is sent. Cloudflare’s documentation lists external analytics providers as a common use case, but notes a 30-second limit and recommends Queues for work that must be guaranteed.

export default {
  async fetch(request, env, ctx) {
    const metrics = createMetricsClient(env.METRICS_TOKEN);
    metrics.count("request_total", 1);
    // Extend the Worker's lifetime past the response so the flush can finish.
    ctx.waitUntil(metrics.flush());
    return new Response("ok");
  },
};

This pattern keeps the flush out of the response path. The user gets their response immediately, and the metrics send happens in the background. For a complete working example, see the Cloudflare Workers metrics quickstart or the guide on instrumenting Cloudflare Workers with counters and histograms.

Vercel and Next.js

Next.js provides after() for post-response work. In serverless contexts, it relies on a waitUntil(promise) primitive to extend invocation lifetime until promises settle.

import { after } from "next/server";

export async function GET() {
  const metrics = createMetricsClient();
  metrics.count("request_total", 1);

  // after() defers this work until the response has been sent.
  after(async () => {
    await metrics.flush();
  });

  return Response.json({ ok: true });
}

For Vercel-specific setup details, the Vercel metrics quickstart walks through the complete pattern.

Browser Pages

Browser telemetry is inherently best-effort. MDN recommends sending end-of-session analytics on visibilitychange, not unload or beforeunload, because those events are unreliable on many mobile browsers.

sendBeacon() is designed for this use case. It is asynchronous, does not block navigation, and sends data via POST. But the queued data limit is 64 KiB. The fetch API with keepalive: true has the same 64 KiB body limit.

document.addEventListener("visibilitychange", () => {
  if (document.visibilityState === "hidden") {
    // sendBeacon queues the POST without blocking navigation.
    navigator.sendBeacon("/metrics", JSON.stringify(buffer.drain()));
  }
});

Keep payloads small. Do not try to flush a large queue during page close.
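If the queue can exceed that limit, one option is to split the drained events into beacon-sized chunks before sending. This is a sketch with an illustrative threshold and function name, not a documented API:

```javascript
// Sketch: split drained events into chunks whose serialized size stays
// under the 64 KiB sendBeacon limit. Threshold and names are illustrative.
function chunkForBeacon(events, maxBytes = 60_000) {
  const chunks = [];
  let current = [];
  for (const event of events) {
    current.push(event);
    // If the serialized chunk grew too large, move the last event
    // into a fresh chunk.
    if (JSON.stringify(current).length > maxBytes && current.length > 1) {
      current.pop();
      chunks.push(current);
      current = [event];
    }
  }
  if (current.length) chunks.push(current);
  return chunks;
}
```

A single event larger than maxBytes is still emitted alone (and may exceed the limit); the length guard only prevents an empty chunk. Measuring by JSON string length is an approximation of byte size that is good enough for ASCII-heavy payloads.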

CLI, Scripts, and Tests

Short-lived programs exit before background queues drain. Always await flush() or await close() before the process exits. Use a timeout so the process does not hang forever.

LaunchDarkly’s documentation says that short-lived processes may close before the SDK has a chance to flush and recommends calling flush() manually. For test environments, setting flushAt: 1 can be useful for immediate visibility, though it should not be a production default.

Where Should Flush Live in Your Code?

A developer on the InfluxData community forum asked a question that reveals a common gap in SDK documentation: in a REST API with async writes, should flush() belong in model functions, request handlers, or a periodic background job?

The answer depends on the application lifecycle:

  • Long-running server: Let automatic intervals handle it. Add explicit flush only on SIGTERM or graceful shutdown.
  • Serverless handler: Flush at the end of the handler, or schedule it with ctx.waitUntil() / after().
  • Browser page: Flush on visibilitychange.
  • CLI or test: Flush before exit.

The underlying principle is this: flush belongs at lifecycle boundaries, not scattered through business logic.

Common Mistakes When Configuring Buffering and Flush Behavior

Flushing After Every Event in Production

Setting flushAt: 1 in production destroys the main benefit of buffering. Segment Classic is explicit about this: batching can be disabled by setting flushAt to 1, but this is useful for debugging, not performance-sensitive environments.

Forgetting to Flush Before Exit

If the process ends before the buffer drains, the data is gone. This is the single most common source of missing telemetry in serverless environments and short-lived scripts.

Confusing Close With Flush

Using close() when you mean flush() can disable the client mid-lifecycle. InfluxDB’s JavaScript write guide uses writeApi.close() as the final step to flush pending writes and close the API. It is not meant to be called repeatedly.

Letting Queues Grow Without Bounds

Unbounded buffers can crash your application during network outages or event spikes. OpenTelemetry drops spans when the queue is full. AWS Powertools’ RFC discusses byte-size caps and sliding-window drop policies. Metrics and analytics should not be allowed to bring down the application they are measuring.

Assuming Flush Means Guaranteed Delivery

Reddit practitioners discussing analytics libraries pointed out that “no event should be lost” is unrealistic in arbitrary browser environments because scripts can be blocked, networks can fail, and the environment is not controlled. This matches the constraints documented by MDN, Cloudflare, and OpenTelemetry.

Ignoring Race Conditions During Flush

PostHog GitHub issues reveal a subtle edge case: events captured while a flush is already in progress can remain queued until the next flush, even with flushAt: 1 and flushInterval: 0. Another issue shows that asynchronously constructed events may not be known to flush() or shutdown() yet.

The lesson: if the event creation itself is async, await the enqueue step before flushing. And test the exact client version you are using.
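In that situation the safe ordering looks like this sketch (client and buildEvent are illustrative stand-ins):

```javascript
// Sketch: when constructing an event is itself async, await the enqueue
// step before flushing, or flush() may run before the event is queued.
// client and buildEvent are illustrative, not a real SDK's API.
async function captureAndFlush(client, buildEvent) {
  const event = await buildEvent(); // e.g. resolves user or context info
  client.enqueue(event);            // now the event is actually in the queue
  await client.flush();             // safe: flush sees the event
}
```

Swapping the first two lines for a fire-and-forget `buildEvent().then(...)` reproduces the PostHog-style race: the flush drains an empty queue while the event is still being built.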

Durable Telemetry vs Freshness-Only Telemetry

Not all buffered data deserves the same treatment. Socket.IO’s documentation provides a useful distinction: events emitted while disconnected are buffered and replayed on reconnection, which can create a spike. For events where only the latest value matters (cursor positions, typing indicators), Socket.IO offers volatile events that are intentionally dropped when the connection is not ready.

Apply this thinking to your own telemetry. Metrics counters can tolerate batching and brief delays. Realtime data that is stale by the time it arrives probably should not be buffered and replayed at all.
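For freshness-only data, a "latest value wins" buffer is one alternative to a replayed queue. This is an illustrative sketch, not Socket.IO's API:

```javascript
// Sketch of a "latest value wins" buffer for freshness-only telemetry.
// Instead of queueing every update, keep only the newest value per key,
// so a reconnect never replays a backlog of stale readings.
class LatestValueBuffer {
  constructor() {
    this.latest = new Map();
  }
  set(key, value) {
    this.latest.set(key, value); // overwrite: older values are dropped
  }
  drain() {
    const out = Object.fromEntries(this.latest); // snapshot current values
    this.latest.clear();
    return out;
  }
}
```

A thousand cursor moves during an outage drain to a single entry, which is all the receiver could use anyway.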

Practical Checklist

Before shipping your flush configuration, answer these questions:

  1. Is this application long-running or short-lived?
  2. What is the acceptable delay before telemetry reaches the backend?
  3. How many items can the buffer hold before memory becomes a concern?
  4. What happens when the queue is full? Drop oldest, drop newest, or block?
  5. What timeout should flush use? (Two to five seconds is a reasonable starting point for most serverless handlers.)
  6. Does calling close() or shutdown() disable the client?
  7. How is flush failure reported? Promise rejection? Callback? Silent drop?
  8. Have you tested behavior on process exit, network failure, and serverless timeout?

If you are building metrics instrumentation for a JavaScript application, the Distlang Metrics JavaScript client guide covers these patterns with concrete examples.

Putting It All Together

Configuring buffering and flush behavior in a JS client is not about finding the perfect flushInterval value. It is about matching your flush strategy to your runtime lifecycle. The correct configuration for a Node.js API server is wrong for a Cloudflare Worker. The right approach for a browser page is wrong for a CLI script.

The defaults from major SDKs are a reasonable starting point:

  • OpenTelemetry: 5-second interval, batch size 512, queue limit 2,048
  • Segment: 10-second interval, flush at 15 items, 3 retries
  • LaunchDarkly: 5-second automatic flush interval

Adjust from there based on your volume, latency tolerance, and runtime constraints.

For serverless metrics specifically, the pattern is consistent across platforms: record metrics during the request, flush after the response using the platform’s lifecycle primitive, and accept that delivery is best-effort unless you add a durable queue. The metrics quickstart shows how to go from instrumentation to a working dashboard in minutes, and the end-to-end app-to-dashboard example walks through the full flow.

Related Terms

  • Backpressure: When a consumer cannot keep up with a producer. In Node.js streams, write() returning false means callers should wait for 'drain'. Related to buffering but operates at a lower level.
  • Batch: A group of items sent together in one network request. Batching is the primary reason JS clients buffer data.
  • Drain: In Node.js streams, an event indicating the buffer has been emptied and writing can resume. Sometimes used loosely in SDK contexts to mean “empty the queue.”
  • Volatile event: An event intentionally dropped if the connection is not ready, as in Socket.IO. Useful for freshness-only data.
  • Beacon: The browser’s sendBeacon() API, designed for analytics payloads during page unload, with a 64 KiB limit.
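The backpressure and drain behavior is easy to observe directly with Node's built-in Writable stream:

```javascript
import { Writable } from "node:stream";

// Backpressure in a Node.js Writable: write() returns false once the
// internal buffer passes highWaterMark, and 'drain' fires when it empties.
const sink = new Writable({
  highWaterMark: 4, // tiny buffer (in bytes) so backpressure triggers quickly
  write(chunk, _encoding, callback) {
    setImmediate(callback); // pretend each chunk takes a tick to consume
  },
});

const ok = sink.write("ab");       // 2 bytes: fits under the high-water mark
const full = sink.write("cdefgh"); // pushes the buffer past 4 bytes: false
sink.once("drain", () => {
  // buffer emptied — safe to resume writing here
});
```

The `false` return is advisory: the write is still buffered, but a well-behaved producer pauses until `'drain'` instead of piling more data on.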

FAQ

Does flush() guarantee that my data was delivered?

No. Flush means the client attempted to send what was in its buffer. The attempt can fail because of network errors, timeouts, payload size limits, runtime termination, or ad blockers. Some SDKs resolve the flush promise after the attempt regardless of success. Always check your specific SDK’s semantics.

Should I call flush() after every event?

Usually not. That eliminates the performance benefit of batching. Use per-event flushing only in tests, debugging, very low-volume workloads, or when a specific event is critical enough to justify the extra latency and network calls.

What is the difference between flush() and close()?

flush() attempts to send pending data while keeping the client usable. close() or shutdown() flushes pending data and then disables or releases the client. Sentry’s documentation makes this explicit: after close(), the client cannot be used anymore. Use flush() at lifecycle boundaries and close() only when you are done with the client.

How do I avoid losing metrics in serverless functions?

Flush before the handler exits, or schedule the flush using the platform’s background-lifetime primitive. In Cloudflare Workers, that is ctx.waitUntil(client.flush()). In Next.js on Vercel, use after(). For telemetry that absolutely must arrive, push to a durable queue rather than relying on best-effort background sends.

What should my flushInterval be?

It depends on your latency tolerance and traffic volume. OpenTelemetry defaults to 5 seconds. Segment defaults to 10 seconds. For serverless handlers that live for milliseconds, the interval may never fire, so you need manual or lifecycle-triggered flushing instead.

What happens if my buffer fills up?

That depends on the SDK’s drop policy. OpenTelemetry drops new spans once maxQueueSize is reached. Other clients may drop the oldest items. Without a max queue size, a network outage can cause unbounded memory growth. Always configure a cap.

Can I use the Metrics API directly instead of a JS client?

Yes. If you are working outside JavaScript or want full control over batching and sending, you can send metrics over HTTP with bearer token authentication. The JS client handles buffering and flushing for you, but the raw API is available for any language or custom integration.

How do I test that flush actually sent my data?

Use a combination of approaches: check the SDK’s flush promise resolution, inspect network requests in browser dev tools or a local proxy, look at your metrics backend for received data, and test edge cases like process exit and network failure. Some SDKs offer a debug or verbose logging mode that logs send attempts.