Batching Payload Format and Size Limits for Ingestion: 2026


What “Batching Payload Format and Size Limits for Ingestion” Means

The phrase breaks into three parts:

Payload format is the shape and encoding of the request body. It could be a JSON array, an object wrapping an events array, newline-delimited JSON, OTLP Protobuf, or line protocol. The receiver must know how to parse it, and the sender must construct it correctly.

Size limits are the maximums the system enforces on individual events, full request bodies, compressed payloads, decompressed payloads, or downstream processor inputs. These limits exist at multiple layers, not just one.

Ingestion is the API or pipeline that accepts telemetry data (metrics, logs, traces, events) for processing and storage.

Put plainly: batching payload format and size limits for ingestion are the contract between your application and a data collection endpoint. They answer “how do I package multiple items in one request?” and “how big can that request be?”

Why Batching Exists

One HTTP request carrying 100 metric points costs less network, TLS handshake, CPU, and server work than 100 individual requests. Inngest describes batching as useful for reducing external API calls, database transactions, and serverless invocation costs. InfluxDB gives similar guidance for its write path: batch data to minimize network overhead, targeting 10,000 lines or 10 MB, whichever comes first.

But batching introduces failure modes that single-event requests don’t have:

  • The entire request can exceed a gateway or API limit.
  • A single oversized event can poison the batch.
  • Large batches increase memory pressure during serialization.
  • Count-based batching breaks when event sizes vary wildly.
  • In serverless environments, a function may exit before the client flushes its buffer.
  • A successful HTTP response may only mean “queued,” not “indexed and visible.”

Batching is worth doing. But it requires awareness of format rules and size constraints at every layer.

Common Batching Payload Formats

Different ingestion APIs expect different payload shapes. Getting the format wrong produces parse errors or silently dropped data, regardless of size.

JSON Array

The simplest pattern. A batch is a JSON array of event objects:

[
  {"name": "checkout.completed", "value": 1, "timestamp": "2026-05-03T12:00:00Z"},
  {"name": "checkout.failed", "value": 1, "timestamp": "2026-05-03T12:00:01Z"}
]

Honeycomb’s batch Events API uses this approach, where each element contains a data key with the event payload. Honeycomb also notes that an empty 202 response means the event has been queued for processing, not necessarily indexed.

JSON arrays are easy to debug and natural for browser or serverless clients. The downside: repeated field names inflate payload size, and one malformed element can affect the entire request unless the API supports per-item error reporting.
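As a sketch, building and posting a JSON-array batch can look like the following. The endpoint URL and header are placeholders, not any specific vendor's API:

```javascript
// Build a JSON-array batch body: each element is one event object,
// and the whole body is a single JSON document.
function buildJsonArrayBatch(events) {
  return JSON.stringify(events);
}

const body = buildJsonArrayBatch([
  { name: "checkout.completed", value: 1, timestamp: "2026-05-03T12:00:00Z" },
  { name: "checkout.failed", value: 1, timestamp: "2026-05-03T12:00:01Z" },
]);

// Sending is a plain POST (hypothetical endpoint, uncomment against a real one):
// await fetch("https://ingest.example.com/v1/events/batch", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body,
// });
```

The useful property to verify in tests is that the body parses back to an array of the same length you buffered, which is also the first thing to check when debugging a 400 response.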

Object Wrapper with Events Array

A batch wrapped under a named key:

{
  "events": [
    {"type": "metric", "name": "api.requests", "value": 1},
    {"type": "metric", "name": "api.latency_ms", "value": 42}
  ]
}

OpenSearch Ingestion’s Lambda processor groups records under a configurable key_name (often events), and the Lambda handler reads from that key. AWS documents that the Lambda response must be in JSON array format.

This format allows batch-level metadata (a batch ID, schema version, or client info) alongside the records. The tradeoff: sender and receiver must agree on the wrapper key name, and an incorrect key produces empty batches.
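A minimal builder for this shape might look like the following. The wrapper key (`events`) and the metadata fields (`batch_id`, `schema_version`) are illustrative; the receiver must be configured to read the same key name:

```javascript
// Wrap events under a named key, with optional batch-level metadata.
function buildWrappedBatch(events, { batchId, schemaVersion = 1 } = {}) {
  return JSON.stringify({
    batch_id: batchId,         // hypothetical metadata field
    schema_version: schemaVersion,
    events,                    // the wrapper key the receiver reads from
  });
}

const body = buildWrappedBatch(
  [
    { type: "metric", name: "api.requests", value: 1 },
    { type: "metric", name: "api.latency_ms", value: 42 },
  ],
  { batchId: "b-42" }
);
```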

When working with the Distlang Metrics API, understanding this wrapper pattern helps you structure requests correctly for any ingestion endpoint.

Newline-Delimited or Stacked JSON Objects

Some systems skip arrays entirely. Splunk HEC’s batch protocol stacks event objects one after another rather than wrapping them in a JSON array:

{"event":"event 1","time":1447828325}
{"event":"event 2","time":1447828326}

This is stream-friendly and avoids holding a complete array in memory. But it confuses developers who expect standard JSON. Practitioners on Reddit’s Splunk community report that HEC batching produces invalid-format errors when constructed like a normal JSON array. The batch is not valid JSON as a single document, and that trips people up.
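A sketch of building a stacked-object body makes the gotcha concrete: the lines parse individually, but the body as a whole is not a valid JSON document. (Newline separators are used here for readability; some receivers also accept objects concatenated with no separator at all.)

```javascript
// Stack JSON objects one per line instead of wrapping them in an array.
// The result is deliberately NOT parseable as a single JSON document.
function buildStackedBatch(events) {
  return events.map((e) => JSON.stringify(e)).join("\n");
}

const body = buildStackedBatch([
  { event: "event 1", time: 1447828325 },
  { event: "event 2", time: 1447828326 },
]);
// Each line of `body` is valid JSON on its own,
// but JSON.parse(body) would throw.
```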

OTLP Protobuf and OTLP JSON

OpenTelemetry Protocol (OTLP) defines telemetry exchange over gRPC and HTTP using Protocol Buffers schemas. For OTLP/HTTP, telemetry is sent via POST to paths like /v1/traces, /v1/metrics, and /v1/logs, with the body being either binary Protobuf or JSON-encoded Protobuf.

OTLP is vendor-neutral and efficient in binary form. The challenge is that batch sizing by record count can still exceed byte or message limits. New Relic recommends OTLP/HTTP binary Protobuf and states that payloads must be smaller than 1 MB. Datadog’s OTLP metrics endpoint returns 413 when payloads exceed 500 KB uncompressed or 5 MB after decompression.

Line Protocol

InfluxDB’s line protocol is optimized for time-series writes:

api_requests,route=/checkout,status=200 count=1 1714560000000000000
api_latency_ms,route=/checkout p50=42,p95=120 1714560000000000000

Compact, appendable, and easy to compress. InfluxDB recommends 10,000 lines or 10 MB per batch. The cost is format-specific escaping rules and less self-describing payloads compared to JSON.

The Three Batching Thresholds: Count, Bytes, and Time

The safest batching model is “flush when the first threshold is hit.” Do not wait for all three. If you reach your record count, send. If you hit your byte limit first, send. If the time window expires first, send.

This “first threshold wins” pattern appears across many systems:

  • AWS Lambda event source mappings invoke when the batching window expires, the record count is met, or the payload reaches 6 MB.
  • OpenSearch Data Prepper’s Lambda processor exposes event_count: 100, maximum_size: 5mb, and event_collect_timeout: 10s as defaults.
  • Inngest batches until maxSize or timeout, with a 10 MiB safety limit.
  • Microsoft Kusto’s ingestion batching policy uses count (default 500 items), raw data size (default 1024 MB), and time (default 5 minutes).

An example configuration:

max_records: 100
max_encoded_bytes: 400000
max_wait_ms: 2000

The exact numbers depend on your target API. The principle is universal.
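A minimal "first threshold wins" batcher, using the example thresholds above, can be sketched as follows. The `send` callback and the per-event size estimate (which ignores the few bytes of array brackets and commas) are illustrative:

```javascript
// Flush on record count, encoded byte size, or elapsed time --
// whichever threshold is reached first.
class Batcher {
  constructor(send, { maxRecords = 100, maxBytes = 400_000, maxWaitMs = 2000 } = {}) {
    this.send = send;
    this.maxRecords = maxRecords;
    this.maxBytes = maxBytes;
    this.maxWaitMs = maxWaitMs;
    this.buffer = [];
    this.bytes = 0;
    this.timer = null;
  }

  add(event) {
    const size = new TextEncoder().encode(JSON.stringify(event)).length;
    // If adding this event would overflow the byte budget, flush first.
    if (this.buffer.length && this.bytes + size > this.maxBytes) this.flush();
    this.buffer.push(event);
    this.bytes += size;
    if (this.buffer.length >= this.maxRecords || this.bytes >= this.maxBytes) {
      this.flush(); // count or byte threshold hit
    } else if (!this.timer) {
      this.timer = setTimeout(() => this.flush(), this.maxWaitMs); // time threshold
    }
  }

  flush() {
    if (this.timer) { clearTimeout(this.timer); this.timer = null; }
    if (!this.buffer.length) return;
    const batch = this.buffer;
    this.buffer = [];
    this.bytes = 0;
    this.send(batch);
  }
}
```

A real implementation would also handle send failures (see the retry discussion below); this sketch only shows the threshold logic.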

Size Limits Are Layered, Not Singular

One of the most common mistakes when dealing with batching payload format and size limits for ingestion is assuming “the size limit” is one number. In practice, a request can fail at multiple layers.

Individual Event Size

A single record can be too large even if the batch itself is within limits. AWS OpenSearch Ingestion’s Lambda processor enforces a 5 MB payload size limit for a single event. Practitioners on Reddit’s Elasticsearch community describe 413 errors caused by a single oversized log line, not total volume. Large stack traces, request bodies, or AI prompt/response payloads can blow up a single record.

Batch Request Body Size

This is the most visible limit, the one that produces 413 errors. Datadog enforces a 500 KB maximum payload size on certain metrics endpoints. New Relic OTLP requires payloads under 1 MB. These limits are often smaller than developers expect.

Compressed vs. Decompressed Size

This one catches people. Compression helps bandwidth, but many systems enforce limits on the decompressed size. Datadog documents both: a 500 KB limit for uncompressed payloads and a 5 MB post-decompression limit for compressed ones. Kusto evaluates batching policy data size based on uncompressed data. A 300 KB gzipped body that decompresses to 6 MB will still be rejected.

Gateway, Proxy, and WAF Limits

Your ingestion API limit is irrelevant if a gateway rejects the request first. AWS API Gateway’s HTTP API has a 10 MB payload limit that cannot be increased. Multiple Reddit threads show developers hitting this wall and being told to use a different architecture entirely, such as direct object storage uploads.

In one particularly instructive case from a Langfuse GitHub discussion, a user’s ingestion API returned success codes, but data was missing from dashboards. The root cause turned out to be an AWS Managed WAF size restriction rule on the load balancer in front of ClickHouse, which blocked certain requests and returned 403 errors to the downstream worker.

Processor and Function Payload Limits

If ingestion involves a Lambda function, enricher, or processor in the pipeline, that component has its own limits. AWS Lambda event source mappings cap batch payloads at 6 MB. OpenSearch Data Prepper defaults to 5 MB maximum batch size.

Backend Queue and Write Limits

Prometheus remote write illustrates this layer. If a shard queue fills, Prometheus blocks reading from the WAL. If the remote endpoint stays down for more than two hours, unsent data can be lost after WAL compaction. Batch size tuning affects queue memory, retry behavior, and data loss under backpressure.

What Happens When a Batch Is Too Large

Understanding the failure modes is just as important as knowing the limits:

413 Payload Too Large. The most direct rejection. Elastic’s OTLP troubleshooting docs note that 413 errors happen more often during traffic spikes or when individual telemetry items are large. The fix is to lower batching limits so each request stays smaller.

400 Malformed Request. Wrong payload format, invalid encoding, or a mix of valid and invalid records can trigger this.

Partial success. Some APIs return 207 Multi-Status with per-item error details. Others reject the whole batch. Know which behavior your target API uses.

Accepted but later dropped. A 200 or 202 may mean “queued,” not “indexed.” Honeycomb documents that an empty 202 means queued for processing. Langfuse maintainers explain that downstream processing failures are not surfaced in the ingestion API response.

Queue growth and memory pressure. Prometheus warns that remote write memory is proportional to number of shards * (capacity + max_samples_per_send), and its defaults are designed to constrain shard memory.

Serverless timeout. The function exits before the client flushes its buffer. The data never leaves the process. This is covered in detail below.

How to Choose a Safe Batch Size

Start with documented limits, then leave headroom

Do not target 100% of the published limit. JSON escaping adds bytes. Compression ratios change. Labels and tags grow over time. Proxies may enforce lower limits. Aim for 50 to 80% of the documented maximum.

If the ingestion API accepts 500 KB, set client flush around 350 to 400 KB.

Measure encoded bytes, not object count

An OpenTelemetry Collector GitHub issue requested a send_batch_max_size_bytes option because the batch processor counts spans, not bytes. A few large spans can exceed gRPC message limits even when span count is well under the batch size setting. Batch count is not batch size. This is a critical distinction.

In JavaScript, you can estimate size like this:

const encodedBytes = new TextEncoder().encode(JSON.stringify(payload)).length;

This works for UTF-8 JSON. OTLP Protobuf and gzip require different measurement.
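One refinement worth noting: summing per-event sizes underestimates slightly, because the serialized array adds brackets and commas. Measuring the actual request body avoids that. A small helper, assuming a UTF-8 JSON body:

```javascript
// Measure the batch body exactly as it will be sent:
// serialize the full array, then count UTF-8 bytes.
// String length is NOT byte length once non-ASCII characters appear.
function batchBodyBytes(events) {
  return new TextEncoder().encode(JSON.stringify(events)).length;
}
```

For example, `[{"a":1}]` is 9 bytes, while `[{"a":"é"}]` is 12 bytes even though it is only 11 characters, because `é` encodes to two UTF-8 bytes.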

Handle oversized single events deliberately

Do not let one giant record poison the whole batch. Options include dropping with a logged metric, truncating safe fields, or sending the oversized item to object storage with a reference. The Elastic troubleshooting docs reinforce this: tune for peak and p99 event size, not just averages.

Make retries and idempotency explicit

Retry transient errors (429, 500, 502, 503, 504). Do not retry permanent 400 schema errors without modification. Use idempotency keys if duplicates matter. AWS Lambda docs warn that event source mappings process events at least once and recommend idempotent function code.

Another OpenTelemetry Collector issue reports that when a downstream exporter rejects data, the batch processor can drop data if no exporter queue is configured. Batching must be paired with retry, queue, or dead-letter behavior. Otherwise, larger batches amplify the impact of a single downstream failure.

LinkedIn practitioner posts echo this: high-volume OpenTelemetry pipelines need filtering, compression, batch tuning, and sometimes exporter-level batching. Dash0 warns that in-memory batch processors can lose in-flight traces during crashes or node rotations. The modern position is clear: batch, but make batching observable and durable.

Serverless Metrics Ingestion Considerations

Serverless functions are short-lived. They may handle a single request and then the runtime freezes or terminates. Any metrics client buffering data in memory needs to flush before that happens.

A Langfuse maintainer directly recommended flushing before the Lambda function exits, noting there was no need to flush after each API call, just before the handler returns.

For batching in serverless environments, the practical rules are:

  1. Buffer during the request. Accumulate metric points as the handler runs.
  2. Flush at the handler boundary. Use lifecycle hooks like ctx.waitUntil(...) on Cloudflare Workers or after(...) on Vercel/Next.js to send the batch without blocking the response.
  3. Keep batches small. A serverless handler may not have time to retry a large failed batch. Smaller batches are more forgiving.
  4. Prefer a client with explicit flush semantics. A client that lets you control when data is sent is safer than one that relies on background timers.
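Rules 1, 2, and 4 together can be sketched in a Cloudflare-Workers-style handler. The `MetricsClient` here is a hypothetical stand-in for any client with buffer-and-explicit-flush semantics, and the `send` callback is where a real client would POST the batch:

```javascript
// Minimal buffering client with explicit flush semantics (hypothetical API).
class MetricsClient {
  constructor(send) { this.send = send; this.buffer = []; }
  increment(name, value = 1) { this.buffer.push({ name, value }); }
  async flush() {
    if (!this.buffer.length) return;
    const batch = this.buffer;
    this.buffer = [];
    await this.send(batch); // one POST per flush in a real client
  }
}

const sentBatches = [];
const metrics = new MetricsClient(async (batch) => {
  sentBatches.push(batch); // real client: fetch(ingestUrl, { method: "POST", ... })
});

// Worker-style handler: buffer during the request, respond immediately,
// and let ctx.waitUntil keep the isolate alive until the flush settles.
const handler = {
  async fetch(request, env, ctx) {
    metrics.increment("requests");
    const response = new Response("ok");
    ctx.waitUntil(metrics.flush()); // flush without blocking the response
    return response;
  },
};
```

On Vercel/Next.js the same shape applies with `after(() => metrics.flush())` instead of `ctx.waitUntil(...)`.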

Distlang Metrics is built for this pattern. The JavaScript client supports buffering and explicit flush, and Distlang provides environment-specific quickstarts for Cloudflare Workers and Vercel/Next.js with lifecycle-aware examples. There are no agents, sidecars, or infrastructure to manage.

Reference: Size Limits and Batching Defaults Across Systems

System                          | Format                           | Key Limit or Default
--------------------------------|----------------------------------|-------------------------------------
AWS OpenSearch Lambda processor | JSON array, events wrapper       | 5 MB per single event
OpenSearch Data Prepper         | Configurable key_name            | 100 events, 5 MB, 10s timeout
AWS Lambda event source         | Batch of records                 | 6 MB payload, not modifiable
Inngest                         | events array in function         | maxSize + timeout, 10 MiB safety cap
Datadog Metrics API             | JSON timeseries                  | 500 KB max, 5 MB decompressed
New Relic OTLP                  | OTLP HTTP Protobuf               | Under 1 MB
InfluxDB                        | Line protocol                    | 10,000 lines or 10 MB
Microsoft Kusto                 | Queued ingestion                 | 500 items, 1024 MB raw, 5 min
OTel Collector batch processor  | Spans/metric points/logs         | 8192 send_batch_size, 200ms timeout
AWS API Gateway HTTP API        | HTTP body                        | 10 MB hard limit
Splunk HEC                      | Stacked JSON objects (not array) | Configurable max_content_length

Sources linked throughout the article above.

Checklist: Before Shipping an Ingestion Batcher

Before putting batched ingestion into production, confirm each of these:

  • [ ] Required payload format (JSON array, wrapper object, NDJSON, Protobuf, line protocol)
  • [ ] Maximum individual event size
  • [ ] Maximum batch request body size
  • [ ] Whether the limit applies to compressed or decompressed bytes
  • [ ] Gateway, proxy, or WAF payload limit
  • [ ] Maximum records per request
  • [ ] Maximum wait time before flush
  • [ ] Retry behavior for transient errors
  • [ ] Partial failure behavior (whole batch rejected vs. per-item errors)
  • [ ] Idempotency or duplicate handling
  • [ ] Serverless flush lifecycle (waitUntil, after, equivalent)
  • [ ] Internal metrics for queue length, dropped records, and 413 count

If you want to skip the infrastructure complexity and start sending metrics from serverless code in minutes, the Distlang Metrics quickstart walks through the setup from code to dashboard.

FAQ

What is a batching payload?

A batching payload is a request body containing multiple records for ingestion. The records could be metric points, events, log entries, spans, or data rows. The payload format defines how those records are structured (as an array, under a wrapper key, as newline-delimited objects, etc.) and the size limit defines how large the payload can be.

Is batch size the number of records or the number of bytes?

Both matter, and conflating them is a common source of production failures. Many APIs expose a record-count batch setting, but payload rejection usually happens on bytes. An OpenTelemetry Collector issue demonstrates this: count-based batch limits can still produce oversized messages when individual spans are large. Always measure encoded byte size alongside record count.

Does gzip compression let me exceed the payload limit?

Not always. Many systems enforce decompressed size limits. Datadog’s OTLP metrics endpoint documents both a 500 KB uncompressed limit and a 5 MB decompressed limit. Compression saves bandwidth, but the receiver may still reject based on what the data expands to.

Why did the API return 200 or 202 but my data is not visible?

The API likely accepted or queued the data, while downstream processing failed separately. Honeycomb documents that 202 means queued for processing, and Langfuse maintainers explain that downstream failures may not appear in the ingestion response. If data is missing, check worker logs, downstream queue health, WAF/proxy logs, and backend write errors. For a concrete example of the full path from ingestion to dashboard visibility, see this end-to-end walkthrough.

What is a safe default batch size?

There is no universal number. Use the target API’s documented limit, then set your client threshold 20 to 50% below that. Flush on count, bytes, or time, whichever comes first. A reasonable starting point for many JSON-based metrics APIs: 100 records, 400 KB encoded, 2 seconds.

How do batching limits differ in serverless environments?

Serverless functions have short lifetimes and can freeze or terminate between invocations. A background timer may never fire. The key difference is that you must flush explicitly at lifecycle boundaries (using mechanisms like waitUntil or after) and keep batches small enough to send within the remaining execution time. For practical patterns, see guides on instrumenting Cloudflare Workers with counters and histograms.

Should I use JSON or Protobuf for ingestion batching?

JSON is easier to debug and works well for lower-volume or serverless use cases. OTLP Protobuf is more efficient on the wire and better for high-volume pipelines. The format choice should match your ingestion target’s supported formats, your team’s debugging needs, and your volume. If you are sending lightweight counters and histograms from JavaScript, JSON is usually the simpler choice.

What are the most common batching mistakes?

Using record count as a proxy for byte size. Forgetting decompressed size limits. Letting serverless functions exit before flushing. Treating HTTP 200/202 as proof that data is queryable. Increasing batch size without measuring memory impact. Each of these has caused real production incidents documented in community threads and vendor troubleshooting guides.