---
title: Null Bytes, Dead Streams, Last Chunk
url: https://varstatt.com/jurij/p/null-bytes-dead-streams-last-chunk
author: Jurij Tokarski
date: 2026-04-24 08:37
description: SSE adds overhead for mixed event types, silent streams hang without error, and the last audio chunk vanishes on page close — three workarounds for LLM streaming.
section: Blog (https://varstatt.com/jurij/archive)
tags: ai (https://varstatt.com/jurij/c/ai), software-design (https://varstatt.com/jurij/c/software-design)
---

Streaming LLM output to a browser means wiring together SSE, TCP, fetch, and browser lifecycle APIs that weren't designed for this combination. Each one has constraints that only surface when you integrate them.

## The Parser That Choked

Server-Sent Events is the natural choice for streaming. SSE supports multiple event types via the `event:` field and handles multiline JSON by splitting across `data:` lines. But when every chunk needs an `event:` line, one or more `data:` lines, and a blank line delimiter — and you're sending hundreds of small text fragments interleaved with structured tool call events — the framing adds up and the parser becomes more complex than the problem requires.

A null byte as the delimiter is simpler. `\0` is rare enough in practice — it can appear as `\u0000` in JSON but almost never does in LLM output or natural language — that it works as a reliable record separator without escaping.

```javascript
// Server: wrap each event as one null-terminated JSON record
function sendEvent(stream, event) {
  stream.write(JSON.stringify(event) + '\0');
}

// Client: split and route. `decoder` is the response body as a stream
// of decoded text (e.g. piped through a TextDecoder); a network chunk
// can end mid-record, so the tail stays buffered until its terminator
// arrives.
let buffer = '';
decoder.on('data', (chunk) => {
  buffer += chunk;
  const parts = buffer.split('\0');
  buffer = parts.pop(); // keep the incomplete trailing segment
  for (const part of parts) {
    if (part) handleEvent(JSON.parse(part));
  }
});
```

Each event is a JSON object with a `type` field — `text_chunk`, `tool_call`, `tool_result`, `done`. The client splits on null bytes, parses each segment, routes by type. Text chunks accumulate in the UI. Tool events trigger loading states or commit structured data.
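That dispatch can be sketched as a small table keyed by `type`. The `createRouter` name and handler shape here are illustrative, not from any real client; the split-and-buffer logic is the same as in the snippet above.

```javascript
// Sketch: buffer split plus type-based dispatch in one closure.
// `createRouter` and the handler map are invented names.
function createRouter(handlers) {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    const parts = buffer.split('\0');
    buffer = parts.pop(); // incomplete trailing record stays buffered
    for (const part of parts) {
      if (!part) continue;
      const event = JSON.parse(part);
      handlers[event.type]?.(event); // unknown types are ignored
    }
  };
}

// Usage: one record split across two network reads still parses whole.
const text = [];
const feed = createRouter({
  text_chunk: (e) => text.push(e.content),
  done: () => text.push('\n'),
});
feed('{"type":"text_chunk","content":"Hel');
feed('lo"}\0{"type":"done"}\0');
// text.join('') → "Hello\n"
```

The closure keeps the partial-record buffer private, so a page with several concurrent streams can give each one its own router.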

## The Stream That Stopped Talking

TCP keepalive confirms the connection is still alive. It says nothing about whether data is still moving at the application level. Occasionally — maybe once every few hundred sessions — a stream stops mid-sentence. No error event. No close event. The connection is alive, the response is still "streaming," and the user is staring at a half-finished message with a spinner that will never resolve.

The LLM API hasn't errored — it just stopped sending chunks.

An idle timer catches this. Reset it on every incoming chunk. Fire it if silence crosses a threshold.

```javascript
// The controller's signal is passed to the fetch; aborting it tears
// down the silent request and surfaces an AbortError to the caller.
const controller = new AbortController();
let idleTimer;

function resetIdleTimer() {
  clearTimeout(idleTimer);
  idleTimer = setTimeout(() => {
    controller.abort(); // silence crossed the threshold
  }, 30_000);
}

stream.on('data', (chunk) => {
  resetIdleTimer();
  processChunk(chunk);
});

stream.on('end', () => {
  clearTimeout(idleTimer);
});
```

Thirty seconds is generous for interactive chat — users notice after five. The threshold isn't the important part. The pattern is: connection-level timeouts don't catch application-level silence. You need to track it yourself.
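If the client consumes the body with a reader loop rather than event handlers, the same idea can be phrased as a race between the next read and an idle deadline. This is a sketch assuming a Web Streams `ReadableStream` body; `readWithIdleTimeout` is an invented name, not an API from the post.

```javascript
// Race each read() against a fresh idle deadline. If silence wins,
// cancel the reader and report a timeout instead of hanging forever.
async function readWithIdleTimeout(body, idleMs) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let text = '';
  while (true) {
    let timer;
    const deadline = new Promise((resolve) => {
      timer = setTimeout(() => resolve({ timedOut: true }), idleMs);
    });
    const result = await Promise.race([reader.read(), deadline]);
    clearTimeout(timer);
    if (result.timedOut) {
      reader.cancel(); // give up on the silent stream
      return { text, timedOut: true };
    }
    if (result.done) return { text, timedOut: false };
    text += decoder.decode(result.value, { stream: true });
  }
}
```

Resetting the deadline per read is the same invariant as the timer version: any chunk, of any size, counts as progress.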

## The Chunk That Vanished

Browsers kill in-flight `fetch()` calls during page unload. If you stream audio in chunks via POST, the final chunk — whatever is still buffered when the user stops recording or closes the tab — lives in memory until the next flush. That flush never happens. The final segment of every session is silently dropped.

```javascript
// Killed on page close:
await fetch('/v3/audio/stream_chunk', { method: 'POST', body: chunk });

// Survives:
fetch('/v3/audio/stream_chunk', {
  method: 'POST',
  body: chunk,
  headers: { Authorization: `Bearer ${token}` },
  keepalive: true,
});
```

No `await`. No `.then()`. You can't await a response during unload — any result is swallowed. Fire and forget. The browser queues the request and completes it even after the page is gone, as long as the total payload across pending keepalive requests stays under ~64KB.

`navigator.sendBeacon()` survives unload too, but it doesn't support custom headers. If your backend expects an auth header, `fetch({ keepalive: true })` gives you the full request API.
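One wiring detail worth a sketch: trigger the flush from `pagehide` rather than `unload`, since `pagehide` also fires when the page enters the back/forward cache and `unload` is unreliable on mobile. The buffer name, endpoint, and injectable `send` parameter below are assumptions for illustration, not the post's actual code.

```javascript
// Drain whatever audio is still buffered. `send` is injectable so the
// drain logic can run outside a browser; the default is the keepalive
// fetch. All names here are illustrative.
function flushPending(pendingChunks, send = sendKeepalive) {
  while (pendingChunks.length > 0) {
    send(pendingChunks.shift());
  }
}

function sendKeepalive(chunk) {
  // Fire and forget: no await, no .then(). The browser finishes
  // keepalive requests after the page is gone.
  fetch('/v3/audio/stream_chunk', {
    method: 'POST',
    body: chunk,
    keepalive: true,
  });
}

// pagehide fires on tab close, navigation, and bfcache entry:
// window.addEventListener('pagehide', () => flushPending(pendingChunks));
```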

## The Gaps Between Protocols

Every integration has these. You wire together two or three tools that work fine on their own, but nobody tested them together — and no documentation covers the seams. The workarounds aren't published as best practices; they accumulate as know-how, one project at a time. These three came from LLM streaming.
