Format spec v1.0 · 10 languages · GPLv3

Get structured data out of
LLM text — reliably.

Teach the model to wrap each payload between unmistakable markers, then extract it with one pass that never re-parses content as code. It survives the quotes, newlines, backticks, and braces that shatter a naive JSON.parse.

$ npm install sentinel-blocks
$ pip install sentinel-blocks
completion.txt
<<<JSON>>>
{ "summary": "one object, verbatim", "ok": true }
<<<END>>>

<<<FILE src/app.ts>>>
export const greet = (n) => `hi ${n}`;
// code lives OUTSIDE the JSON ↑
<<<END>>>
The problem

One stray brace and the parse dies

You ask a model for JSON. It mostly complies — until a value contains a quote, a newline, or a code snippet. Then your pipeline falls back to a mock, retries, or crashes. The root cause is almost always code stuffed into a JSON string.

the 2 a.m. bug
{ "code": "const re = /\{.*\}/;" }
                  ▲  JSON.parse →
  SyntaxError: Expected ',' or '}'
  in JSON at position 11693
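The failure is easy to reproduce. Here is a minimal sketch in Python using only the standard `json` module; the helper name `try_parse` is illustrative, and Python's error text differs from the JavaScript message shown above.

```python
import json

# The payload from the example: a regex with backslashes stuffed into a JSON string.
raw = r'{ "code": "const re = /\{.*\}/;" }'


def try_parse(text: str):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


value, error = try_parse(raw)
# \{ is not a legal JSON escape, so the parse fails and `value` stays None.
print(error)
```

The model would have had to double every backslash for this to parse, which is exactly the kind of escaping it gets wrong under load.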
The idea

Mark the boundary. Take the contents verbatim.

Stop fighting the escaping. Wrap each payload between sentinels and read the bytes between them untouched. Quotes, braces, and newlines inside a block are harmless because the block is never interpreted as code. Name a block to carry its own argument — a path, an id, a language.

Tag-only

<<<JSON>>><<<END>>>

A single payload, extracted whole and trimmed.

Named

<<<FILE src/app.ts>>><<<END>>>

Carries an argument; emit one per file, verbatim.
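The two block forms can be read with a single regular-expression scan. What follows is a rough sketch, not the library's implementation: the names `Block` and `extract_blocks` are hypothetical, and it assumes markers sit on their own lines, tags are uppercase, line endings are `\n`, and only the final newline before `<<<END>>>` is dropped.

```python
import re
from typing import Iterator, NamedTuple, Optional


class Block(NamedTuple):
    tag: str
    arg: str
    content: str


# Opening marker on its own line: <<<TAG>>> or <<<TAG argument>>>.
# Content is matched lazily up to the first line that is exactly <<<END>>>.
_BLOCK = re.compile(
    r"^<<<(?!END>>>)(?P<tag>[A-Z][A-Z0-9_]*)(?:[ \t]+(?P<arg>[^\n]*?))?>>>[ \t]*\n"
    r"(?P<content>.*?)"
    r"^<<<END>>>[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_blocks(text: str, tag: Optional[str] = None) -> Iterator[Block]:
    """Yield blocks in order; the content is sliced out, never interpreted."""
    for m in _BLOCK.finditer(text):
        if tag is not None and m.group("tag") != tag:
            continue
        content = m.group("content")
        if content.endswith("\n"):
            content = content[:-1]  # drop only the newline before <<<END>>>
        yield Block(m.group("tag"), (m.group("arg") or "").strip(), content)


sample = (
    "<<<JSON>>>\n"
    '{ "summary": "one object, verbatim", "ok": true }\n'
    "<<<END>>>\n"
    "\n"
    "<<<FILE src/app.ts>>>\n"
    "export const greet = (n) => `hi ${n}`;\n"
    "<<<END>>>\n"
)
blocks = list(extract_blocks(sample))
for block in blocks:
    print(block.tag, repr(block.arg), repr(block.content))
```

The backticks and `${n}` in the FILE block come through unchanged because nothing between the markers is parsed.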

Why it works

Boring on purpose

01 Verbatim out

Extraction returns the bytes between the markers untouched — no input can trigger parser edge cases on the way out.

02 Code leaves the JSON

The most common JSON.parse failure — unescaped code in a string — is designed out, not patched up.

03 Rare markers

<<< / >>> almost never occur in prose or code, so they survive markdown, diffs, and token streaming.

04 Honest fallback

The JSON entry point tries the block first, then a light repair, then a balanced-brace slice. If none of them parses, it fails loudly. It never fabricates.
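One way such a fallback chain could look is sketched below. Everything here is an assumption made for illustration: the function name `parse_json_reply`, the choice of "light repair" (only stripping whitespace and trailing commas), and the decision to run the brace slice over the whole reply are not taken from the library.

```python
import json
import re
from typing import Any, Optional

_JSON_BLOCK = re.compile(
    r"^<<<JSON>>>[ \t]*\n(.*?)^<<<END>>>[ \t]*$", re.DOTALL | re.MULTILINE
)


def _light_repair(text: str) -> str:
    """Assumed repair: trim, then drop trailing commas before } or ].
    Deliberately naive; it does not check whether a comma sits inside a string."""
    return re.sub(r",\s*([}\]])", r"\1", text.strip())


def _balanced_slice(text: str) -> Optional[str]:
    """Return the first {...} span whose braces balance, skipping string literals."""
    start = text.find("{")
    if start < 0:
        return None
    depth, in_string, escaped = 0, False, False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None


def parse_json_reply(reply: str) -> Any:
    """Try the block, a light repair, then a brace slice; raise if all fail."""
    candidates = []
    m = _JSON_BLOCK.search(reply)
    if m:
        candidates.append(m.group(1).strip())       # 1. the block as written
    candidates.append(_light_repair(m.group(1) if m else reply))  # 2. light repair
    sliced = _balanced_slice(reply)
    if sliced is not None:
        candidates.append(sliced)                   # 3. balanced-brace slice
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    # Fail loudly: no default object, no guessed values.
    raise ValueError("no parseable JSON found in reply")
```

Raising instead of returning a placeholder keeps a bad completion visible to the caller, which can then retry or surface the error.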

The strategy

Three rules for the prompt

The format only pays off if the model emits it. Tell it:

Reply only in sentinel blocks. Nothing before the first block or after the last.
JSON carries data, never code. No source, no markdown fences inside the JSON block.
Code gets its own block. Files and long text go in named <<<FILE>>> blocks, verbatim.
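The three rules can be packaged as a reusable instruction appended to every task. The wording below is an illustrative assumption, not text shipped with the package, and `SENTINEL_RULES` and `build_prompt` are hypothetical names.

```python
# Assumed wording for the three rules; adjust to taste for your model.
SENTINEL_RULES = """\
Reply only in sentinel blocks. Write nothing before the first block or after the last.

Put structured data in exactly one block. It carries data only: no source code, no markdown fences.
<<<JSON>>>
{ "key": "value" }
<<<END>>>

Put every file or long text in its own named block, verbatim and unescaped.
<<<FILE path/to/file>>>
file contents
<<<END>>>
"""


def build_prompt(task: str) -> str:
    """Append the format rules to a task description."""
    return f"{task.strip()}\n\n{SENTINEL_RULES}"


prompt = build_prompt("Generate a greeting module and summarize it.")
print(prompt)
```

Showing the markers literally in the prompt gives the model a concrete pattern to copy, which tends to matter more than the surrounding prose.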
One format, ten tongues

Drop-in ports, identical behavior

Every implementation passes the same 8-invariant conformance suite. npm and PyPI for the big two; a single dependency-free file for the rest.

TypeScript: npm
JavaScript: .mjs · .cjs
Python: PyPI
Go: RE2-safe scan
Rust: zero-dep
C: C99
C++: header-only
Java: 17
Ruby: stdlib
PHP: 8.x
Quickstart

Two lines to robust parsing

TypeScript
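import { writeFileSync } from "node:fs";
import { jsonFromResponse, extractTaggedBlocks } from "sentinel-blocks";

// `reply` is the raw model completion (a string).
const meta = jsonFromResponse(reply);
// never breaks on code-in-string

// One FILE block per file: `arg` is the path, `content` the verbatim body.
for (const { arg, content } of extractTaggedBlocks(reply, "FILE")) {
  writeFileSync(arg, content);
}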
Python
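from sentinel_blocks import extract_tagged_blocks, json_from_response

# `reply` is the raw model completion (a string).
meta = json_from_response(reply)
# dict; raises only if nothing parses

# One FILE block per file: `arg` is the path, `content` the verbatim body.
for arg, content in extract_tagged_blocks(reply, "FILE"):
    with open(arg, "w") as f:  # close the handle deterministically
        f.write(content)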