● Specification v1.0 — Stable

The data format for the AI era

Indent Comma Format combines the compactness of CSV, the readability of YAML and the hierarchy of JSON — schema-driven, streamable, and token-efficient by design.

application/icf MIME type · Zero-dependency Java & Python libraries · CC BY 4.0 open specification
invoice.icf
@kind icf
@version 1.0
@specification https://icformat.org/icf/specification/v1

@schema

Invoice:
  [InvoiceNo, InvoiceDate, Amount]

BillItems[]:
  [SNo, Item, Quantity, Rate, Amount]

@data

@record id=INV001

Invoice:
  = INV-2026-001, 2026-05-01, 84500

BillItems:
  - 1, Cement, 100, 420, 42000
  - 2, Steel Rod, 50, 850, 42500

Declare the schema once. Store every record positionally. No repeated keys, no closing tags.
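To make the positional idea concrete, here is a minimal Python sketch that reads the schema and data sections of the sample above and pairs each value with its field name. It is not the icfpy API; `parse_icf_subset` and `split_values` are hypothetical helpers written for this illustration. The sketch assumes values contain no commas, ignores every directive other than `@schema`, `@data` and `@record`, and keeps all values as strings.

```python
SAMPLE = """\
@schema

Invoice:
  [InvoiceNo, InvoiceDate, Amount]

BillItems[]:
  [SNo, Item, Quantity, Rate, Amount]

@data

@record id=INV001

Invoice:
  = INV-2026-001, 2026-05-01, 84500

BillItems:
  - 1, Cement, 100, 420, 42000
  - 2, Steel Rod, 50, 850, 42500
"""


def split_values(text):
    # Assumption: values never contain commas, so a plain split is enough.
    return [value.strip() for value in text.split(",")]


def parse_icf_subset(text):
    """Parse the small subset of ICF shown in the sample (illustrative only)."""
    schema, records = {}, []
    section, current, record = None, None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@schema"):
            section = "schema"
        elif line.startswith("@data"):
            section = "data"
        elif line.startswith("@record"):
            record = {}
            records.append(record)
        elif line.startswith("@"):
            continue  # other directives (@kind, @version, ...) are skipped here
        elif line.endswith(":"):
            current = line[:-1]
            if current.endswith("[]"):  # collection marker in the schema
                current = current[:-2]
        elif section == "schema" and line.startswith("["):
            schema[current] = split_values(line[1:-1])
        elif section == "data" and line.startswith("="):
            # "=" carries the single row of an object
            record[current] = dict(zip(schema[current], split_values(line[1:])))
        elif section == "data" and line.startswith("-"):
            # "-" carries one row of a collection
            row = dict(zip(schema[current], split_values(line[1:])))
            record.setdefault(current, []).append(row)
    return schema, records


schema, records = parse_icf_subset(SAMPLE)
print(records[0]["Invoice"]["InvoiceNo"])
print([item["Item"] for item in records[0]["BillItems"]])
```

The field names appear once, in `schema`; every data row contributes only its values.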

Why ICF

Compact, readable, and built to stream

ICF avoids repeated keys by declaring each schema once and storing the data that follows positionally. The result is smaller files, faster parsing, and far fewer AI tokens.

CSV-level compactness

Field names live in the schema, not every row. Records are bare comma-separated values — the smallest honest representation of structured data.

YAML-level readability

Indentation expresses hierarchy. No brackets to balance, no quotes to escape — a human can read and hand-edit an ICF file with ease.

JSON-level hierarchy

Objects, collections, nested containers and master data — the full shape of a business document, not a flat table.

AI-efficient tokens

Fewer repeated keys means fewer tokens. Ideal for RAG datasets, LLM context windows and structured extraction pipelines.
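A rough way to see the effect is to serialize the same rows as JSON objects and as positional rows, then compare their lengths. The Python sketch below does this with character counts, which are only a proxy: real token counts depend on the tokenizer in use. The helper names `as_json` and `as_positional` are hypothetical, and the positional layout simply mimics the BillItems rows from the sample file.

```python
import json

FIELDS = ["SNo", "Item", "Quantity", "Rate", "Amount"]
ROWS = [
    [1, "Cement", 100, 420, 42000],
    [2, "Steel Rod", 50, 850, 42500],
]


def as_json(fields, rows):
    # Every row repeats every key.
    return json.dumps([dict(zip(fields, row)) for row in rows])


def as_positional(fields, rows):
    # Field names are written once; rows carry values only.
    header = "[" + ", ".join(fields) + "]"
    body = ["- " + ", ".join(str(value) for value in row) for row in rows]
    return "\n".join([header] + body)


for copies in (1, 50):
    rows = ROWS * copies
    json_chars = len(as_json(FIELDS, rows))
    positional_chars = len(as_positional(FIELDS, rows))
    print(len(rows), "rows:", json_chars, "JSON chars vs", positional_chars, "positional chars")
```

The gap widens as rows are added, because the JSON form pays for the key names on every row while the positional form pays for them once.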

Streamable & append-friendly

Records are line-oriented and self-contained. Append new records without rewriting the file; process huge archives without loading them whole.
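The sketch below illustrates both halves of that claim in Python: a generator that yields one record at a time from any line iterator, and an append that only writes at the end of the stream. It does not use icfpy; `iter_records` and `append_record` are hypothetical names, and the record layout is assumed to follow the `@record id=...` form shown in the sample file. With a real file you would open it in append mode (`open(path, "a")`) instead of seeking in a `StringIO`.

```python
import io


def iter_records(lines):
    """Yield each record as a list of its non-blank lines, one record at a time."""
    current = None
    for line in lines:
        if line.startswith("@record"):
            if current is not None:
                yield current
            current = [line.rstrip("\n")]
        elif current is not None and line.strip():
            current.append(line.rstrip("\n"))
    if current is not None:
        yield current


def append_record(stream, record_id, invoice_values):
    """Write one new record at the current end of the stream."""
    stream.write(f"\n@record id={record_id}\n\nInvoice:\n  = {', '.join(invoice_values)}\n")


# An in-memory stand-in for an existing .icf file.
buf = io.StringIO(
    "@data\n\n@record id=INV001\n\nInvoice:\n  = INV-2026-001, 2026-05-01, 84500\n"
)
buf.seek(0, io.SEEK_END)  # existing content is left untouched
append_record(buf, "INV002", ["INV-2026-002", "2026-05-02", "1200"])

buf.seek(0)
for record in iter_records(buf):
    print(record[0])
```

Because `iter_records` holds only the record it is currently assembling, memory use depends on the largest record rather than on the size of the archive.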

Schema-driven validation

The declared schema makes records predictable and verifiable. Field counts, hierarchy and ordering are all checkable up front.
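As an illustration of the simplest such check, the Python sketch below compares the number of values in each row with the number of fields the schema declares. It is a stand-alone example, not the validator shipped with icfj or icfpy; `check_field_counts` and the in-memory shapes of `schema` and `record` are assumptions made for this sketch.

```python
def check_field_counts(schema, record):
    """Return a list of problems; an empty list means every row matches the schema."""
    problems = []
    for container, rows in record.items():
        fields = schema.get(container)
        if fields is None:
            problems.append(f"{container}: not declared in schema")
            continue
        for number, row in enumerate(rows, start=1):
            if len(row) != len(fields):
                problems.append(
                    f"{container} row {number}: expected {len(fields)} values, got {len(row)}"
                )
    return problems


schema = {
    "Invoice": ["InvoiceNo", "InvoiceDate", "Amount"],
    "BillItems": ["SNo", "Item", "Quantity", "Rate", "Amount"],
}

good = {
    "Invoice": [["INV-2026-001", "2026-05-01", "84500"]],
    "BillItems": [["1", "Cement", "100", "420", "42000"]],
}
bad = {
    "Invoice": [["INV-2026-001", "2026-05-01"]],  # Amount is missing
    "Payments": [["1", "Cash"]],                  # container not in the schema
}

print(check_field_counts(schema, good))
print(check_field_counts(schema, bad))
```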

Where it shines

Purpose-built for structured business data

ICF is intentionally schema-constrained. That trade-off buys smaller files, faster parsing and predictable structure.

OCR extraction pipelines

Capture invoices and documents into a compact, hierarchical record stream with verbatim text blocks for raw OCR output.

Invoice & ERP interchange

Move structured documents between systems with master-data references that mirror relational tables.

Document archival

Store millions of records in a git-friendly, human-readable form, indexed by a companion ICX for random access.
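This page does not describe the ICX layout, so the Python sketch below is not ICX. It only illustrates the general idea behind a companion index: record the byte offset of every `@record` line once, then seek straight to a record instead of scanning the archive. `build_offset_index` and `read_record` are hypothetical helpers, and the `id=` attribute is assumed to look as it does in the sample file.

```python
import io


def build_offset_index(stream):
    """Map each record id to the byte offset of its @record line."""
    index = {}
    offset = stream.tell()
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith(b"@record"):
            for token in line.split()[1:]:
                if token.startswith(b"id="):
                    index[token[3:].decode("utf-8")] = offset
        offset = stream.tell()
    return index


def read_record(stream, offset):
    """Read one record starting at offset, stopping at the next @record or end of file."""
    stream.seek(offset)
    lines = [stream.readline()]
    while True:
        line = stream.readline()
        if not line or line.startswith(b"@record"):
            break
        lines.append(line)
    return b"".join(lines).decode("utf-8")


# An in-memory stand-in for a binary-mode file handle.
archive = io.BytesIO(
    b"@data\n\n"
    b"@record id=INV001\n\nInvoice:\n  = INV-2026-001, 2026-05-01, 84500\n\n"
    b"@record id=INV002\n\nInvoice:\n  = INV-2026-002, 2026-05-02, 1200\n"
)

index = build_offset_index(archive)
print(read_record(archive, index["INV002"]))
```

Offsets are measured in bytes, so the archive must be opened in binary mode for the seeks to land on the right line.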

AI / RAG datasets

Feed models dense, low-token structured context instead of verbose JSON or XML.

Document management

Export and import entire DMS hierarchies — folders, files, index data and line items — in one file.

Human-editable records

Config and reference data a person can actually read and edit, with no brackets to balance.

Reference implementations

Zero-dependency libraries, ready to use

Faithful, behaviorally matched implementations for the JVM and Python — parse, validate, build, write and generate ICX.

icfj

org.icformat:icfj · Java 11+

Pure Core Java, no runtime dependencies. The reference implementation and conformance authority.

Java downloads & install →

icfpy

icfpy · Python 3.9+

Pure Python, standard library only. A faithful behavioral port of icfj.

Python downloads & install →

Tools

Validate ICF right in your browser

Paste an ICF document and get instant structural feedback on schema field counts, indentation, text blocks and more. Nothing leaves your machine.

Open the validator