attachments
Turn anything into LLM-ready context. One function. Zero required dependencies.
Errors never raise — they come back as data that teaches:
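As a minimal sketch of the errors-as-data idea (not this library's actual output), a loader can catch failures and return them as records with a hint attached. `Result`, `load_text`, and the `code` / `message` / `hint` keys are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Result:
    """Hypothetical container: extracted text plus a list of error records."""
    text: str = ""
    errors: list[dict] = field(default_factory=list)


def load_text(path: str) -> Result:
    """Read a UTF-8 file, reporting failures as data instead of raising."""
    try:
        return Result(text=Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return Result(errors=[{
            "code": "not_found",
            "message": f"No file at {path!r}.",
            "hint": "Check the path, or pass a directory or glob instead.",
        }])
    except UnicodeDecodeError:
        return Result(errors=[{
            "code": "not_text",
            "message": f"{path!r} is not valid UTF-8 text.",
            "hint": "This may be a binary format that needs its own parser.",
        }])


result = load_text("does-not-exist.txt")
for err in result.errors:
    print(f"{err['code']}: {err['message']} {err['hint']}")
```

The caller never needs a try/except: an empty `errors` list means success, and each record says what to try next.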
What it eats
Formats — each an optional install, so you only carry what you use:
.pdf + OCR · .docx · .pptx · .xlsx · .xls · .csv/.tsv · .html + CSS select · .ipynb · .md .py .json… 20+ text · .png .jpg .webp… 8 image · .heic · .svg · .mp3 .wav .m4a… transcription
Sources — every source works with every format:
files · directories · globs **/*.py · zip / tar · https:// · github://owner/repo
Mislabeled or extensionless files route by magic bytes. Scanned PDFs are OCR'd
automatically when the extra is installed. Every option autocompletes in
your editor and is discoverable at runtime: att.options(".pdf")
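Routing by magic bytes can be sketched roughly as follows. This is an illustration of the technique, not the library's router: `MAGIC` and `sniff` are hypothetical names, and the table covers only four well-known signatures.

```python
# Well-known file signatures, checked against the first bytes of the input.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"PK\x03\x04", "zip"),  # also the container for .docx / .pptx / .xlsx
]


def sniff(data: bytes) -> str:
    """Guess a format from leading bytes, ignoring the file name entirely."""
    for prefix, kind in MAGIC:
        if data.startswith(prefix):
            return kind
    return "unknown"


print(sniff(b"%PDF-1.7\n..."))  # a PDF saved as "report.txt" still routes as pdf
```

A real router would need a second step for zip-based formats, since .docx, .pptx, and .xlsx share the same leading bytes and differ only in what the archive contains.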
You can leave at any time
Everything on this page is open source (MIT) — the library, the parsers, the spec, and the entire hosted service. Self-host it with one command:
No telemetry, ever. No accounts. The library makes no network calls unless you point it at a URL — or explicitly opt into the service below.
The hosted tier — free
Some models are heavy. OCR wants onnxruntime; transcription wants whisper weights. We keep them warm on a server so you don't have to install them:
- No API key, no signup. Files up to 25 MB, rate-limited, processed in memory and not stored.
- It's the same open-source server you can self-host — the hosted tier is convenience, not capability.
- CPU processing is cheap for us and useful for you; that's the whole business model of the free tier. GPU OCR (LightOnOCR) is coming for the documents that need more.
Built on a one-page contract
The Artifact
Every input becomes {text, images, audio, video, meta}, with typed errors and
page/sheet/slide segments with offsets, validated against a JSON Schema by a
conformance suite in CI.
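To make the shape concrete, here is a rough sketch of such a record. Only the five top-level field names come from the text above. The field types, the `Segment` class, and storing segments under `meta["segments"]` are assumptions for illustration, and the published JSON Schema is the authority on the real layout.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Segment:
    """Hypothetical segment: a page, sheet, or slide located inside the text."""
    kind: str   # "page", "sheet", or "slide"
    index: int  # 1-based position in the document
    start: int  # character offset into Artifact.text (inclusive)
    end: int    # character offset into Artifact.text (exclusive)


@dataclass
class Artifact:
    """The five top-level fields named above; the types here are assumed."""
    text: str = ""
    images: list = field(default_factory=list)
    audio: list = field(default_factory=list)
    video: list = field(default_factory=list)
    meta: dict = field(default_factory=dict)


pages = ["First page.", "Second page."]
segments = []
offset = 0
for i, page in enumerate(pages, start=1):
    segments.append(Segment("page", i, offset, offset + len(page)))
    offset += len(page) + 1  # +1 for the newline that joins pages

artifact = Artifact(
    text="\n".join(pages),
    meta={"segments": [asdict(s) for s in segments]},
)

# Offsets let a caller slice any page back out of the combined text.
second = segments[1]
print(artifact.text[second.start:second.end])
```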
The DSL
"file.pdf[pages: 1-4, ocr: true]" follows a specified grammar with shared test
vectors, so every future language port parses it identically.
Every option has a kwargs twin.
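For intuition only, the simple form shown above can be split into a source and an options dict with a few lines of Python. This sketch is not the specified grammar: `parse` is a hypothetical helper, it handles only `source[key: value, ...]`, and it ignores cases the real grammar would have to settle (quoting, brackets inside globs, nested values).

```python
import re

# source, optionally followed by one [key: value, ...] block at the end
_DSL = re.compile(r"^(?P<source>[^\[\]]+?)(?:\[(?P<opts>[^\[\]]*)\])?$")


def parse(spec: str) -> tuple[str, dict]:
    """Split 'source[key: value, ...]' into (source, options)."""
    match = _DSL.match(spec.strip())
    if match is None:
        raise ValueError(f"cannot parse {spec!r}")
    options = {}
    pairs = (p.strip() for p in (match["opts"] or "").split(","))
    for pair in filter(None, pairs):
        key, _, value = pair.partition(":")
        value = value.strip()
        # Map the literals true/false to booleans; keep everything else as text.
        options[key.strip()] = {"true": True, "false": False}.get(value, value)
    return match["source"], options


source, options = parse("file.pdf[pages: 1-4, ocr: true]")
print(source, options)

# The "kwargs twin" idea: the same options written as keyword arguments.
assert options == dict(pages="1-4", ocr=True)
```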
Adding a format is one pure function, (bytes, options) → Artifact. The
contributor guide is a checklist, and the conformance suite picks your
processor up automatically.
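A processor in that style might look roughly like the sketch below. Everything named here is hypothetical: the `processor` decorator, the `PROCESSORS` registry, the `encoding` option, and the plain dict standing in for an Artifact. How the real project discovers processors is described in its contributor guide, not here.

```python
from typing import Callable

Artifact = dict  # stand-in for the real Artifact type
Processor = Callable[[bytes, dict], Artifact]

# Hypothetical registry mapping a file extension to its processor.
PROCESSORS: dict[str, Processor] = {}


def processor(extension: str) -> Callable[[Processor], Processor]:
    """Register a function as the processor for one extension."""
    def register(fn: Processor) -> Processor:
        PROCESSORS[extension] = fn
        return fn
    return register


@processor(".txt")
def process_txt(data: bytes, options: dict) -> Artifact:
    """Pure function: the same bytes and options always give the same result."""
    encoding = options.get("encoding", "utf-8")
    text = data.decode(encoding, errors="replace")
    return {
        "text": text,
        "images": [],
        "audio": [],
        "video": [],
        "meta": {"encoding": encoding},
    }


artifact = PROCESSORS[".txt"](b"hello", {})
print(artifact["text"])
```

Because the function touches no files, network, or global state, a test suite can call it with fixture bytes and compare the result directly.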