Mistral Ships OCR 4 — Self-Hostable Document AI in 170 Languages

Mistral AI shipped OCR 4, a self-hostable document-AI model in 170 languages at $4 per 1,000 pages, topping benchmarks — a private alternative to cloud document APIs.

TL;DR — France’s Mistral AI shipped OCR 4, a self-hostable, structure-aware document-AI model supporting 170 languages at $4 per 1,000 pages, topping document-AI benchmarks and pitched as a private, on-premises alternative to cloud-only document APIs.

Europe’s leading AI lab has a new pitch for enterprises wary of the cloud. On June 23, 2026, Mistral AI shipped OCR 4, a self-hostable document model.

The release

OCR 4 is a structure-aware document-AI model that supports 170 languages and costs $4 per 1,000 pages, or $2 per 1,000 pages with the Batch API discount. It posted top scores on two document benchmarks, OlmOCRBench (85.20) and OmniDocBench (93.07), along with a 72% average win rate over competitors in human-preference comparisons. The model runs in a single container, which allows fully self-hosted deployment.

| Metric      | OCR 4                             |
|-------------|-----------------------------------|
| Languages   | 170                               |
| Price       | $4 / 1,000 pages ($2 batch)       |
| OlmOCRBench | 85.20                             |
| Deployment  | Self-hosted (single container)    |
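
To put the per-page pricing in concrete terms, here is a minimal sketch of the arithmetic. It assumes the published rates scale linearly with page count, with no minimum charge, volume tiers, or taxes; the article does not confirm any of that. The function name and the 250,000-page workload are hypothetical, chosen only for illustration.

```python
# Rates quoted in the article, in USD per 1,000 pages.
STANDARD_USD_PER_1K_PAGES = 4.0
BATCH_USD_PER_1K_PAGES = 2.0  # with the Batch API discount


def estimate_cost(pages: int, batch: bool = False) -> float:
    """Estimate the USD cost of processing `pages` pages.

    Assumption: billing is strictly pro rata (linear in page count).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    rate = BATCH_USD_PER_1K_PAGES if batch else STANDARD_USD_PER_1K_PAGES
    return pages * rate / 1000


if __name__ == "__main__":
    monthly_pages = 250_000  # hypothetical workload, not from the article
    print(f"standard: ${estimate_cost(monthly_pages):,.2f}")
    print(f"batch:    ${estimate_cost(monthly_pages, batch=True):,.2f}")
```

Under these assumptions, a 250,000-page month would come to $1,000 at the standard rate and $500 through the batch rate. Self-hosted deployments would carry their own infrastructure costs, which this sketch does not model.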

What they said

"The availability of Mistral Document AI with OCR 4 in Microsoft Foundry marks an important milestone in our partnership." — Kimmi Grewal, VP of AI Ecosystem Partnerships, Microsoft

Why it matters

  • Privacy as a feature. Self-hosting means sensitive documents never leave the enterprise.
  • Cost and speed. One customer cited roughly 8× lower cost and 17× lower latency than agentic parsers.
  • A European challenger. Mistral targets Google and AWS document APIs head-on.

FAQ

What is Mistral OCR 4?

A structure-aware document-AI model from Mistral AI, released June 23, 2026, that extracts text and structure from documents in 170 languages at $4 per 1,000 pages. It can run fully self-hosted in a single container, and topped document-AI benchmarks like OlmOCRBench (85.20).

Why does self-hosting matter for document AI?

It lets enterprises process sensitive documents on their own infrastructure, so data never leaves their control. That is a key differentiator versus cloud-only document APIs from Google and AWS, alongside the lower cost and latency one customer has cited.

Sources

Image: Mistral AI logo by Mistral AI — Public domain, via Wikimedia Commons.

