A Field Guide to Email Storage Formats: History, Anatomy and a Comparison
How email is actually stored on disk — MBOX, Maildir, EML, PST, OST, OLM, MSG, NSF and historical formats like Eudora. Their history, how they're built, what each one is good for, and a side-by-side comparison.
David Carrero
Every email program has to answer the same question: where do the messages actually go on disk? Half a century of answers has produced a small zoo of formats — some open and beautifully simple, others proprietary databases you can’t read without the app that made them. This is a tour of the ones you’ll meet, how they’re built, where they came from, and how they stack up.
Broadly, they fall into three families:
- Open, text-based containers — one file holds many messages (MBOX), or one file per message in a folder (Maildir). Human-readable, vendor-neutral.
- One message per file — a single message as a standalone file (EML, MSG).
- Proprietary databases — a binary store holding mail plus calendar, contacts and state (PST, OST, OLM, NSF). Compact inside their app, opaque outside it.
The open, text-based formats
MBOX — the lingua franca
MBOX dates back to the early Unix mail systems of the 1970s. The idea is disarmingly simple: concatenate every message in a mailbox into one plain-text file, and mark where each one starts with a line beginning “From ” (the “From_ line”: the word From followed by a space, not the From: header). Headers, body and attachments, encoded as text, all live inline.
That simplicity hides a famous wrinkle: what happens when a message body itself contains a line starting with “From ”? Different answers gave rise to variants — mboxo, mboxrd, mboxcl and mboxcl2 — which escape (or don’t) that sequence differently. In practice modern tools read them all. MBOX is what Google Takeout, Apple Mail, Thunderbird and most classic clients export, which makes it the closest thing email has to a universal archive format.
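The difference between the variants is easiest to see in code. The following is a minimal sketch of the two most common quoting rules, not a full MBOX writer; the function names are made up for illustration. mboxo quotes only bare “From ” lines, while mboxrd also adds a “>” to lines that are already quoted, which is what makes it reversible.

```python
import re

# mboxrd: any body line made of zero or more '>' followed by "From "
# gets one extra '>' in front, so the operation can be undone exactly.
_FROM_LINE = re.compile(r"^(>*From )", re.MULTILINE)
_QUOTED_FROM_LINE = re.compile(r"^>(>*From )", re.MULTILINE)


def mboxrd_escape(body: str) -> str:
    """Quote a message body the mboxrd way."""
    return _FROM_LINE.sub(r">\1", body)


def mboxrd_unescape(body: str) -> str:
    """Strip exactly one level of mboxrd quoting."""
    return _QUOTED_FROM_LINE.sub(r"\1", body)


def mboxo_escape(body: str) -> str:
    """Quote a body the mboxo way: only bare "From " lines are touched."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)
```

With mboxo, a body line that originally read “>From ” and one that was quoted from “From ” end up identical, so the original text cannot always be recovered. With mboxrd, unescaping always restores the input.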
Maildir — one file per message
Created for the qmail server in 1995, Maildir takes the opposite approach: every message is its own file inside a folder, split across tmp/, new/ and cur/ subdirectories. A delivery is written in tmp/ and then renamed into new/; the mail client moves it to cur/ once it has seen it. Its great virtue is safety without locking: two processes can deliver mail at the same time without corrupting a shared file, the classic risk with MBOX. It’s the native format of servers like Dovecot and Courier. The trade-off is a huge number of tiny files, which some filesystems handle poorly.
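The lock-free delivery can be sketched in a few lines. This is an illustration under assumptions, not how any particular server implements it: the `deliver` helper and its file-naming scheme (timestamp, process ID, counter, hostname) are hypothetical, loosely following the usual Maildir convention of unique names. The key step is that the message is written completely in tmp/ and only then renamed into new/, so a reader never sees a half-written file.

```python
import itertools
import os
import socket
import time
from pathlib import Path

# Per-process counter so two deliveries in the same instant get distinct names.
_delivery_counter = itertools.count()


def deliver(maildir: Path, raw_message: bytes) -> Path:
    """Write one message into a Maildir without any locking (sketch)."""
    maildir = Path(maildir)
    for sub in ("tmp", "new", "cur"):
        (maildir / sub).mkdir(parents=True, exist_ok=True)

    # Hypothetical unique name: time, pid, counter and a sanitized hostname.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    unique = f"{time.time_ns()}.P{os.getpid()}Q{next(_delivery_counter)}.{host}"

    # 1. Write the whole message under tmp/. Mode "x" refuses to overwrite.
    tmp_path = maildir / "tmp" / unique
    with open(tmp_path, "xb") as handle:
        handle.write(raw_message)
        handle.flush()
        os.fsync(handle.fileno())

    # 2. Rename into new/. Within one filesystem the rename is atomic,
    #    so readers see either no file or the complete file.
    new_path = maildir / "new" / unique
    os.rename(tmp_path, new_path)
    return new_path
```

Because every delivery works on its own file, two processes calling `deliver` at once never touch the same bytes.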
EML — a single message, the way the internet defines it
EML is one message saved exactly as it travels: the raw MIME structure defined by the email RFCs (822 → 2822 → 5322). Headers at the top, then the body and attachments encoded in MIME parts. Because it is the on-the-wire format, almost everything can produce and read it — Outlook, Thunderbird, ticketing systems, scanners and mail servers. A folder of .eml files is the simplest possible archive.
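Because an .eml file is just the raw message, a standard MIME parser is all it takes to read one. Here is a small sketch using Python’s standard `email` package; `summarize_eml` is a hypothetical helper name, and it assumes the message has a plain-text or HTML body part.

```python
from email import policy
from email.parser import BytesParser


def summarize_eml(raw: bytes) -> dict:
    """Parse the bytes of an .eml file and return a few key fields."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)

    # Prefer the plain-text body, fall back to HTML if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))

    return {
        "from": str(msg["From"]),
        "subject": str(msg["Subject"]),
        "body": body.get_content() if body is not None else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }
```

Typical use would be `summarize_eml(Path("message.eml").read_bytes())`, where the file name is only an example.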
The proprietary databases
PST — Outlook’s personal store on Windows
PST (Personal Storage Table) is Microsoft Outlook’s on-disk database on Windows, built on the MAPI model. It holds far more than mail (calendar, contacts, tasks, notes) in one binary file. The original ANSI PST (Outlook 97–2002) capped out at 2 GB and was prone to corruption near that limit; the Unicode PST (Outlook 2003+) raised the default limit to 20 GB, and to 50 GB from Outlook 2010 onward. Fast and compact inside Outlook, but useless to other apps without conversion.
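You can tell the two PST generations apart without opening Outlook. The sketch below assumes the header layout published in Microsoft’s [MS-PST] specification: a 4-byte magic value “!BDN” at offset 0 and a 16-bit little-endian version field (wVer) at offset 10, where 14 or 15 means ANSI and 23 or higher means Unicode. `pst_flavor` is a hypothetical helper, and it only inspects the header; it does not validate the rest of the file.

```python
import struct


def pst_flavor(header: bytes) -> str:
    """Classify a header as 'ansi', 'unicode', 'unknown' or 'not-pst'."""
    # Every PST starts with the magic bytes "!BDN" (assumption: per [MS-PST]).
    if len(header) < 12 or header[:4] != b"!BDN":
        return "not-pst"

    # wVer: 16-bit little-endian integer at byte offset 10.
    (version,) = struct.unpack_from("<H", header, 10)
    if version in (14, 15):
        return "ansi"      # Outlook 97-2002 format, 2 GB ceiling
    if version >= 23:
        return "unicode"   # Outlook 2003+ format
    return "unknown"
```

Reading the first 12 bytes of the file is enough, for example `pst_flavor(open(path, "rb").read(12))`.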
OST — the offline cache
OST (Offline Storage Table) is PST’s sibling: a cached copy of a mailbox that lives on an Exchange or Microsoft 365 server. It exists so Outlook works offline and re-syncs later. Crucially, an OST is tied to its account and profile — it isn’t a portable archive, and orphaned OST files can be hard to open at all.
OLM — Outlook for Mac
OLM is the export/archive format of Outlook for Mac. Same intent as PST, different container — a proprietary bundle that, like PST, needs converting before anything but Outlook can read it.
MSG — a single Outlook message
MSG is one message exported from Outlook, stored as an OLE “compound file” (a mini-filesystem inside a file) carrying MAPI properties. It’s the Windows counterpart to EML, but binary and Microsoft-specific.
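The compound-file container is easy to recognise from its first eight bytes. A minimal check, with a hypothetical function name, might look like this. Note the limit of the test: the same container is used by other legacy Microsoft formats, so a match says “OLE compound file”, not “definitely an Outlook message”.

```python
# Signature that opens every OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(data: bytes) -> bool:
    """Return True if the data starts with the OLE compound file signature."""
    return data[:8] == OLE_MAGIC
```

An EML file, by contrast, starts with readable header text, so the check returns False for it.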
NSF — Lotus Notes / HCL Domino
NSF (Notes Storage Facility) is the database behind IBM/Lotus Notes (now HCL Notes and Domino), an entire application platform, not just mail. NSF archives still surface in long-running enterprises and, like the others here, require dedicated tools to extract.
The historical ones
- Eudora (1988–2006) was the dominant client of the early internet era. It stored mail in .mbx mailbox files (essentially MBOX) paired with a .toc table-of-contents index. Because the body is MBOX-like text, Eudora archives are usually recoverable today.
- Outlook Express used .dbx files (one per folder) on Windows through the late 1990s and 2000s; its successor Windows Mail / Live Mail moved to individual .eml files.
- Netscape/Mozilla mail, Evolution, Claws Mail, Entourage and others stored or exported MBOX, which is exactly why MBOX remains so widely readable.
Side-by-side
| Format | Structure | Open? | Portable archive? | Origin |
|---|---|---|---|---|
| MBOX | One text file, many messages | ✅ Open | ✅ Excellent | Unix, 1970s |
| Maildir | One file per message, in folders | ✅ Open | ✅ Good | qmail, 1995 |
| EML | One message, raw MIME | ✅ Open | ✅ Excellent | Internet RFCs |
| MSG | One message, OLE compound | ❌ Proprietary | ⚠️ Limited | Microsoft |
| PST | Binary database (mail + PIM) | ❌ Proprietary | ⚠️ Convert first | Outlook (Win) |
| OST | Cached server mailbox | ❌ Proprietary | ❌ Account-bound | Outlook/Exchange |
| OLM | Proprietary bundle | ❌ Proprietary | ⚠️ Convert first | Outlook (Mac) |
| NSF | Application database | ❌ Proprietary | ⚠️ Convert first | Lotus Notes |
What to choose for the long term
For archiving — keeping mail readable for decades — the open, text-based formats win every time. MBOX and EML have no vendor, no licence and no database engine to go obsolete: in twenty years they’ll still be plain text any tool can open. That’s why, if you ever get to pick an export format, MBOX (or a folder of EML files) is the safe choice, and why converting PST/OLM to MBOX future-proofs an Outlook archive.
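Moving between the two open formats needs nothing beyond a standard library. As a rough sketch, here is one way to explode an MBOX file into a folder of EML files with Python’s `mailbox` module; the `mbox_to_eml` name and the numbered output file names are arbitrary choices for the example.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path, out_dir) -> int:
    """Write each message of an MBOX file as its own .eml file (sketch)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    box = mailbox.mbox(str(mbox_path), create=False)
    count = 0
    try:
        for key in box.iterkeys():
            # get_bytes returns the message without its "From " separator line.
            # It does not undo ">From " quoting inside bodies.
            raw = box.get_bytes(key)
            (out / f"{count:06d}.eml").write_bytes(raw)
            count += 1
    finally:
        box.close()
    return count
```

The function returns the number of messages written, which is a quick sanity check against what the original client reported.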
Once your mail is in MBOX or EML, Mbox Viewer opens it on Mac and Windows — any size, read-only, fully offline. For the practical “which file can I open and how” version of this guide, see MBOX, EML, PST, OLM: Email Archive Formats Explained; to turn an Outlook file into MBOX, see how to convert PST/OLM to MBOX.