Streaming parser
A parsing technique that reads a file incrementally in small chunks rather than loading the entire file into memory at once, enabling tools to open and index very large MBOX files — tens or hundreds of gigabytes — with low memory usage.
A streaming parser processes a file as a sequence of bytes or lines, maintaining only a small buffer and the current parsing state at any moment. This contrasts with a whole-file approach that reads the entire file into memory before parsing begins. For MBOX files, a streaming parser can identify message boundaries (the "From " separator lines), extract headers, and record byte offsets without ever holding more than one message in memory at a time.
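As a rough illustration of the boundary scan described above, the following Python sketch reads a file one line at a time and records the byte offset of each separator. The function name iter_message_offsets is hypothetical, and the rule that a "From " line only counts when it follows a blank line (or starts the file) is an assumption borrowed from common MBOX conventions, not a description of any particular tool's parser.

```python
import io
from typing import BinaryIO, Iterator


def iter_message_offsets(stream: BinaryIO) -> Iterator[int]:
    """Yield the byte offset of each "From " separator line (hypothetical helper)."""
    offset = 0
    prev_blank = True  # assumption: the start of the file counts as a boundary
    for line in stream:  # binary streams yield one line at a time
        # Assumption: a separator must follow a blank line, so a body line
        # that merely starts with "From " mid-paragraph is not a boundary.
        if prev_blank and line.startswith(b"From "):
            yield offset
        prev_blank = line in (b"\n", b"\r\n")
        offset += len(line)  # only the running offset is kept, never the file
```

Only the current line and two small state variables are live at any moment, which is what keeps memory flat as the file grows. With a real archive the stream would come from open(path, "rb").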
The practical benefit is that file size no longer limits what you can open. A 50 GB MBOX export from a years-long Gmail archive opens the same way as a 1 MB test file: the parser reads through it sequentially, building a lightweight index of message positions, and then seeks directly to any message when you select it. Parsing memory stays roughly constant regardless of archive size; only the index grows, and it grows with the number of messages rather than their total size.
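The seek step can be sketched in a few lines of Python. This is a minimal example under stated assumptions: read_message is a hypothetical helper, and the index is taken to be a plain sorted list of message start offsets plus the total file size.

```python
import io
from typing import BinaryIO, Sequence


def read_message(stream: BinaryIO, offsets: Sequence[int], i: int, total_size: int) -> bytes:
    """Return the raw bytes of message i using only its recorded offsets."""
    start = offsets[i]
    # A message ends where the next one begins, or at end of file.
    end = offsets[i + 1] if i + 1 < len(offsets) else total_size
    stream.seek(start)  # jump directly; nothing before `start` is read
    return stream.read(end - start)
```

The cost of opening a message here depends on that message's size, not on its position in the archive or on the archive's total size.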
Mbox Viewer's streaming parser is designed for this use case. On the first open of an MBOX file, it streams through the file to build a binary index recording the byte offset and key metadata for each message. On subsequent opens, the index is loaded in under a second, so the parser reads only the messages you actually open.
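To make the idea of a persisted index concrete, here is a small Python sketch of writing and reloading one. The on-disk layout (a 4-byte tag, an entry count, then fixed-size offset and length pairs) and the names MAGIC, write_index, and load_index are invented for illustration; Mbox Viewer's actual index format is not documented here and is likely to differ, for example by storing header metadata as well.

```python
import io
import struct
from typing import BinaryIO, List, Tuple

MAGIC = b"MBIX"                 # hypothetical tag, not a real file signature
HEADER = struct.Struct("<4sQ")  # tag, number of entries (little-endian)
ENTRY = struct.Struct("<QQ")    # message byte offset, message length


def write_index(out: BinaryIO, entries: List[Tuple[int, int]]) -> None:
    """Write (offset, length) pairs in the hypothetical layout above."""
    out.write(HEADER.pack(MAGIC, len(entries)))
    for offset, length in entries:
        out.write(ENTRY.pack(offset, length))


def load_index(src: BinaryIO) -> List[Tuple[int, int]]:
    """Read the pairs back without touching the MBOX file itself."""
    header = src.read(HEADER.size)
    if len(header) != HEADER.size:
        raise ValueError("truncated index header")
    magic, count = HEADER.unpack(header)
    if magic != MAGIC:
        raise ValueError("not an index file")
    payload = src.read(count * ENTRY.size)
    if len(payload) != count * ENTRY.size:
        raise ValueError("truncated index body")
    return list(ENTRY.iter_unpack(payload))
```

Fixed-size entries keep loading cheap, since the whole index can be read in one call and unpacked without parsing. A practical index would probably also record something like the archive's size or modification time so that a stale index can be detected, but that is outside this sketch.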
Related terms
Binary index: A compact index file that Mbox Viewer writes alongside an MBOX archive after the first parse, storing message byte offsets and metadata to enable near-instant reopens without re-scanning the entire file.
MBOX: A plain-text file format that stores multiple email messages concatenated together, each beginning with a "From " separator line. It is the format Google Takeout produces when you export your Gmail archive.