Message-ID
A globally unique identifier assigned to each email message, specified in the Message-ID header. It is used to track messages, build conversation threads, and detect duplicates when merging archives.
The Message-ID header (RFC 5322, section 3.6.4) contains a string that is intended to be unique across all email ever sent. It is formatted as a local-part@domain string enclosed in angle brackets. The identifier is normally generated by the sender's mail client when the message is composed, or added by the submission server if the client omits it. Replies include the original message's Message-ID in their In-Reply-To and References headers to link the conversation.
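The linkage between a reply and its parent can be sketched in Python with the standard library's email package. This is a minimal illustration, not a full parser: the helper names extract_ids and parent_id, the regular expression, and the example.com addresses are assumptions made for the example. It treats the last References entry as the direct parent and falls back to In-Reply-To.

```python
import re
from email import message_from_string
from email.utils import make_msgid

# Loose pattern for <left@right> tokens; a simplification of the RFC 5322 grammar.
MSGID_RE = re.compile(r"<[^<>\s]+@[^<>\s]+>")


def extract_ids(header_value):
    """Return every <left@right> token in a header value, in order."""
    return MSGID_RE.findall(header_value or "")


def parent_id(msg):
    """Guess the direct parent: last References entry, else first In-Reply-To."""
    refs = extract_ids(msg.get("References"))
    if refs:
        return refs[-1]
    in_reply_to = extract_ids(msg.get("In-Reply-To"))
    return in_reply_to[0] if in_reply_to else None


# Build a hypothetical reply that points back at an original message.
original_id = make_msgid(domain="example.com")
raw_reply = (
    "From: bob@example.com\n"
    f"Message-ID: {make_msgid(domain='example.com')}\n"
    f"In-Reply-To: {original_id}\n"
    f"References: {original_id}\n"
    "Subject: Re: hello\n"
    "\n"
    "Thanks for the note.\n"
)
reply = message_from_string(raw_reply)
print(parent_id(reply) == original_id)
```

A threading tool would call parent_id on every message and attach each one under the message whose Message-ID matches.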
Message-ID is the primary key used by threading algorithms to reconstruct conversations. It is also used during deduplication: when merging two MBOX files that may overlap — for example, two Google Takeout exports from different dates — comparing Message-IDs allows the application to identify and skip messages that already exist in the target archive.
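The deduplicating merge described above might look roughly like the following, using Python's standard mailbox module. The function name merge_mbox and its return value are assumptions for this sketch. It omits file locking, which a real tool would want, and it always copies messages that have no Message-ID, leaving those to whatever fallback matching the tool applies.

```python
import mailbox


def merge_mbox(source_path, target_path):
    """Append messages from source to target, skipping Message-IDs already present.

    Returns the number of messages appended. No file locking is done here.
    """
    target = mailbox.mbox(target_path)
    source = mailbox.mbox(source_path, create=False)
    try:
        # Index the Message-IDs that the target archive already holds.
        seen = set()
        for existing in target:
            mid = (existing["Message-ID"] or "").strip()
            if mid:
                seen.add(mid)

        added = 0
        for msg in source:
            mid = (msg["Message-ID"] or "").strip()
            if mid and mid in seen:
                continue  # duplicate of a message already in the target
            target.add(msg)
            if mid:
                seen.add(mid)
            added += 1
        target.flush()
        return added
    finally:
        source.close()
        target.close()
```

Calling merge_mbox("newer-export.mbox", "archive.mbox") (hypothetical file names) would append only the messages that the archive does not already contain.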
In rare cases, Message-IDs may be missing (from very old messages) or duplicated (from buggy sending software). A robust archive tool handles these edge cases by falling back to heuristic matching on other headers such as Date, From, and Subject when a Message-ID is absent or unreliable.
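One way to express this fallback is a function that returns a comparison key: the Message-ID when present, otherwise a hash of normalized Date, From, and Subject values. The name dedup_key, the normalization rules, and the sample messages are assumptions chosen for illustration. The sketch covers only the missing-ID case; detecting duplicated IDs from buggy senders would need an extra check, such as comparing these same headers between messages that share an ID.

```python
import hashlib
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def dedup_key(msg):
    """Return ("id", message_id) or, if absent, ("heuristic", hash of other headers)."""
    mid = (msg.get("Message-ID") or "").strip()
    if mid:
        return ("id", mid)

    # Normalize Date; keep the raw text if it cannot be parsed.
    date_raw = msg.get("Date") or ""
    try:
        date_norm = parsedate_to_datetime(date_raw).isoformat()
    except (TypeError, ValueError):
        date_norm = date_raw.strip()

    # Compare senders by address only, ignoring display name and case.
    sender = parseaddr(msg.get("From") or "")[1].lower()
    # Collapse whitespace and case differences in the subject.
    subject = " ".join((msg.get("Subject") or "").split()).lower()

    digest = hashlib.sha256(
        "\n".join([date_norm, sender, subject]).encode("utf-8")
    ).hexdigest()
    return ("heuristic", digest)


# Two hypothetical copies of one message, neither carrying a Message-ID.
copy_a = message_from_string(
    "Date: Mon, 01 Jan 2001 10:00:00 +0000\n"
    "From: Alice <alice@example.com>\n"
    "Subject: Quarterly  report\n\nbody\n"
)
copy_b = message_from_string(
    "Date: Mon, 01 Jan 2001 10:00:00 +0000\n"
    "From: ALICE@example.com\n"
    "Subject: quarterly report\n\nbody\n"
)
print(dedup_key(copy_a) == dedup_key(copy_b))
```

Because the heuristic can collide for distinct messages sent in the same second with the same subject, a cautious tool might also compare body content before discarding anything.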
Related terms
Email headers (In-Reply-To and References) that link a reply to the message it responds to, enabling mail clients and archive tools to group related messages into conversation threads.
The process of detecting and removing duplicate email messages from an archive, typically by comparing Message-ID values, to avoid redundancy when merging multiple MBOX files.
The process of grouping related email messages into conversations by following In-Reply-To and References header links, typically using the JWZ algorithm. The algorithm itself does not cap nesting depth; any limit, such as four levels, is a choice of the tool displaying the thread.