The Personal Computing Paradigm
E-Mail Archiving with Eudora and Mail.app
Back in November, Ken Gruberman wrote about how to slim down your Outlook Express or Entourage mail database. As the databases get large, the software slows down, the files become harder to back up, and the potential for data loss increases. One option, of course, is to delete mail older than a specified age. Another is to keep only a small number of old messages that you think you may need in the future. My philosophy is that it’s not worth the time and chance for mistakes to pick and choose which mail to save; I just save everything. By moving, or archiving, old messages outside the program’s mail database, you can make your mail client zippy again while still keeping the old messages available if you need to search them some day. This article will teach you how to archive old mail compactly and in a way that lets you find old messages when you need them.
Emailer, Entourage, Mailsmith, Outlook Express 5, and PowerMail are all clients that can benefit from e-mail archiving. All but Mailsmith store mail in a single database file. This means that they slow down when you have a lot of mail and that the database file is a single point of failure if you ever run into disk trouble. A single large file is hard to back up, yet being a single point of failure is all the more reason to back it up frequently. If you use one of these clients you should definitely think about e-mail archiving.
Mailsmith, unlike the others, uses one database file for each mailbox. This makes it less susceptible to the above problems, but there are several reasons to keep its mail store from growing too large. First, the program slows down as you add more mailboxes, even if you aren’t viewing them. Second, Mailsmith sometimes modifies mailbox files that you aren’t using, so your incremental backup software will waste time and space backing them up even if they haven’t really changed. Third, its database files use about five times more disk space than other clients’.
Entourage Email Archive and other AppleScript-based solutions purport to solve the above problems by saving your e-mail messages to one of the following:
- One text file (in various formats) per message.
- One text file (in various formats) per mailbox.
- A FileMaker Pro database.
A lot of people like to do this, but I don’t find any of these solutions acceptable, because:
- Archiving tens or hundreds of thousands of messages to individual text files will slim down your mail database but slow down your file system.
- It’s not very convenient to browse messages stored in text files.
- FileMaker databases can’t store more than about 64K per field, so long messages will be truncated. Also, FileMaker databases, in my experience, are slow and unreliable when they get to be very large.
- With the above methods it’s easy to lose track of attachment files and which messages they were attached to.
- The tools for searching text files and FileMaker databases are not optimized for searching e-mails.
I propose that it’s better to archive old mail into another e-mail client. Eudora and Mail.app are both available for free and both have many advantages when dealing with vast quantities of mail. There are many reasons why you might prefer to use another program for your day to day mail. However, the criteria that make a program good for downloading, reading, and composing messages are for the most part quite different from those that make a program good for storing and searching large quantities of messages. Eudora and Mail.app have much nicer interfaces than any FileMaker database I’ve seen. BBEdit may be good at searching folders of text files, but it can’t compete with programs that were designed for searching mail.
Both Eudora and Mail.app have Import features, for bringing in messages from your primary e-mail client. Most other clients can export in mbox format, which both Eudora and Mail.app know how to read. The mbox format is standard and compact. It preserves attachments, but you will lose client-specific metadata such as message colors and the markers that show whether you’ve replied to or forwarded a message. Emailer doesn’t have a built-in mbox export feature, but Robert Shapiro has written an AppleScript to do the job. Entourage’s mbox export feature is hidden; you can drag a mail folder to the Finder to save it as an mbox file.
Where the Mail is Stored
Of course you will want to back up your e-mail archive, and to do that you’ll need to know where it’s stored. The current version of Eudora stores its data in the Eudora Folder inside your Documents folder. The actual messages are stored in the Mail Folder inside the Eudora Folder. Mail folders in Eudora correspond to folders inside the Mail Folder. If you like, you can replace the Mail Folder or any of the folders therein with an alias to another folder. In this way, you can store portions of your mail archive in separate places. For instance, I keep a separate folder outside Documents for mailbox files that I don’t want synchronized with my iBook (for lack of disk space).
Mail.app stores its data in the Mail folder of your Library folder. The actual messages are stored in the Mailboxes folder inside the Mail folder. As with Eudora, mail folders in Mail.app correspond to Finder folders inside the Mailboxes folder. The alias trick also works with Mail.app, except that rather than creating an alias you have to create a symbolic link (basically a Unix-style alias). The easiest way I’ve found to do this is to use Path Finder, but you can also do it using the ln -s command in Terminal.
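The move-and-link step can also be scripted. What follows is only a minimal sketch in Python, not something Eudora or Mail.app provides: the function name and the paths in the usage note are hypothetical, and it assumes the mail program has been quit first.

```python
import os
import shutil
from pathlib import Path


def relocate_and_link(mailbox, new_home):
    """Move a mailbox into new_home and leave a symbolic link in its place.

    Roughly what "mv" followed by "ln -s" would do in Terminal.
    Both arguments are paths chosen by the caller (hypothetical here).
    """
    mailbox = Path(mailbox)
    new_home = Path(new_home)
    new_home.mkdir(parents=True, exist_ok=True)
    target = new_home / mailbox.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    # shutil.move also copes with a destination on another volume.
    shutil.move(str(mailbox), str(target))
    # Link the old location to the new one, using an absolute path.
    os.symlink(target.resolve(), mailbox, target_is_directory=target.is_dir())
    return target
```

For instance, relocate_and_link("Mailboxes/Old.mbox", "/Volumes/Archive/Mail") would move a mailbox named Old.mbox to a second disk and leave a link behind in the Mailboxes folder. Both names are invented for illustration.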
How the Mail is Stored
Both Eudora and Mail.app store messages in the standard mbox format. This has the advantages of being compact and human-readable. That is, even if an mbox file gets corrupted you’ll still be able to read the intact parts by opening the file in BBEdit. Since the mbox format is standard, you’ll surely be able to find programs that can read it after your current Mac has been retired.
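As an illustration of how approachable the format is, here is a small sketch that lists the sender and subject of every message in an mbox file using Python’s standard mailbox module. It assumes the file uses Unix line endings and conventional “From ” separator lines; files written by a particular mail client may deviate from that, so treat it as a starting point rather than a guaranteed reader. The function name is made up for this example.

```python
import mailbox


def summarize_mbox(path):
    """Return a (sender, subject) pair for each message in an mbox file."""
    # create=False: fail instead of silently creating an empty file.
    box = mailbox.mbox(path, create=False)
    try:
        return [(msg.get("From", ""), msg.get("Subject", "")) for msg in box]
    finally:
        box.close()
```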
Eudora and Mail.app also store auxiliary information for each mailbox file, such as colors for the messages, the sort order, which messages you’ve replied to, and a “table of contents” so that they can display the list of messages without having to load the whole mbox file. What’s really neat is that both store this auxiliary information separate from the message data.
Eudora can either store it in the resource fork of the mbox file or in a separate, adjacent file. I recommend the latter, which you can enable by clicking “Use old-style ‘.toc’ files” in the Miscellaneous Settings. Using separate .toc files means that if you change the sort order of a mailbox or color one of the messages, Eudora won’t have to modify the mbox file that contains the messages themselves. Since this (much larger) file hasn’t been modified, your backup software won’t waste time or space backing it up again. Plus, if you want to save some disk space you can easily delete the .toc file without losing any essential information.
Mail.app stores each mailbox in a file package whose name ends with “.mbox.” A file package acts like a file, but it’s really a folder. You can see what’s inside by control-clicking in the Finder and choosing Show Package Contents. Inside the package are an Info.plist file, which holds the mailbox’s sort order; an “mbox” file, which stores the message data (in mbox format); a table_of_contents file (much like a Eudora .toc file); and several mbox.SKindex files, which store the indexing data that Mail.app uses to make searching faster. You can delete the index and table of contents files to save disk space, and they will automatically be re-created as needed.
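To see how much space the regenerable files take before deleting anything, a sketch like the following can help. It only reports; it never deletes. The file names it looks for are the ones described above, matched without regard to case, and the function names are hypothetical.

```python
from pathlib import Path


def regenerable_files(package):
    """List the files in a .mbox package that Mail.app can re-create.

    Only the table_of_contents file and the mbox.SKindex files are
    reported; the mbox file and Info.plist are never included.
    """
    found = []
    for entry in sorted(Path(package).iterdir()):
        name = entry.name.lower()
        if name == "table_of_contents" or name.startswith("mbox.skindex"):
            found.append(entry)
    return found


def reclaimable_bytes(package):
    """Total size, in bytes, of the files regenerable_files() reports."""
    return sum(p.stat().st_size for p in regenerable_files(package))
```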
Eudora and Mail.app scan their folders (Mail Folder and Mailboxes, respectively) to determine which mailboxes are available and how they are organized into subfolders. You can quit the mail program and re-arrange the mailboxes and sub-folders, and the changes will be reflected in the mail program. (Don’t do this if you have filter/rules set up, but if you use Eudora or Mail.app for archival only, this typically won’t be a problem.) You can move a mailbox file out of Mail Folder or Mailboxes if you want to store it elsewhere. To “re-attach” it, just drag it back into the appropriate folder and it will show up when you re-launch the program. (Additionally, with Eudora you can double-click a mailbox file outside of Mail Folder to open it directly.) If you’d rather save space without moving anything out of Mail Folder or Mailboxes, you can compress select mailbox files that you seldom use. Making them available again is as simple as decompressing the files and re-launching the mail program.
Whether you prefer Eudora or Mail.app for viewing old mail is a matter of taste. Here are a few criteria that I considered. Mail.app lets you set separate fonts for the message list and the message contents (I recommend Osaka 9 and ProFont 9), while Eudora makes you use the same font for both. Both support two-pane browsing. Mail.app supports three-pane browsing (via the Mailboxes drawer), while Eudora has a separate Mailboxes window and a Mailbox menu. Both show the standard columns in mail list windows.
In Eudora, you can reverse the sort direction by option-clicking on a column header. In Mail.app, simply click the column header again (like in the Finder). Eudora lets you sort by multiple columns at once. For instance, to view messages grouped by subject and sorted (within each subject) by date, you can click on the Subject column and then shift-click on the Date column. You can option-click on part of a message in a message list to select all the messages that are similar to the part you clicked on. For instance, option-clicking in the Who column (not on the column header) will select all the messages sent by the person whose name you clicked on and group them together. Mail.app doesn’t support these fancy message list tricks, but it does have the option to color-code messages that are in the same thread as the selected message.
Eudora lets you label important messages in various colors. Mail.app only lets you mark them as flagged or unflagged, although you can use the color panel to temporarily color messages.
Eudora is much faster than Mail.app at mail browsing tasks such as switching between different mailboxes. It hardly seems to slow down at all as individual mailboxes grow larger, though there is a limit of about 32760 messages per mailbox. With Mail.app, on the other hand, there can be a long delay when you switch from one mailbox to another as the program loads the message list. You can reduce the delay by keeping fewer messages in each mailbox and by leaving mailboxes sorted by Number or Status, to reduce the time it takes Mail.app to sort the message list. You can click the Stop button when Mail.app starts “Updating color for messages.” Also, if you know that you will want to view the same mailbox again, you can leave its window open and make a new Viewer window for viewing other mailboxes; that way, you can avoid the delay as Mail.app reloads the first mailbox. Although Mail.app is much slower than Eudora in absolute terms, its performance is improving with each release and it has the advantage of being very well threaded. You can browse a mailbox, search in another window, and transfer messages from one mailbox to another, all while Mail.app is indexing yet another mailbox. If your Mac has multiple processors, Mail.app can take advantage of them.
Searching with Eudora
Eudora has a sophisticated search feature for finding messages. You can select one or more mailboxes to restrict the search to only those mailboxes, and you can perform more than one search at a time. The search window itself lets you specify multiple criteria. Each criterion can match against a field such as Body, From, or Date, and you can search for words, phrases, and regular expressions. You can require that Eudora find messages that match all the criteria, or ones that match any one criterion. The search options are almost as powerful as those in Mailsmith, and searching is fast even though Eudora doesn’t rely on content indexes.
Searching with Mail.app
Mail.app’s search features are not as powerful as Eudora’s, but they are more powerful than they appear at first glance. As with Eudora, you can select one or more mailboxes to restrict the search to those mailboxes, and you can perform more than one search at once by opening multiple viewer windows. The pop-up menu at the left of the search box lets you specify the type of search as well as whether to search the selected mailboxes or all mailboxes.
An Entire Message search looks in the message headers and the message body. You type some words into the search box and Mail.app shows the matching messages, ordered by relevance. To get the most out of Entire Message searches, you need to know a little bit about how they work.
You are essentially searching by word. Mail.app will only find matches that begin at the start of a word. For instance, if you search for “str” you will find messages containing “structure” and “street” but not “astronomy.” An exception is made for words that contain capital letters: searching for “str” would also find “MyStreet.” As you can see, search terms are case-insensitive.
Mail.app does not consider punctuation to be part of words. Searching for “example@domain.com” is equivalent to typing “example domain com.” Since punctuation is ignored, you can’t search for technical terms like “<h1>” or “$/.”
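The word-matching behavior described above can be modeled in a few lines of Python. This is only a sketch of the observed behavior, not Mail.app’s actual implementation, which isn’t documented here. It assumes plain ASCII text, and the function names are invented for the example.

```python
import re

# A match may begin at the start of a run of letters and digits, or at a
# capital letter that follows a lowercase letter (as in "MyStreet").
_WORD_START = re.compile(r"(?<![A-Za-z0-9])[A-Za-z0-9]|(?<=[a-z])[A-Z]")


def query_terms(query):
    """Split a query into lowercase words, discarding punctuation."""
    return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", query)]


def term_matches(term, text):
    """True if term matches text starting at one of the word starts."""
    lowered = text.lower()
    term = term.lower()
    return any(
        lowered.startswith(term, m.start()) for m in _WORD_START.finditer(text)
    )
```

Under this model, query_terms("example@domain.com") yields the three words example, domain, and com, and term_matches("str", "astronomy") is false while term_matches("str", "MyStreet") is true.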
In iTunes, entering several words separated by spaces finds only those songs that match all the words. In Mail.app, the opposite is the case; the search will find messages that match any of the words. The message with the highest relevance will not necessarily contain all the words; instead, it might have high relevance because it contains many occurrences of one of the words.
If you want to find messages that match all the words, you can separate them with “and.” For instance, you could search for “example and domain and com.” Note that this will not restrict the results to messages that contain the three words in that order; they can appear anywhere in the message so long as all are present. There is no way to do a phrase search, i.e. find messages that contain a sequence of words like “Mary had a little lamb.” In addition to “and,” you can connect words with “or,” and you can group them with parentheses. Searching for “screen and (iMac or iBook)” would find messages that contain the word “screen” as well as either “iMac” or “iBook” (or both).
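The “and,” “or,” and parentheses rules lend themselves to a small recursive evaluator. The sketch below is a model of the behavior described here, not Mail.app’s code. It treats words with nothing between them as joined by “or,” matches each term against the start of a word, and assumes that “and” binds more tightly than “or,” which is a guess on my part. It ignores relevance ranking entirely.

```python
import re


def matches(query, message):
    """Evaluate an Entire Message style query against a message's text."""
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9]+", message)]
    tokens = re.findall(r"\(|\)|[A-Za-z0-9]+", query)
    pos = 0

    def peek():
        return tokens[pos].lower() if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        value = parse_and()
        # Adjacent terms with no connector are treated as "or".
        while peek() is not None and peek() != ")":
            if peek() == "or":
                pos += 1
            rhs = parse_and()  # always parse, so the position advances
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while peek() == "and":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        tok = peek()
        if tok is None or tok == ")":
            raise ValueError("expected a word or an opening parenthesis")
        pos += 1
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        # A term matches if some word in the message begins with it.
        return any(w.startswith(tok) for w in words)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return result
```

With this model, matches("screen and (iMac or iBook)", "The iBook screen is bright") is true, while matches("screen and iMac", ...) on the same text is false.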
Sometimes Mail.app will get confused and an Entire Message search won’t find messages that it should. In this case, you can often fix the problem by rebuilding the index file that Mail.app uses for Entire Message searches. The Rebuild Mailbox command does not do this. Instead, you should make a new mailbox and move all the messages to it, or open the mailbox’s file package in the Finder and delete the mbox.SKindex files.
Besides Entire Message searches, Mail.app offers To, From, and Subject searches. These searches work differently from Entire Message searches. They are not word-based, “and” and “or” have no special meaning, and spaces and punctuation are not ignored. This means that you can search for phrases.
Advanced Searching with Mail.app
Mail.app’s search features are quick and easy to use, but they are not as powerful as you might wish. Here are some workarounds for doing advanced searches in Mail.app.
If you know the mailbox that contains the message you are looking for, it may be easiest to import that mailbox into an e-mail client that has a better search feature. This is particularly easy with Mailsmith, as you can simply drag Mail.app’s .mbox file into the Mailsmith mailbox list.
An intriguing option, if you don’t mind pre-release software, is Steven Frank’s Emila. Unfortunately, the current version of Emila must import all your Mail.app mailboxes at once. This takes a long time and uses a lot of memory and disk space.
The Mail.app that comes with Mac OS X 10.2 has much improved rules that support multiple criteria. You can mimic many of Eudora’s complex searches by creating a rule that flags messages that match the criteria of your search. You can then select the messages you want to search, use the Apply Rules to Selection command, and then sort by the Flags column to see which messages matched. Of course, you will want to first disable any other rules that you have.
Mail.app’s rules don’t have a Date criterion, but you can restrict matches to a particular date range by writing an AppleScript that unflags messages outside the desired date range. (Or, if there aren’t many matches, you could sort by Date and find the messages you want by inspection.)
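If you’d rather not script Mail.app itself, another route is to read the mailbox’s mbox file directly and filter by date outside the program. The sketch below is an alternative to the AppleScript approach, not a translation of it: it uses Python’s standard mailbox module, compares only the calendar date in each message’s Date header (as written by the sender, without time zone conversion), and skips messages whose Date header is missing or unparseable. The function name is hypothetical.

```python
import mailbox
from email.utils import parsedate_to_datetime


def subjects_in_range(path, start, end):
    """Return subjects of messages dated from start through end, inclusive.

    start and end are datetime.date objects.
    """
    hits = []
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            raw = msg.get("Date")
            if raw is None:
                continue
            try:
                sent = parsedate_to_datetime(raw).date()
            except (TypeError, ValueError):
                continue  # unparseable Date header
            if start <= sent <= end:
                hits.append(msg.get("Subject", ""))
    finally:
        box.close()
    return hits
```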
Eudora or Mail.app?
Both Eudora and Mail.app can handle large amounts of mail and store it compactly. Eudora is much faster and provides more powerful searching options. Mail.app’s iTunes-like search interface is easier to use for quick searches, and if you have trouble remembering the message you’re searching for you may find its relevance-ranked searches helpful. In my experience, Eudora sometimes parsed imported mbox files incorrectly, making it impossible to view (or search for) parts of certain messages. Also, it sometimes didn’t let me extract the attachments from imported messages. Most people probably won’t experience these troubles, but they caused me to switch to Mail.app. Overall, I think Eudora is better suited for e-mail archiving, but you can’t go wrong with either program.