I’m surprised how frequently I’m engaged to collect the contents of Gmail accounts in e-discovery, especially when the account is being collected solely for preservation, and there’s no compelling reason to entrust the task to a neutral. I appreciate that hiring an expert offers greater assurance that the task will be approached with skill and experience, as well as that integrity of process can be supported by the testimony of someone unconnected with the client or law firm. But, though collecting and validating the complete contents of a Gmail account can be tricky and tedious, it’s not all that difficult to do. Happily, unless you do something really dumb, it’s unlikely that even a botched Gmail collection effort will harm the contents of the account.
For those seeking a low-cost, defensible mechanism to preserve Gmail content, this (long, dry) post lays out a detailed methodology for collection and preservation of the contents of a Gmail webmail account in the static form of a standard Outlook PST container file. I will address various technical considerations, but few legal ones. Whether or not the methods described in this post are legally sufficient in your case or compliant with Gmail’s terms of service is not my call, and I offer no opinions about same.
[NOTE TO READERS 10/14/14: When I wrote this post, there was not yet a backup capability built into Gmail. Google now makes data tools available that support the creation of a rich archive of a user’s Google content, including Gmail, Contacts, Calendar and Google Drive. You can find it in the Archive section of https://www.google.com/settings/datatools when logged into Google and can read more about it here.]
Credentials and Consent
To download the contents of a Gmail account, you’re going to need both account credentials (user name and password) and express, informed, contemporaneous consent, preferably in writing. Sure, you can get into an account with credentials alone, but absent express, informed, contemporaneous consent granting access for the purpose of collecting the contents of the account, you’re flirting with disaster, or even incarceration. There’s no fooling around on this. Don’t rationalize that consent is implied because you’ve long known the password or that some other noble end justifies the shabby means. Without express, informed, contemporaneous consent, DO NOT access someone else’s Gmail account using their credentials.
A satisfactory written consent doesn’t have to be complex or legalistic; it might say something like:
“I authorize [PERSON COLLECTING] to access my e-mail account(s) using the credentials I supply. I understand that my e-mail and other electronically stored information will be collected from my e-mail account(s) for the purpose of preserving and producing the information in connection with certain claims or litigation, and that the e-mail and information will be furnished to my attorneys. The permission granted herein to access and collect from my account(s) shall expire and be withdrawn on [DATE].”
Tips on Credentials
- Consider having the account holder change their customary Gmail password to a temporary password for the duration of the account collection effort. That way, when collection is complete, the account holder can return to using his or her preferred password without fear that it’s been compromised by disclosing it for purposes of collection.
- Test the credentials promptly to ensure they work. Often, I’m furnished credentials that don’t work until we figure out that the uppercase “I” (“eye”) is actually a lowercase “l” (“ell”) or that V (Victor) must have sounded a whole lot like a B (Bravo) when counsel wrote it down.
- Some users configure Gmail for 2-step verification (also called two-factor authentication). This more secure log in method ties Gmail access to specific machines unless an additional code obtained by phone or text messaging is also supplied. In that event, you can have the user generate what Google calls an “application-specific password” and supply it to the person doing the collection. To set up an application-specific password, the user should visit his or her Google Account settings page at https://www.google.com/settings/account, then, on the left, click Security. Under the “2-step verification” topic, the user should click “Manage your application specific passwords,” enter a descriptive name for the password (e.g., IMAP collection), then click “Generate application-specific password.” The 16 character application-specific password generated in this way can then be revoked by the user once collection is complete.
IMAP Collection Tools
Gmail supports downloading of account contents via POP3 and IMAP protocols. Because POP3 is limited with respect to collection from folders beyond the Inbox, I prefer IMAP as my Gmail collection protocol.
There are several low-cost tools well-suited to collection of IMAP messaging, e.g., Prooffinder and Aid4mail. But I will outline how to do a Gmail collection using Outlook 2010 via IMAP because most Windows users already have a copy of Microsoft Outlook 2010 and the Outlook .PST container format is universally supported by all competent e-discovery service providers and advanced review platforms. All of the capabilities and configuration settings I address respecting Outlook 2010 are also present in Outlook 2007; you just won’t find them all in precisely the same place.
Will Outlook Change the Evidence?
Yes, it will, somewhat. Outlook cannot replicate, feature-for-feature, all of the content and appearance of Gmail. Outlook will not thread messages as conversations in the same manner as Google; but then, who can do that as well as Google? From the standpoint of the integrity of message bodies, header data and attachments, Outlook will do a bang-up job preserving the content and features that tend to matter in e-discovery. In short, the things lawyers and judges care about being preserved will be preserved in a properly collected Outlook .PST container file.
Set Up Outlook Accounts
Since you will want Outlook to create a separate .PST container for the contents of each Gmail account, you will need to set up separate Outlook accounts for each account to be collected. To do so in Outlook, go to File>Account Settings and, in the E-mail tab, select “New.”
In the next screen, enter the name of the account custodian from whom you are collecting and that custodian’s Gmail address. Don’t worry about adding passwords here. Check the radio box labeled, “Manually configure server settings or additional server types” and click “Next.” In the next dialogue box, select “Internet E-mail” and “Next.”
In the Internet E-mail Settings dialogue box, the correct name and e-mail address for the Gmail account holder should already be populated. If not, add it. Be sure the Account Type is set to IMAP (not POP3) and set the Incoming mail server to imap.gmail.com and the Outgoing mail server to smtp.gmail.com. In the Logon Information area, add the Gmail account holder’s user name and password. Be sure to clear the check in the box for “Test Account Settings by clicking the Next button.”
Don’t click Next; instead, click the “More Settings” button.
In the More Internet E-mail Settings dialogue box, select the “Advanced” tab and set the Incoming and Outgoing Server Port Numbers as follows:
Incoming server (IMAP): 993
Outgoing server (SMTP): 465
Set both encrypted connection settings to SSL
Click “OK” to return to the Internet E-mail Settings dialogue box, and now click “Next.”
With luck, you will be greeted by the “Congratulations!” dialogue box and can now either add the next Gmail account to be collected (Add Another Account) or click “Finish.”
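For readers who would like to confirm the credentials and server settings outside Outlook, here is a minimal Python sketch using the standard library's imaplib module. The names GmailServerSettings and credentials_work are my own illustrations, not part of any tool discussed here. The ports shown are Gmail's standard SSL ports (993 for IMAP, 465 for SMTP). The function only logs in and out; it opens no folder and so should not alter anything in the account.

```python
import imaplib
from dataclasses import dataclass


@dataclass(frozen=True)
class GmailServerSettings:
    """Gmail server settings for an SSL connection (hypothetical helper class)."""
    imap_host: str = "imap.gmail.com"
    imap_port: int = 993   # IMAP over SSL
    smtp_host: str = "smtp.gmail.com"
    smtp_port: int = 465   # SMTP over SSL


def credentials_work(user: str, password: str,
                     settings: GmailServerSettings = GmailServerSettings()) -> bool:
    """Return True if the credentials open an IMAP session, else False.

    Needs network access. For accounts using 2-step verification, pass the
    application-specific password rather than the account password.
    """
    try:
        conn = imaplib.IMAP4_SSL(settings.imap_host, settings.imap_port)
    except (OSError, imaplib.IMAP4.error):
        return False  # could not reach or talk to the server at all
    try:
        conn.login(user, password)
        return True
    except imaplib.IMAP4.error:
        return False  # server rejected the credentials
    finally:
        try:
            conn.logout()
        except (imaplib.IMAP4.error, OSError):
            pass  # connection already gone; nothing left to clean up
```

A call such as credentials_work("custodian@gmail.com", "temporary-password") (both values are placeholders) is a quick way to catch the "eye" versus "ell" problem before you start configuring Outlook.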
Three Hurdles: Message Bodies, Pictures and Folders
Ideally, having Outlook download the complete contents of a Gmail collection would be a foolproof, set-it-and-forget-it endeavor. My experience with large Gmail collections has never been so. With care, it works; but it takes longer than I like and, if you’re not careful, you may imagine you’ve captured all the content but have left a lot behind.
For expediency in reviewing e-mail, Outlook treats message bodies and message headers separately when accessing IMAP accounts. Message headers only hold the dog tag data for the message (i.e., To, From, Date, CC and Subject). Because message headers download rapidly and suffice to distinguish urgent messages from less pressing missives, Outlook and other mail clients initially download just message headers and do not acquire message bodies or attachments until you open or preview the message. On a fast internet connection, the user is largely oblivious to the short delay this entails and benefits from speedier access overall; but, when preserving data for discovery, don’t assume you’ve acquired the entire contents of every message (header, message body and attachments) when all you may have are the headers for some or all of the items in the account.
The difference is significant in terms of risk, but also in terms of time. It takes longer—sometimes days longer—to download the entire contents of a large Gmail collection using Outlook and IMAP versus message headers alone. There are steps you can take to minimize the risk, but there’s not much you can do to shorten the time.
Downloading Message Bodies: You can change Outlook’s default behavior in downloading only message headers to downloading the full contents of messages. To do so, select Send/Receive from the Outlook menu bar and click on Send/Receive Groups, choosing “Define Send/Receive Groups.” My practice is to define separate Send/Receive groups for each separate account (or related group of accounts, e.g., husband and wife) for each matter. If you’re using a never-configured copy of Outlook, doing so isn’t essential; however, as I may use Outlook to collect the contents of different accounts in different cases, I find it efficient to create separate Send/Receive groups so I can initiate a collection (or an update) of one account without triggering action in others.
Highlight the Send/Receive group you want to configure, then click “Edit.” In the Send/Receive Settings dialogue box that appears, select the mail account you want to configure and, under the option, “Receive Mail Items,” select “Use the custom behavior defined below.” Under Folder Options, select the folders you want to collect by checking their boxes (be sure to select “All Mail” for a Gmail account) and choose the option, “Download complete item including attachments.” Click OK to save your changes. Close Send/Receive Groups.
In a perfect world, the settings just applied would suffice to retrieve message headers, bodies and attachments; however, my experience is that you must carefully ascertain whether you have all the components of the messages in the account before assuming acquisition is complete. Fortunately, Outlook makes it easy to sort by message header status, segregating headers with contents that have been downloaded from those merely available for download. For the latter items, I’ve ofttimes had to get the job done by selecting these stubborn items and clicking, “Mark to Download” on the ribbon. Until all items show up as downloaded items, you’re not done.
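The header-versus-body distinction is not peculiar to Outlook; it is built into IMAP itself, where a client asks for headers and for complete messages with different fetch requests. The Python sketch below, using the standard imaplib and email modules, illustrates the difference. It is not how Outlook works internally, and the names fetch_raw and has_body are hypothetical helpers of my own. The PEEK form of the request is used because it does not mark the message as read on the server.

```python
import imaplib
from email import message_from_bytes

# IMAP fetch requests. The PEEK variants do not set the \Seen (read) flag.
HEADERS_ONLY = "(BODY.PEEK[HEADER])"   # To, From, Date, Subject, etc.
FULL_MESSAGE = "(BODY.PEEK[])"         # headers, body and attachments


def fetch_raw(conn: imaplib.IMAP4, msg_num: str, what: str) -> bytes:
    """Fetch one item from the currently selected folder as raw bytes.

    Assumes the caller has already logged in and selected a folder,
    ideally read-only, e.g. conn.select('"[Gmail]/All Mail"', readonly=True).
    """
    status, data = conn.fetch(msg_num, what)
    if status != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"fetch failed for message {msg_num}")
    return data[0][1]


def has_body(raw: bytes) -> bool:
    """Return True if the raw item carries content beyond its headers."""
    msg = message_from_bytes(raw)
    if msg.is_multipart():
        return True  # at least one body part or attachment was parsed
    payload = msg.get_payload()
    return bool(payload and payload.strip())
```

An item retrieved with HEADERS_ONLY will fail the has_body check, which is the scripted equivalent of a header in Outlook that has not yet been marked "downloaded."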
Downloading Embedded Pictures: As a security measure, Outlook is configured by default to not download pictures embedded in HTML message bodies. Accordingly, when collecting mail using Outlook, change the program settings in File>Options>Trust Center>Trust Center Settings to force the download of pictures by unchecking the setting, “Don’t download pictures automatically in HTML e-mail messages or RSS items.”
Subscribing to IMAP Folders: POP3 will typically download only from the Inbox, and even then, may collect incompletely. IMAP can collect from folders beyond the Inbox and can reproduce the folder structure. To ensure that Outlook is both capturing the account folder structure and downloading the content of the various folders, it may be necessary to collect and subscribe to the folders in Outlook. To do this, locate the account name in the far left Navigation Pane in Outlook and right click on it. Select IMAP Folders from the menu. Click the Query button to download a list of folders. Check the contents of the “Subscribed” tab to ensure Outlook is subscribed to all of the folders whose contents you wish to collect. If not, select the folder from the “All” tab, highlight the unsubscribed folder and click the “Subscribe” button.
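What the Query button retrieves is the server's answer to an IMAP LIST request. As a rough illustration, the Python sketch below parses the kind of lines that imaplib's list() method (all folders) or lsub() method (subscribed folders) returns. The function names are hypothetical, and the parser assumes the simple quoted, ASCII-only format Gmail typically returns; folder names with non-English characters are encoded differently and are not decoded here.

```python
import re

# Matches lines such as: (\HasNoChildren) "/" "INBOX"
_LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) "(?P<delim>[^"]*)" (?P<name>.+)')


def parse_list_line(line: bytes) -> tuple[list[str], str]:
    """Split one LIST response line into its flags and folder name."""
    match = _LIST_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized LIST line: {line!r}")
    flags = match.group("flags").decode("ascii").split()
    name = match.group("name").decode("ascii").strip('"')
    return flags, name


def collectable_folders(lines: list[bytes]) -> list[str]:
    """Return folder names that can hold messages.

    Folders flagged \\Noselect (such as the "[Gmail]" parent) are
    containers only and are skipped.
    """
    folders = []
    for line in lines:
        flags, name = parse_list_line(line)
        if "\\noselect" not in (flag.lower() for flag in flags):
            folders.append(name)
    return folders
```

Comparing the result for list() against the result for lsub() is one way to spot a folder that exists in the account but to which the client is not subscribed.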
One of Gmail’s great strengths online is also something of an Achilles’ heel when doing Outlook IMAP collections. When you folder a message in Gmail, the message is only virtually added to a folder. But, when you replicate the folder structure using Outlook, every message in every folder is physically duplicated within the All Mail folder, and any message that populates multiple folders in Gmail online is replicated in the various folders offline. The result is that collecting from Gmail takes longer (as identical items are repeatedly downloaded to the replicated folders in which they reside) and produces a much larger Outlook PST offline than the same volume of e-mail online.
So, you face something of a Hobson’s choice when collecting Gmail. If you want the folder structure replicated, it comes at the cost of significant delay and redundancy. To collect more quickly and efficiently from All Mail alone, you lose the folder structure. So far, this unfortunate trade-off appears unavoidable.
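To get a feel for how much redundancy the replicated folders introduce, you can compare the number of items collected with the number of distinct messages. The Python sketch below is a simple illustration under one loud assumption: that each message carries a Message-ID header value that is present and unique, which is usual but not guaranteed. The function name and the sample identifiers are hypothetical.

```python
def duplication_summary(folders: dict[str, list[str]]) -> tuple[int, int]:
    """Return (total items collected, distinct messages).

    `folders` maps each folder name to the Message-ID values of the
    items collected from it. Assumes Message-IDs are present and unique.
    """
    total = sum(len(ids) for ids in folders.values())
    distinct = len({mid for ids in folders.values() for mid in ids})
    return total, distinct


collected = {
    "[Gmail]/All Mail": ["<a@x>", "<b@x>", "<c@x>"],
    "INBOX": ["<a@x>"],
    "Clients": ["<a@x>", "<b@x>"],
}
total, distinct = duplication_summary(collected)
print(f"{total} items collected, {distinct} distinct messages")
```

In this toy example, three messages become six collected items because two of them also sit in other folders.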
Message Counts: I like to know how many messages are in each downloaded folder in Outlook; but by default, Outlook displays the total for unread messages. You can change this by right-clicking on each folder and selecting Properties. In the “General” tab of the Properties dialogue box, change the option from “Show number of unread items” to “Show total number of items.”
Disable the Marking of Previewed Items as Read: By default, Outlook is set to show any message that’s previewed for more than 5 seconds as having been read. Since you will want to preserve, as feasible, the read status that reflects the account holder’s actions and not your own, you should disable this feature before examining the contents of the messages. To do so, go to File>Options>Mail>Outlook Panes and click the Reading panes button. In the Reading Pane dialogue box, uncheck the option for “Mark items as read when viewed in Reading Pane.”
Quality Assurance: If all goes well, the Outlook PST file for the account being collected will swell with Gmail content. You can find the location of the PST file by File>Account Settings, then clicking on the Data Files tab of the Account Settings dialogue box. Now, click Open File Location to pull up the folder. When you’re confident the work is done and the PST faithfully reflects the full contents of the account and folders sought to be preserved, shut down Outlook and make an archival copy of the PST to new media. It’s always a good idea to test a working copy of the PST you’ve created to be sure that it can be read by your e-discovery tool of choice.
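Hashing is not a step I describe in this post, but one common way to confirm that an archival copy is a faithful duplicate of the PST is to compare cryptographic hash values of the two files after Outlook has been shut down. The Python sketch below is a minimal illustration using the standard hashlib module; the function names and any file paths you supply are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks
    so that a multi-gigabyte PST need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copy: Path) -> bool:
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(copy)
```

Recording the hash value alongside the archival copy also gives you a way to show later that the file has not changed since collection.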
Two key quality assurance tasks to undertake before wrapping up collection are checking folder message counts and confirming download status. To check message counts, simply ascertain the total number of items in each folder (see Message Counts above) and check each of these against the number of items in each folder in the online Gmail account. Be very careful when logged into the online Gmail account as you can effect permanent changes to the evidence, and you are logged in as the user, potentially prompting others who can view the user’s status to think the user is online. Tiptoe! In a very dynamic Gmail environment, message counts may change rapidly, and you may have to accept some minor variation in the Inbox based on late arrivals. Otherwise, folder message counts should match exactly.
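If you jot down the per-folder counts from the online account and from Outlook, the comparison itself is easy to script. The Python sketch below is one hedged way to do it; the function name, the sample folder names and the idea of a per-folder tolerance (to allow for late arrivals in the Inbox) are my own illustrations, and the counts are made up.

```python
from typing import Optional


def count_mismatches(online: dict[str, int],
                     offline: dict[str, int],
                     tolerance: Optional[dict[str, int]] = None
                     ) -> dict[str, tuple[int, int]]:
    """Return folders whose online and offline counts differ by more
    than the allowed tolerance (default 0), as {folder: (online, offline)}.

    A folder missing from either side is treated as holding 0 items.
    """
    tolerance = tolerance or {}
    mismatches = {}
    for folder in sorted(set(online) | set(offline)):
        on = online.get(folder, 0)
        off = offline.get(folder, 0)
        if abs(on - off) > tolerance.get(folder, 0):
            mismatches[folder] = (on, off)
    return mismatches


online = {"INBOX": 102, "Sent Mail": 40, "Clients": 7}
offline = {"INBOX": 100, "Sent Mail": 40, "Clients": 6}
print(count_mismatches(online, offline, tolerance={"INBOX": 3}))
```

Here the Inbox variance falls within the allowance for late arrivals, but the Clients folder is one message short and warrants another look.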
A second crucial quality assurance step is to ascertain that all message bodies have been downloaded and that you have no headers lacking message bodies and attachments. One way to accomplish this is to group the contents of your main Outlook pane by availability and to add an IMAP status column (which will show whether or not the item has been marked for downloading). If you identify either a group of items not downloaded or a group marked for download, you likely have an incomplete collection.