A big part of my practice is assisting courts and lawyers in cases where it’s alleged that a departing employee has walked off with proprietary data. There’s quite a lot of that. Studies in the U.S. and abroad suggest that some two-thirds of departing white collar employees leave with proprietary data. So, it seems data theft is the norm.

Of course, not all data leaves with the requisite scienter (“evil intent”) to be called theft. In this wired world, who doesn’t have data on thumb drives, phones, tablets, backup drives, webmail accounts, legacy devices, media cards, CDs, DVDs, floppy disks and good ol’ paper? You work for a company a while and you’re going to end up with their stuff strewn all over your devices and repositories. But, few data theft lawsuits stem from stale data on forgotten media.

The “classic” data theft scenario is the after-hours mass movement of copious quantities of closely-guarded internal documents to an external USB hard drive or capacious thumb drive. While such actions look dastardly at first blush, a few dimmer bulbs may actually act with a pure heart, intending to take only their personal data (like family photos or music), but dragging entire folder families that also hold corporate ESI.

I tend to be skeptical of such claims unless the usage patterns that follow and other forensic evidence bear out the “I really thought it was just my stuff” defense.  It’s not hard to tell the difference, so long as devices aren’t lost or corrupted.

But you may be wondering: How do forensic examiners determine data was taken, and how do they identify and track storage devices used to carry away ESI?

This post is offered as a general introduction to selected aspects of Windows Registry and artifact analysis and peculiarities of Windows MAC dates and times. The goal is to introduce you to same, not equip you to conduct forensic exams or march into court assuming this is all you need to know.  With that fainthearted disclaimer behind us….

Computer Forensics: a Confluence of Happy Accidents

You can roughly divide the evidence in a computer forensic examination between evidence generated or collected by a user (e.g., an Excel spreadsheet or downloaded photo) and evidence created by the system which serves to supply the context required to authenticate and weigh user-generated evidence. User-generated or -collected evidence tends to speak for itself without need of expert interpretation. In contrast, artifacts created by the system require expert interpretation, in part because such artifacts exist to serve purposes having nothing to do with logging a user’s behavior for use as evidence in court. Most forensic artifacts arise as a consequence of a software developer’s effort to supply a better user experience and improve system performance. Their probative value in court is a happy accident.

For example, on Microsoft Windows systems, a forensic examiner may look to machine-generated artifacts called LNK files, prefetch records and Registry keys to determine what files and applications a user accessed and what storage devices a user attached to the system.

LNK files (pronounced “link” and named for their file extension) serve as pointers or “shortcuts” to other files. They are similar to shortcuts users create to conveniently launch files and applications; but, these LNK files aren’t user-created. Instead, Windows routinely creates them to facilitate access to recently used files and stores them in the user’s RECENT folder. Each LNK file contains information about its target file that endures even when the target file is deleted, including times, size, location and an identifier for the target file’s storage medium. I’m sure Microsoft didn’t intend that Windows retain information about deleted files in orphaned shortcuts; but, there’s the happy accident, or maybe not so happy, if you are the one caught in a lie because your computer was trying to better serve you.
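For the curious, here is a minimal sketch in Python of how a tool might pull the target file's times and size out of a LNK file. It assumes the fixed 76-byte header layout described in Microsoft's published Shell Link (.LNK) format documentation. The function names are hypothetical. Real forensic tools go much further, parsing the location and volume details that follow the header.

```python
import struct
from datetime import datetime, timedelta, timezone

# First 56 bytes of the 76-byte header: header size, class ID, link flags,
# file attributes, three FILETIME stamps and the target file size.
# The remaining header fields (icon, hotkey, reserved) are not needed here.
HEADER = struct.Struct("<I16sIIQQQI")
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_utc(value):
    """Convert a FILETIME (100-nanosecond ticks since 1601-01-01 UTC)."""
    if value == 0:
        return None  # zero means no time was recorded
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


def parse_lnk_header(data):
    """Return the target file's recorded times and size from raw LNK bytes."""
    if len(data) < 76:
        raise ValueError("too short to be a shell link")
    size, _clsid, _flags, _attrs, created, accessed, written, target_size = (
        HEADER.unpack_from(data)
    )
    if size != 0x4C:
        raise ValueError("unexpected header size; probably not a LNK file")
    return {
        "created": filetime_to_utc(created),
        "accessed": filetime_to_utc(accessed),
        "modified": filetime_to_utc(written),
        "size": target_size,
    }
```

In use, you would read the bytes of a LNK file taken from a forensic image (never the live evidence) and pass them to parse_lnk_header. The times that come back describe the target file as Windows last saw it, not the shortcut itself.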

Similarly, Windows seeks to improve system performance by tracking the recency and frequency with which applications are run. If the system knows what applications are most likely to be run, it can “fetch” the programming code those applications need in advance and pre-load them into memory, speeding the execution of the program. Thus, records of the last 128 programs run are stored in a series of so-called “prefetch” files. Because the metadata values for these prefetch files coincide with use of the associated program, by another happy accident, forensic examiners may attest to, say, the time and date a file wiping application was used to destroy evidence of data theft.
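As a rough illustration, the Python sketch below lists the prefetch files in a folder along with each file's Last Modified time. It assumes the files have been exported from an image to a working folder and follow the usual PROGRAM.EXE-HASH.pf naming pattern. The function name is hypothetical, and this reads only file system metadata, not the run counts and times stored inside the files.

```python
from datetime import datetime, timezone
from pathlib import Path


def list_prefetch(folder):
    """Return (program, hash, modified time in UTC) rows, oldest first."""
    rows = []
    for path in Path(folder).glob("*.pf"):
        # Names usually look like WIPER.EXE-1A2B3C4D.pf
        program, _, path_hash = path.stem.rpartition("-")
        if not program:
            # No dash in the name: treat the whole stem as the program
            program, path_hash = path.stem, ""
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        rows.append((program, path_hash, modified))
    return sorted(rows, key=lambda row: row[2])
```

A wiping tool that turns up near the end of that list, modified the night before an employee resigned, is the sort of thing that prompts a closer look.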

Two final examples of how much forensically-significant evidence derives from happy accidents are the USBSTOR and DeviceClasses records found in the Windows System Registry hive. The Windows Registry is the central database that stores configuration information for the system and installed applications—it’s essentially everything the operating system needs to “remember” to set itself up and manage hardware and software. The Windows Registry is huge and complex. Each time a user boots a Windows machine, the registry is assembled from a group of files called “hives.” Most hives are stored on the boot drive as discrete files and one—the Hardware hive—is created anew each time the machine inventories the hardware it sees on boot.

When a user connects an external mass storage device like a portable hard drive or flash drive to a USB port, the system must load the proper device drivers to enable the system and device to communicate. To eliminate the need to manually configure drivers, devices have evolved to support so-called Plug and Play capabilities. Thus, when a user connects a USB storage device to a Windows system, Windows interrogates the device, determines what driver to use and—importantly—records information about the device and driver pairing within a series of keys stored in the ENUM/USBSTOR and the DeviceClasses “keys” of the System Registry hive. In this process, Windows tends to store the date and time of both the earliest and latest attachments of the USB storage device.

Windows is not recording the attachment of flash drives and external hard drives to enable forensic examiners to determine when employees attached storage devices to steal data! Presumably, the programmer’s goal was to speed selection of the right drivers the next time the USB devices were attached; but, the happy accident is that the data retained for a non-forensic purpose carries enormous probative value when properly interpreted and validated by a qualified examiner.

The Scene of the “Crime”

Imagine you’re an employee who has grown disenchanted with your employer.  Perhaps the company you helped build has changed management, and you feel marginalized.  Maybe you were RIF’ed or passed over for promotion or your latest bonus was a disappointment.  For whatever reason, your eye is on the door.  Then, opportunity knocks.  Maybe a competitor or former co-worker calls with a job offer, or you decide to launch a competing venture.  Perhaps you’re just a sentimental soul and want to keep a complete record of all the fine-but-under-appreciated work you slaved to produce for your soon-to-be-former employer.

See, most people aren’t data thieves by nature.  A fair amount of self-serving rationalization goes into getting them to the point of persuading themselves it’s not really stealing…not exactly.  It’s more like protecting the fruits of their labor.  Of course, if they didn’t think it was questionable behavior, they could have sought permission, and they sure wouldn’t have come in late at night to make the copies or gone to other lengths to cover their tracks.

Do you get the feeling I’ve investigated a bunch of these cases?

So, my dear data pirate, now that you’ve decided to liberate your data, how will you package it to go?  Will you e-mail selected items to your personal web mail account?  Will you burn a CD?  Or will you copy selected folders to an external USB device like a thumb drive or hard drive?  The last method is the one most often seen.

But before you start moving data, you’ve got to figure out what you want to take with you.  You’ll probably start opening folders and checking the contents of files.   You may begin amassing the files you want to take in a folder or Zip file or perhaps you’ll drag your entire Documents folder to an external drive.  In any event, examiners often see evidence of particularized interest in proprietary files manifested as LNK files in the user’s RECENT subfolder.  These LNK files offer clues to what was attached, what was accessed and when.  Examiners may also see MRU (for Most Recently Used) file records in the Registry for various file types of interest.

Another clue that company data was copied to an external storage device is the timeline of events that can be gleaned from examining the system metadata values of files and folders.  Years ago, such a timeline in a data theft case might have been constructed primarily from Last Accessed dates, reflecting the last time a file was “touched” by the user or operating system.  That’s less feasible today; here’s why:

All computer operating systems employ a set of routines called the File System that serve as the prosaic plumbing handling routine file management tasks.  The Windows family of PC operating systems has long employed a file system called NTFS (for New Technology File System).  NTFS tracks and records several date and time values for each file it manages.  These have customarily been called MAC dates, for Last Modified, Last Accessed and Created dates, but could as easily be termed MACE dates, acknowledging another time value called the Entry Modified date.

MAC Dates are a frequent source of confusion in computer forensics and e-discovery because the meaning accorded Created and Accessed in common parlance is off a bit from their meanings in Windows World.  In NTFS, a document’s Created date could coincide with the date the document was authored by the user, but it could also signify when a template used to create a document was authored by someone else or when the document was copied from another disk or drive (as “Created”  means created on that storage volume).

Copy a file you authored on Monday to an external hard drive on Friday and the file’s Created Date on the external hard drive will likely be Friday (unless you used a copying tool that changes the date back to Monday).  See how flaky this can get if you’re not sure what you’re seeing?
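The Monday-to-Friday example suggests a simple screening test: a file whose Created date is later than its Last Modified date probably arrived on that volume by copying. Here is a minimal Python sketch of the idea, using made-up file names and dates and a hypothetical function name. It is a lead for further inquiry, not proof of anything.

```python
from datetime import datetime


def looks_copied(created, modified):
    """A Created time later than Last Modified hints the file was copied in."""
    return created > modified


# (name, Created, Last Modified) as read from the external drive; all made up
files = [
    # Authored Monday, copied to the drive Friday night
    ("pricing.xlsx", datetime(2013, 3, 8, 22, 41), datetime(2013, 3, 4, 9, 12)),
    # Created on the drive and edited there minutes later
    ("notes.txt", datetime(2013, 3, 8, 22, 50), datetime(2013, 3, 8, 22, 55)),
]

suspects = [name for name, created, modified in files if looks_copied(created, modified)]
print(suspects)  # ['pricing.xlsx']
```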

The flakiest of the MAC dates is the Last Accessed date.  In the past, the Last Accessed date signified when a user last opened a file or the last time an antivirus program examined the file for malware, or even the last time the file was previewed while exploring a list of files in Windows Explorer.  Last Accessed dates were being updated all the time for a host of reasons, and all that updating consumed computing resources, slowing things down.

Slow doesn’t sell computers, so Microsoft hated all that automatic updating.  Accordingly, when Microsoft introduced its Vista operating system about six years ago, it unceremoniously turned off the routine updating of Last Accessed dates.  A user could tinker with the Registry and turn it back on; but if the user doesn’t know updating is turned off, what are the odds a user will edit the Registry to turn it on?  Who but forensic examiners would know or care?

Thus, an unreliable metadata value got even more unreliable in Vista, Windows 7 and Windows 8.  Windows XP updated Last Accessed dates constantly, and Windows 7 updates them sporadically, such as when files are modified.  If an external drive is attached to multiple machines running different versions of Windows, the metadata values will update inconsistently.  In sum, updated Last Accessed dates may reliably confirm that a file was touched by a user or machine, but the absence of updated values can’t reliably rule it out.

Okay, Byte Bandit, you’ve assembled what you want to take and unpacked that brand new Seagate Free Agent GoFlex drive you got at Best Buy for ninety bucks, marveling that one terabyte of data can fit onto something the size of a deck of cards.  You plug the GoFlex into the USB port of your company laptop and wait for Windows to tell you that it sees the drive and ask you what you want to do with it.

In the brief time that the drive spun up, Windows said “Who goes there?” and the drive responded with its name, rank and serial number; that is, with enough information that Windows could locate and load the right driver to allow the laptop to communicate with this new USB mass STORage device.  Armed with this information, Windows dutifully makes a record of the attachment in a Registry key aptly called “USBSTOR” as well as several other Registry keys and in a log called C:\Windows\inf\  Windows also records the date and time of this first attachment in Coordinated Universal Time (UTC), which is to say almost in Greenwich Mean Time.  Next time someone asks, “What time was it in Leicester Square when you first used your SanDisk Cruzer thumb drive?” you’re all set!  Finally, if Windows cannot determine the drive’s serial number, it just makes one up.  Really. You just gotta love Windows!  To be fair, it makes up a serial number for a perfectly valid reason, but it’s made up all the same.
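Because those attachment times are kept in UTC, they have to be converted to the local time where the events occurred before they go on a timeline. The Python sketch below shows why that matters, using a made-up attachment time and a hypothetical helper name. It assumes a fixed offset of six hours behind UTC (U.S. Central Standard Time) and ignores daylight saving time, which a real examination must account for.

```python
from datetime import datetime, timedelta, timezone


def to_local(utc_time, offset_hours, label):
    """Shift a UTC time to a fixed-offset local zone (no daylight saving)."""
    zone = timezone(timedelta(hours=offset_hours), label)
    return utc_time.astimezone(zone)


# Made-up first attachment time as recorded by Windows: a Friday in UTC
first_attached = datetime(2013, 3, 8, 3, 15, tzinfo=timezone.utc)

# The same moment in Central Standard Time is still Thursday night
local = to_local(first_attached, -6, "CST")
print(local.isoformat())
```

An event that looks like Friday in the raw data actually happened after hours on Thursday, which may be the difference between a routine workday and a late-night visit to the office.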

If you’d like to see what the list of connected USB devices looks like on your machine, simply click the Start button on your Windows machine (unless you’re using Windows 8 and can’t find the Start button anymore) and enter “Regedit.”  When the Registry Editor launches, drill down through HKEY_LOCAL_MACHINE to SYSTEM and select any numbered ControlSet, then drill down further through ENUM to USBSTOR.  You should see something that looks like the image below.  (If it looks exactly like this, GET THE HECK OUT OF MY HOUSE!)
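The names listed under USBSTOR follow a pattern that is easy to pick apart. The Python sketch below splits a device entry into its type, vendor, product and revision, and applies a rule of thumb many examiners use: when the second character of the device's instance ID is an ampersand, Windows likely generated the ID because the device didn't report a unique serial number. The sample names and function names are made up for illustration, and the heuristic should be validated against the evidence in any real case.

```python
def parse_usbstor_name(key_name):
    """Split a USBSTOR device entry name into its labeled parts."""
    parts = key_name.split("&")
    info = {"type": parts[0]}
    for part in parts[1:]:
        # Each part looks like Ven_Seagate; split on the first underscore only
        label, _, value = part.partition("_")
        info[label.lower()] = value
    return info


def serial_is_generated(instance_id):
    """Heuristic: '&' as the second character suggests a made-up serial."""
    return len(instance_id) > 1 and instance_id[1] == "&"


# Made-up sample entry, not taken from any real system
device = parse_usbstor_name("Disk&Ven_Seagate&Prod_FreeAgent_GoFlex&Rev_0148")
print(device["ven"], device["prod"], device["rev"])
```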

Your Registry Editor won’t list the attachment times in USBSTOR; however, tools used by forensic examiners parse the various keys to assemble the data generally permitting a reliable determination of the first and last attachment dates and times (along with other data) for specific USB mass storage devices.  [DO NOT edit your Registry unless you know EXACTLY what you’re doing.  Just close the window.]

There you have it, a simplified explanation of some of the methods forensic examiners use to track and trace data theft in Windows systems.  When seeking to preserve ESI in cases of suspected data theft, remember that the lion’s share of the forensically revealing data is not stored on the media used to transfer the data but principally resides on the systems to which the storage medium attached.  Endeavor to secure all such evidence items and have them forensically imaged by a qualified examiner before exploring contents.

Lawyers interested in more information on this topic might enjoy my (free) First Responder’s Guide to Employee Data Theft.  Examiners interested in more about Windows Registry analysis should see Harlan Carvey’s excellent Windows Registry Forensics: Advanced Digital Forensic Analysis of the Windows Registry and download a copy of the also excellent (free) SANS Windows Artifact Analysis poster.