I love solving puzzles. I come by it honestly. My late mother was a nationally ranked New York Times crossword puzzler, and though I lack her prodigious gifts, I start each morning racing on the Times crossword. I mention puzzling to note that the best part of my forensics work is finding the answer to electronic evidence puzzles. This week’s challenge comes from a legal assistant caught between a rock and a hard place, actually between the plaintiff and defense counsel. The defense objected that photos produced in discovery lacked metadata, while the plaintiff insisted the photos he had furnished contained the “missing” metadata. How could they both be right? The mystified legal assistant had simply saved the photos from the transmitting message and sent them on to the other side. She hadn’t removed any metadata. Or had she?
I had to figure out what happened and keep it from happening again.
First, some technical underpinnings:
What do we mean by metadata? Digital photos, particularly those taken with cell phone cameras, hold more information than shows up in the pretty pictures. Stored within the photos is a type of application metadata called EXIF (for Exchangeable Image File Format). EXIF holds camera settings, including the make and model of the camera or phone, time and date information, geolocation coordinates and more. Because it’s application metadata, it’s content stored within the file and moves with the file when copied or transmitted…unless someone or something makes it disappear.
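To make "content stored within the file" concrete, here is a minimal sketch in Python that checks whether a JPEG carries an EXIF segment. It assumes the usual layout, in which EXIF sits in an APP1 segment near the start of the file, tagged with the bytes `Exif` followed by two zero bytes. The function name `has_exif` is hypothetical, and the parser is deliberately simplified: it reports only whether the segment is present and does not decode any of the values inside it.

```python
import struct


def has_exif(jpeg_bytes: bytes) -> bool:
    """Return True if the JPEG header holds an APP1 segment tagged 'Exif'."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # not at a marker; give up rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):  # start of image data, or end of image
            break
        # The two-byte length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False
```

Run against the bytes of a photo straight from a phone and against the same photo saved from an inline e-mail image, a check like this would be expected to return True for the first and False for the second.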
There’s a second sort of metadata called system metadata. It’s context: data about the file that’s stored outside the file, typically in the system’s file table, which serves as a directory of electronically stored information. System metadata includes such things as a file’s name, location, modified and created dates and more. Because it’s stored outside a file, it doesn’t move with the file but must be rounded up when a file is copied or transmitted. Precious little system metadata follows a file when it’s e-mailed, often just the file’s name, size and type (although Apple systems include the file’s last modified and created dates).
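A short Python sketch can illustrate why system metadata has to be "rounded up." It copies one file two ways using the standard library: `shutil.copyfile` copies only the content, while `shutil.copy2` also tries to carry over the last modified time. The helper names and the output file names are hypothetical, and how faithfully timestamps survive can vary by operating system and file system.

```python
import os
import shutil


def modified_time(path):
    """Last modified time as recorded by the file system, not by the file."""
    return os.stat(path).st_mtime


def copy_both_ways(src, folder):
    """Copy src twice: once content-only, once with timestamps carried over."""
    plain = os.path.join(folder, "plain_copy.jpg")
    kept = os.path.join(folder, "kept_copy.jpg")
    shutil.copyfile(src, plain)  # content only; the copy is stamped "now"
    shutil.copy2(src, kept)      # content plus modified time, where supported
    return plain, kept
```

Both copies hold identical bytes, so any EXIF inside them is untouched. Only the dates kept by the file system differ.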
The defense was seeing dates and times for photos that did not line up with the actual dates and times the photos were taken. Too, the camera and geolocation data that should have been in the EXIF segments of the pictures were gone when plaintiffs produced them.
Picture formats and EXIF metadata: The photos produced were taken with an iPhone and stored on a Mac computer. When most of us think of digital photos, we probably think of JPEG images stored as files with the extension .JPG. The JPEG photo format has been around for almost thirty years and has been the most common format for much of that time. JPEG uses what’s termed “lossy compression,” referring to its ability to make image files smaller by jettisoning parts of the image that contribute to resolution and detail. The more tightly you compress a JPEG image (and the more often you do it), the “jaggier” and more distorted the image becomes.
As digital cameras have improved, digital photographs have grown larger in size, eating up storage space. Two-thirds of the data on my iPhone are photographs. Seeking a more efficient way to store images and video, Apple started phasing out JPEG images in 2017. The replacement was a format called High Efficiency Image File Format; as implemented by Apple, photos are stored in High-Efficiency Image Containers with the file extension .HEIC.
The benefit is that, for comparable image quality, HEIC images are roughly half the size of JPEG images, and they hold EXIF data. The downside is that most of the world still expects a picture to be a JPEG and the Windows and Cloud realms need time to catch up. To remain compatible with other devices and operating systems, Apple converts HEIC images to JPEGs for sharing via e-mail.
Now, there’s something to consider! Did Apple strip out the EXIF metadata from the HEIC photos when it converted them to JPEGs? Hold that thought while I lay a little more foundation.
Encoding in Base64: E-mail is one of the earliest Internet tools. It hearkens back to an era when only the most basic alphabets could be transmitted using a venerable character encoding standard called ASCII (pronounced ASK-KEY and short for American Standard Code for Information Interchange). How do you get binary data like photos to transit a system that only understands a 128-character alphabet? Easy! You rewrite the binary data using just 64 ASCII characters, to wit, the 26 uppercase letters of the alphabet, the 26 lowercase letters, the digits zero through nine and two punctuation marks (plus sign + and forward slash /). That’s 64 characters, each representing a unique numeric value that can replace six bits of binary data. So, 24 bits of data can be written using just four Base64 characters. Base64 looks like a dense, unbroken run of letters and digits; the Base64 of a JPEG photo, for example, begins with the characters /9j/.
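The arithmetic is easy to check by hand. The sketch below, in Python, splits three bytes (24 bits) into four six-bit values and looks each one up in the standard Base64 alphabet. The function name `encode_three_bytes` is hypothetical and exists only for illustration. In practice you would call the standard library's `base64.b64encode`, which also handles input whose length isn't a multiple of three.

```python
import base64

# The standard alphabet: position 0 is 'A', position 63 is '/'.
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly three bytes as four Base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles exactly three bytes")
    number = int.from_bytes(chunk, "big")  # one 24-bit number
    # Peel off four 6-bit groups, most significant first.
    return "".join(
        BASE64_ALPHABET[(number >> shift) & 0b111111]
        for shift in (18, 12, 6, 0)
    )


# The first three bytes of a JPEG file are FF D8 FF.
print(encode_three_bytes(b"\xff\xd8\xff"))               # /9j/
print(base64.b64encode(b"\xff\xd8\xff").decode("ascii"))  # /9j/
```

That is why a JPEG's Base64 text inside a saved e-mail message starts with /9j/, which is handy for spotting where a photo begins.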
Looking at the conversion events where metadata might be lost, we have:
- HEIC to JPEG
- JPEG to Base64
- Base64 to JPEG
Coding in and out of Base64 shouldn’t change a thing, but we can’t rule out anything yet.
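One way to test that expectation is to encode some bytes, decode them again and compare hash values before and after. The sketch below is in Python. The function name `survives_base64` is hypothetical, and `base64.encodebytes` is used because it breaks its output into lines, much as an e-mail body does.

```python
import base64
import hashlib


def survives_base64(original: bytes) -> bool:
    """Return True if the bytes are identical after a Base64 round trip."""
    encoded = base64.encodebytes(original)  # Base64 text broken into lines
    decoded = base64.decodebytes(encoded)   # back to binary
    return (hashlib.sha256(original).digest()
            == hashlib.sha256(decoded).digest())
```

If the hashes match, every bit of the file made the trip intact, EXIF included, and the encoding can be crossed off the list of suspects.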
Is that all? Nope!
Photos often change without acquiring a new format. If you’ve attached a photo to an e-mail and were asked whether you want the attachment to be small, medium, large or original size, any choice but the last one effects big changes to content. Perhaps scaling a photo poses a risk that embedded EXIF metadata will be lost?
When the defense sought the missing metadata, the legal assistant went to the plaintiff, who supplied a screenshot showing that the HEIC photos he’d sent went out carrying the full complement of EXIF metadata. I asked the legal assistant for a copy of what she’d produced to the defendant and confirmed the embedded EXIF data was, in fact, gone, gone, gone.
Coming back to “did Apple strip out the EXIF metadata from the HEIC photos when it converted them to JPEGs?” I took an HEIC photo with my iPhone and e-mailed it to my Gmail account as an attachment. The attachment was converted to a JPG but retained its EXIF data when saved to disk. I re-sent it as a downscaled image and all the EXIF remained intact. Finally, I sent it as an inline image and saved the received image to disk. Poof! The metadata vanishes! Now, we’re getting somewhere.
I asked the legal assistant to forward a copy of the e-mail she’d received from the client transmitting the photos. As expected, the photos weren’t in HEIC format but had been converted to JPEGs. Notably, they were inline photos displayed in the body of the e-mail instead of as attachments. When I saved the inline images to disk, the EXIF data was gone.
Undeterred, I saved the forwarded message to disk as an .eml message and opened it in Microsoft Notepad. Scrolling down to check the Base64 encoded content, I copied the Base64 of a single image and converted it to a JPEG photo. Happily, the photo I recovered held its full complement of EXIF data. I could only conclude that saving an inline photo to disk by right clicking and choosing “Save Image as” was the culprit. Had the photos been made attachments instead of inline images, their EXIF data would have remained in the file saved to disk.
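The copy-and-convert step can also be scripted rather than done by hand. This is a minimal sketch using Python's standard `email` package, not the method described above. It assumes the message was saved as an .eml file and that the photos are ordinary MIME image parts. The function name `extract_images` and the fallback file names are hypothetical.

```python
from email import policy
from email.parser import BytesParser


def extract_images(eml_bytes: bytes) -> dict:
    """Return {name: raw image bytes} for every image part in a message."""
    message = BytesParser(policy=policy.default).parsebytes(eml_bytes)
    images = {}
    for index, part in enumerate(message.walk()):
        if part.get_content_maintype() != "image":
            continue
        # Inline images often have no file name, so make one up.
        name = part.get_filename() or (
            f"inline_{index}.{part.get_content_subtype()}"
        )
        # decode=True undoes the Base64 transfer encoding and nothing else.
        images[name] = part.get_payload(decode=True)
    return images
```

Because the bytes are only decoded, never re-rendered, writing each value to disk in binary mode should yield the photo exactly as it was transmitted. Two parts sharing a file name would overwrite one another in this sketch.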
But the revelation was that the EXIF data sought was present in the JPEG images, even if it couldn’t be pulled out by clicking on them as inline images and saving the image to disk. This was true in both Gmail and Outlook.
Now, I have a forensics lab thrumming with workstations and ingenious software, but what’s a legal assistant supposed to do, MacGyver-like, with just the tools at hand? Having solved the puzzle of what went wrong, the bonus puzzle was figuring out how to fix it.
Here’s a simple workaround I came up with that performed splendidly:
1. Create an empty folder on your Windows Desktop called “Inline Images.”
2. In Microsoft Outlook, open the message holding the inline photos you want to extract.
3. From the Outlook message menu bar select File>Save As, then choose Save as Type>HTML (*.htm, *.html) and save the message to your “Inline Images” folder.
4. Open the “Inline Images” folder and locate the subfolder named [subject of the transmitting message]_Files. Open this folder and you’ll find copies of each inline photo. If you find two copies of each, small and large, the small copy is a thumbnail lacking EXIF data but the full-size version will have all EXIF metadata intact. Voila! We go from The Metadata Vanishes to Return of the Metadata.
I’d prefer clients e-mail photos by transmitting them inside a compressed Zip file rather than forwarding them as inline images or attachments. The Zip container better protects the integrity of the evidence and forestalls stripping or alteration of metadata. Plus, a Zip container can be encrypted for superior cybersecurity.
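For anyone who wants to script the packaging, here is a minimal Python sketch that writes photos into a Zip file and records a SHA-256 hash of each original so the recipient can confirm nothing changed in transit. The function names `zip_photos` and `verify_zip` are hypothetical. Note that Python's standard `zipfile` module cannot create encrypted archives, so encryption would need a separate tool, and photos sharing a file name would collide in this sketch.

```python
import hashlib
import os
import zipfile


def zip_photos(photo_paths, zip_path):
    """Zip the photos; return {name: SHA-256 hex digest of the original}."""
    digests = {}
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in photo_paths:
            name = os.path.basename(path)
            with open(path, "rb") as handle:
                digests[name] = hashlib.sha256(handle.read()).hexdigest()
            archive.write(path, arcname=name)  # lossless; bytes are preserved
    return digests


def verify_zip(zip_path, digests):
    """Return True if every archived photo still matches its recorded hash."""
    with zipfile.ZipFile(zip_path) as archive:
        return all(
            hashlib.sha256(archive.read(name)).hexdigest() == digest
            for name, digest in digests.items()
        )
```

Zip compression is lossless, unlike JPEG compression, so the photos come out bit-for-bit identical to what went in.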
Have you run into this before, Dear Reader? Do you know a simpler way to get inline images out of parent messages without corrupting metadata or hiring an expert? If so, please leave a comment.
davidkeithtobin said:
good stuff
Kyle said:
I feel like this part should be in bold “I’d prefer clients e-mail photos by transmitting them inside a compressed Zip file rather than forwarding them as inline images or attachments” because this is the real lesson here, and law firms of all sizes persist in doing it despite repeated requests not to. And not just photos but other types of data too.
On another somewhat related note, I also learned recently that most social media sites automatically strip exif metadata from photos posted there, so beware of needing to preserve originals if necessary. I’m sure you’ve covered this in another blog post somewhere 😉
craigball said:
Yes, the image displayed on the social networking site is stripped of the EXIF but, at least in the case of Facebook, the EXIF remains in the photo when recovered by the user’s download (“takeout”) of the user’s Facebook account. That is, Facebook keeps it and gives it back.
Michael Dew said:
Very informative, thanks, I had never saved an email in html format before, but it seems like a good thing to know!
Michael Dew said:
Craig, thanks for showing the method of extracting photos by saving as html. I tell clients to not send me photos that way, but sometimes they come that way regardless and apart from the metadata issue it is much quicker to move photos to a folder using your method than it is to right click and save each one from the email.
Somewhat related, do you know if “date taken” metadata for photographs gets sent with photographs that are sent by text message? I did some experiments of my own (Android to Android) and it seems the “date taken” is not attached, but I wonder if I am wrong and just not using the correct methods on the recipient phone? Any tips you could provide on this would be much appreciated!
craigball said:
I can’t speak to every messaging app but I did a quick-and-dirty test using the iOS messaging app and the geolocation and camera data remained with the texted image. Upon receipt, in the messaging app, I could see the photo on a map showing where it was taken. I then e-mailed the full-sized image and saved the attachment to disk on a machine running Windows. The geolocation and camera data could still be seen in the image Properties>Details. Accordingly, I tend to think that the data remains, at least in the iOS environment, because it did when I tested it. I can’t speak to Android because I don’t have an Android phone or account at hand to use for testing.
Remember that the data must be in the photo at the start. Not everyone enables geolocation data support for their cell phone photos and some applications–notably Facebook–strip the data. I should add that certain types of images aren’t “photos” for purposes of holding geolocation data. For example, my screenshots don’t contain geolocation data. So, I wouldn’t expect screenshots of text messages to have embedded geocoordinates.
The Photo Investigator said:
Very interesting. I make a photo metadata app for iPhone: The Photo Investigator. So I was reading up about random metadata things and enjoyed your article. Cheers, Danny
craigball said:
Nice app! I need to share it with my law students for use in their geolocation-themed workbook exercise tracking ill-gotten gains around the world.