The DFIR Journal

File Carving: Encrypted Virtual Hard Disks


Introduction

Welcome to a forensic nightmare. An organisation is facing a ransomware incident. They have discovered that virtual hard disks, which contain all key servers for business operations, have been encrypted. Onsite backups are impacted and unable to support recovery. The organisation maintains offsite backups, yet these backups are over a month old. While the business can return to operations once restoration is complete, a significant forensic gap exists.

A significant development in ransomware has been encryption of virtual hard disks. This approach immediately impacts not only the victim organisation but also the investigation and availability of evidence sources. The evidence exists, hovering just out of reach… well, technically encrypted. We’re left with a major question: Is there a way to obtain at least some data out of these encrypted disks?

Opportunity Presents

Ransomware binaries are designed for speed. Rather than encrypting large files in full, they target sections of them, leaving a surprising amount of data accessible. Ransomware developers have to balance the strength of encryption against the time it takes to encrypt. Fully encrypting a large file such as a virtual hard disk is resource- and time-intensive, and the longer the binary runs, the greater the risk that it is detected and stopped before encryption completes. Threat Actors have therefore opted for a faster, partial form of encryption that renders the file unusable without encrypting it completely.

Modern ransomware makes it incredibly difficult to recover or decrypt files without paying for a decryptor. Each binary implements its encryption process in its own way, so building a decryptor requires knowledge of that process and therefore significant reverse engineering. File carving offers a potential recovery path for reconstructing evidence even where the full set of data cannot be recovered. In the upcoming sections of this article, we will explore potential avenues of recovery based on my recent research.

Introduction to File Carving

File carving is a data recovery technique that involves reconstructing files directly from raw data. This becomes particularly interesting when dealing with encrypted files where standard recovery methods may be ineffective.

At its core, file carving relies on the predictable structure of file formats, specifically file signatures, which provide a reliable method for identifying file boundaries within raw data. For example, a JPEG image typically starts with the bytes FF D8 and ends with FF D9. By locating these signatures within raw data we can extract the bytes between them, effectively carving out the original file. I recommend checking out Gary Kessler's File Signatures Table (https://www.garykessler.net/library/file_sigs.html).
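To make the header and footer idea concrete, here is a minimal Python sketch of signature-based carving for JPEGs. It is an illustration only, not how any particular tool works: the function name `carve_jpegs` and the `max_size` cap are my own assumptions, and the header is extended to FF D8 FF (a JPEG's second marker also begins with FF) to cut down on false positives. It also assumes the file is stored contiguously and stops at the first footer it finds, so images with embedded thumbnails may come out truncated.

```python
JPEG_HEADER = bytes.fromhex("FFD8FF")  # start-of-image marker plus first byte of the next marker
JPEG_FOOTER = bytes.fromhex("FFD9")    # end-of-image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024) -> list[tuple[int, bytes]]:
    """Return (offset, candidate_bytes) for every header with a footer within max_size bytes."""
    results = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        # Look for the first footer after this header, bounded by the size cap.
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), pos + max_size)
        if end != -1:
            results.append((pos, data[pos:end + len(JPEG_FOOTER)]))
        pos = data.find(JPEG_HEADER, pos + 1)
    return results
```

Each candidate still needs validating, for example by opening it in an image viewer, because a matching header and footer do not guarantee that the bytes in between are intact.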

File carving is not without its limitations. Successful carving relies on files being stored contiguously on the disk. Additionally, carving based on signatures alone may not always yield complete files, particularly for larger files such as database files. Despite these challenges, file carving remains a valuable technique, especially when it may provide an opportunity to understand an incident where this was previously impossible. It's worth noting that while there are many automated file carving tools available, manual file carving is sometimes necessary. While time-consuming, manual carving allows for a more thorough and targeted approach, which can be helpful when dealing with less common file types or fragmented files.

The Investigation Scenario

To investigate the various potential recovery options, we can use the following scenario to structure our investigation and testing.

Scenario

An organisation is facing a ransomware incident. They have discovered that virtual hard disks, which contain all key servers for business operations, have been encrypted. Onsite backups are impacted and unable to support recovery. The organisation maintains offsite backups, yet these backups are over a month old. While the business can return to operations once restoration is complete, a significant forensic gap exists.

To simulate the scenario, I carried out a few techniques commonly used by Threat Actors on a virtual machine. Once complete, I ran a well-known ransomware binary against the virtual hard disk file. Obviously, this is a crafted scenario to reflect a potential investigation and aid in the creation of this blog post - it is not a perfect representation by any means.

Investigation Objective

The objectives of our investigation and testing include:

  • Identify whether a known PowerShell Script (discovery.ps1) was present on the disk.
  • Identify the contents of a text file known to contain the string “malicious file”.
  • Identify any malicious web searches performed by the Threat Actor.
  • Carve any files of interest.
  • Extract Windows Event Logs to process in tools such as EvtxECmd and Hayabusa.

Converting VHD to DD

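To make the process smoother, we will copy the VHD file to a raw DD image, a format supported by a wide range of tools. Note that dd copies the file byte for byte rather than unpacking the virtual disk container, which is all the signature-based tools used below need. The following command can be used: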

dd if=encrypted_disk.vhd of=encrypted_disk.dd bs=512
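Because dd produces a byte-for-byte copy, the source file and the image should hash identically, and confirming that before analysis helps protect the integrity of the investigation. The following Python sketch is one way to check. The helper names `sha256_of` and `copies_match` are hypothetical and the file names are simply the ones from the command above.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path: str, copy_path: str) -> bool:
    """True when both files have the same SHA-256 digest."""
    return sha256_of(source_path) == sha256_of(copy_path)


# Example usage (paths taken from the dd command above):
# print(copies_match("encrypted_disk.vhd", "encrypted_disk.dd"))
```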

Understanding the Recovery Opportunity

Before we dive into the various recovery methods, we need to establish whether carving data from an encrypted virtual hard disk is even possible. We can do this by opening the image in a hex editor (my current preference is ImHex - https://github.com/WerWolv/ImHex) and searching for common file signatures to determine the extent of the impact.

By searching for FF D8 (the JPEG signature), we can see that sections of data are still intact. Further, we can confirm the presence of Windows Event Log files by searching for 45 6C 66 46 69 6C 65 00 (the ASCII string "ElfFile" followed by a null byte). As shown below, this confirms our initial suspicion that recovery may be possible despite the disk's encrypted state. Additionally, comparing the encrypted disk with the original disk file shows that the disk has definitely been changed, with most of the changes appearing at the beginning of the file.

Figure: Hex editor search for the EVTX signature
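The same check can be scripted to get a rough count of how many signatures survive across the whole image. The Python sketch below reads the image in chunks and keeps a small tail from each chunk so that a signature straddling a chunk boundary is still counted. The function name, the chunk size and the choice of FF D8 FF for JPEG are my own assumptions, and a hit count is only an indication, since short signatures also occur by chance in random-looking data.

```python
from typing import BinaryIO

SIGNATURES = {
    "jpeg": bytes.fromhex("FFD8FF"),
    "evtx": bytes.fromhex("456C6646696C6500"),  # "ElfFile" followed by a null byte
}


def count_signatures(stream: BinaryIO,
                     signatures: dict[str, bytes] = SIGNATURES,
                     chunk_size: int = 4 * 1024 * 1024) -> dict[str, int]:
    """Count signature occurrences in a stream without loading it all into memory."""
    counts = dict.fromkeys(signatures, 0)
    tails = dict.fromkeys(signatures, b"")
    while chunk := stream.read(chunk_size):
        for name, sig in signatures.items():
            window = tails[name] + chunk
            counts[name] += window.count(sig)
            # Keep len(sig) - 1 bytes: enough to catch a straddling match,
            # too short to contain a complete match that would be counted twice.
            tails[name] = window[-(len(sig) - 1):] if len(sig) > 1 else b""
    return counts


# Example usage:
# with open("encrypted_disk.dd", "rb") as image:
#     print(count_signatures(image))
```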

Considerations and Restrictions

There are a couple of things to keep in mind; the key point is that success varies and is not guaranteed. As a general rule, the larger the file, the greater the likelihood of successful carving. This is because ransomware binaries tend to encrypt portions of a file rather than the whole of it, especially for large files such as virtual hard disks. However, file size alone does not guarantee successful recovery. The type of encryption used by the ransomware can also have a significant impact. Some encryption algorithms operate in a mode that preserves the structure of the original file, while others may use techniques like padding or obfuscation that make carving more difficult. The success of file carving often depends on the specific ransomware strain and the encryption methods it uses.
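One way to get a feel for how much of an image a given strain has touched is to measure the byte entropy of fixed-size blocks: encrypted regions look close to random (near 8 bits per byte), while most intact file system data scores lower. The Python sketch below illustrates the idea. The 4096-byte block size and the function names are assumptions on my part, and compressed data (ZIP archives, JPEGs and so on) also scores high, so a high-entropy block is a hint, not proof, of encryption.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum((n / total) * math.log2(n / total) for n in Counter(block).values())


def entropy_map(data: bytes, block_size: int = 4096) -> list[tuple[int, float]]:
    """Return (offset, entropy) for each consecutive block of the input."""
    return [
        (offset, shannon_entropy(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]
```

Plotting or summarising the result by offset shows where the high-entropy regions sit, which can help prioritise the areas of the image worth carving first.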

It’s also worth keeping in mind that file carving is a time and resource-intensive process, particularly when dealing with large disk images. Depending on the size of the disk and the number of files to be carved, the process could take hours or even days. This is where having a clear investigative strategy and prioritising the most critical evidence items becomes essential.

File Recovery Methods

The walkthrough below highlights why several tools are needed, as each shines in a different area. Throughout this article we will focus on freely available tools, noting that commercial tools such as X-Ways and Magnet are also available. A successful investigation requires multiple approaches to maximise recovery rates, together with validation of the recovered data to ensure the integrity of the investigation.

Grep

Grep is very handy, especially when the keyword you are searching for is known. With Grep, you can also include a set number of lines before and after each match; the example below uses five. During the earlier stages of the investigation, we identified a PowerShell script used by the Threat Actor named “Discovery.ps1”. Using Grep we can search for “Discovery.ps1” to determine whether it was present on the disk and therefore potentially used by the Threat Actor. As shown below, a match is produced with the script located under the Administrator's Desktop folder.

Example Grep Command:

grep -a -C 5 "Discovery.ps1" encrypted_disk.dd

Figure: Grep search result for Discovery.ps1
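One caveat with a plain grep is that it only matches the keyword as single-byte text, whereas Windows stores many strings, including NTFS file names, as UTF-16LE. The Python sketch below searches for both encodings and reports the byte offset of each hit, which is handy for jumping to the location in a hex editor. It is case-insensitive (similar to grep -i), assumes an ASCII keyword, and loads the data into memory, so it suits extracted regions better than a full image. The function name `find_keyword` is hypothetical.

```python
def find_keyword(data: bytes, keyword: str) -> list[tuple[int, str]]:
    """Return sorted (offset, encoding) hits for an ASCII keyword, ignoring case."""
    haystack = data.lower()  # bytes.lower() only changes ASCII letters
    hits = []
    for encoding in ("ascii", "utf-16-le"):
        needle = keyword.lower().encode(encoding)
        pos = haystack.find(needle)
        while pos != -1:
            hits.append((pos, encoding))
            pos = haystack.find(needle, pos + 1)
    return sorted(hits)
```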

Hex Editor

A hex editor provides the ability to view the raw data. Using a hex editor we can search for specific strings to confirm their presence and the potential for extraction. In this example, we search for the contents of a text file known to contain the string “malicious file”. As shown below, a match is produced showing the contents and, just above them, the name of the text file, “Malicious File.txt”.

Figure: Hex editor text content search

Bulk Extractor

Bulk Extractor takes a different approach, focusing on data patterns to find traces of files and artifacts. This is beneficial when searching for information such as email addresses or URLs. Bulk Extractor includes a GUI tool known as Bulk Extractor Viewer (BEViewer), which is useful for visualising the data. In the example below, we were able to identify a web search for “mimikatz” that was potentially performed by the Threat Actor. This in turn suggests that the “mimikatz.exe” executable may have been downloaded by the Threat Actor.

Example Bulk Extractor Command:

bulk_extractor -o bulkextractor_output encrypted_disk.dd

Figure: BEViewer showing the mimikatz search
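If you prefer to work outside BEViewer, the feature files in the output directory can be filtered with a short script. The Python sketch below assumes the usual Bulk Extractor feature file layout of tab-separated offset, feature and context columns with comment lines starting with "#", and assumes a URL feature file such as url.txt exists in the output directory; check both against your own output before relying on it. The function name `search_features` is hypothetical.

```python
def search_features(path: str, keyword: str) -> list[tuple[str, str]]:
    """Return (offset, feature) pairs whose feature column contains the keyword."""
    matches = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header comments and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue
            offset, feature = fields[0], fields[1]
            if keyword.lower() in feature.lower():
                matches.append((offset, feature))
    return matches


# Example usage (file name is an assumption about the output directory):
# for offset, url in search_features("bulkextractor_output/url.txt", "mimikatz"):
#     print(offset, url)
```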

PhotoRec

PhotoRec carves files out of disk images based on known file signatures and is excellent for recovering common file formats. Additionally, PhotoRec can be limited to extracting only the file types you select. This is where the conversion to dd format comes in: within the media drop-down, select the option to add a raw disk image. Within the file formats section, we can then select the file formats we are after.

The recovered files do not retain their original file names, so the common approach to identifying files of interest is to hash the recovered files or run a string search across the top-level output directory. In the example below we can see PhotoRec has been able to recover 4,830 exe files. Within the recovered exe files was the original mimikatz executable file.

Figure: PhotoRec

Figure: PhotoRec results
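The hashing approach can be scripted along the lines of the Python sketch below, which walks the output directory and reports any file whose SHA-256 appears in a set of known hashes. The function name and the idea of supplying your own hash set (for example from threat intelligence) are assumptions for illustration. Keep in mind that a carved file can be truncated or carry trailing bytes, in which case its hash will not match and a string search is the better fallback.

```python
import hashlib
from pathlib import Path


def find_known_files(root: str, known_sha256: set[str]) -> list[tuple[str, str]]:
    """Return (path, sha256) for every file under root whose hash is in known_sha256."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large recovered files are not loaded whole.
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() in known_sha256:
            matches.append((str(path), digest.hexdigest()))
    return matches
```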

Scalpel

Scalpel is a command line file carving tool which uses a configuration file to search for header and footer signatures and extract the matching files. Scalpel is based on another well-known command line file carving tool, Foremost. Gary Kessler's site also provides a Scalpel configuration file containing the signatures from the File Signatures Table, which you can use as a starting point to build a configuration that extracts the file types you want. Targeting PDF and EVTX files with Scalpel against our scenario's encrypted disk, we were able to obtain a PDF document which had been placed on the Administrator's desktop and a subset of the EVTX files.

Example Scalpel Command:

scalpel -c ./scalpel.conf -o ./scalpel_output ./encrypted_disk.dd

Scalpel GitHub Repository: https://github.com/sleuthkit/scalpel

Gary Kessler's File Signatures: https://www.garykessler.net/software/index.html#filesigs

EVTXtract

Windows Event Logs are among the most critical evidence items you are likely to be after during an investigation. EVTXtract reconstructs EVTX records from corrupt event logs or even, as in our case, from disk images. EVTXtract uses the known EVTX structure to extract records matching the patterns and provides the output in XML format. This is particularly useful when looking for a specific record or event log. When run against our encrypted disk, it successfully extracts a subset of Event Logs. Unfortunately, the extract does not contain the full set of Event Logs; however, it may contain just the events you are after. This is especially impressive for an older tool which, I imagine, was not originally intended for this purpose.

GitHub Repository: https://github.com/williballenthin/EVTXtract

evtxtract.exe C:\location\encrypted_disk.dd > C:\location\evtx.xml

Figure: EVTXtract results

Custom EVTXparser

One of the limitations of EVTXtract is that its output is in XML format. The downside is that a lot of tooling, such as EvtxECmd and Hayabusa, expects input in EVTX format. With the introduction of AI models such as GPT and Claude, we can quickly try out custom scripts for file carving. It goes without saying that this code should be validated and tested. With a few prompts, I was able to get a working script that would extract a subset of EVTX files. Whilst most were corrupt and more than likely incomplete, it does show that this is possible. Hayabusa and EvtxECmd appear to work with corrupt EVTX files, and we were able to quickly parse the records into a format we are used to.
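For illustration, here is a minimal Python sketch of what such a carver can look like. It is not the script used for this post. It relies on two properties of the EVTX format: a file starts with a 4096-byte header beginning "ElfFile" plus a null byte, followed by 64 KiB chunks that each begin "ElfChnk" plus a null byte. The sketch collects the header and every directly following chunk, so it assumes the log was stored contiguously, and it makes no attempt to repair the chunk count or checksums in the header, which is why downstream tools may report the output as corrupt. The function name `carve_evtx` is hypothetical.

```python
EVTX_FILE_MAGIC = b"ElfFile\x00"
EVTX_CHUNK_MAGIC = b"ElfChnk\x00"
HEADER_SIZE = 4096    # size of the EVTX file header block
CHUNK_SIZE = 65536    # size of each EVTX chunk


def carve_evtx(data: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, candidate_bytes) for each file header followed by at least one chunk."""
    carved = []
    pos = data.find(EVTX_FILE_MAGIC)
    while pos != -1:
        end = pos + HEADER_SIZE
        # Extend the candidate while complete chunks follow back to back.
        while (data[end:end + len(EVTX_CHUNK_MAGIC)] == EVTX_CHUNK_MAGIC
               and end + CHUNK_SIZE <= len(data)):
            end += CHUNK_SIZE
        if end > pos + HEADER_SIZE:
            carved.append((pos, data[pos:end]))
        pos = data.find(EVTX_FILE_MAGIC, pos + 1)
    return carved
```

Each candidate can be written out with an .evtx extension and passed to EvtxECmd or Hayabusa, treating any parsed records as leads to validate against other evidence.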

Final Thoughts

This high-level analysis confirms that data recovery from encrypted virtual hard disks is feasible in some capacity. This is heavily influenced by the ransomware encryption process favouring rapid file corruption/encryption over full encryption. This is particularly exciting because, during an investigation, it may provide evidence of a Threat Actor's activities from a disk which is in an encrypted state. Note that this post has only covered a few of the many ways available for performing file carving and analysis.

As ransomware continues to develop, our investigation processes and methods will, as always, need to adapt. I am hopeful that further research into forensic pathways for encrypted virtual hard disks will occur. A reminder that each case is unique and recovery success rates vary based on multiple factors. Always orient to an investigation strategy and objectives throughout the whole process.

The investigative process is not a one-pass process; rather, it should be iterative, involving pivoting as required based on findings.

My Thoughts from Investigating Windows Systems.

Additional Resources