Friday, February 6, 2015

File History – Deconstructing the Catalog part 3



In the previous two installments on Deconstructing the Catalog I discussed the importance of the artifacts left by FileHistory, as well as the schema and table descriptions of the Catalog#.edb. In this section I will cover how to put all this together to identify which files persisted on a system that was backed up, and how long they were on the system until they were removed from the backup location.


Opening the Table

To view the Catalog#.edb an analyst needs a utility that understands the edb format. The two tools I have had consistent success with are the EseDbViewer from Woanware and the ESEDatabaseView from NirSoft. Both tools have their strengths and weaknesses. The EseDbViewer has some compatibility issues when installed on Windows 8 and Windows 10, but works perfectly from a thumbdrive; it is also the only one of the two that allows exporting all the tables at once. The ESEDatabaseView works consistently from both a local installation and a thumbdrive, but the tables must be exported individually. When using these tools I prefer the EseDbViewer layout over the ESEDatabaseView.

When looking at the Catalog#.edb database in the EseDbViewer, the left frame lists each table found in the database, along with the number of entries in that table in parentheses. In the image above my string table has 91 entries. To identify how many columns a table has, that table needs to be opened in the view panel.

When looking at the Catalog#.edb database in the ESEDatabaseView, the drop-down shows the currently selected table, the table ID and the number of columns in the table. To find out the number of entries in the table, that table has to be opened; the count is displayed in the lower left corner. In the image above my string table has 91 entries.

Each record entry in a table is a unique folder path or file name. Some files, such as desktop.ini, share the same string entry for the name but will have a different file path associated with them.

For analysis an examiner usually will focus on the following tables:

Table Name: Analysis Purpose
string: Plain text value of the file path or name associated with the ID.
namespace: Shows what was backed up. Contains the MAC information of the file, when it was present during the backup process and the USN Journal entry for that file.
backupset: Each ID corresponds to a time when a backup ran or was scheduled to run, along with the timestamp of when this happened.
file: Contains the IDs of where the file is located and of the filehistory backup process where the file was present; these values are both found in the backupset table. This table also contains the file size in bytes.

The namespace table includes the ChildID and ParentID, which resolve to the user-friendly file path and file name when referenced against the string table, and to the file size when referenced against the file table. The fileCreated and fileModified times are based on the original file's MAC times, not on when it was backed up. The tCreated, tModified and tVisible values are referenced against the backupset table to find out when the file was first and last seen during the backup process and when it was modified. For example, an analyst could see a tCreated value of 1, a tModified value of 3 and a tVisible value of 6. Those tValues would indicate that the file was first backed up in the initial filehistory backup process, last modified between the 2nd and 3rd backup processes, and removed from the backup between the 5th and 6th backup processes.

A new entry is created in the backupset table each time a filehistory backup is scheduled to run, regardless of whether anything was backed up or whether the backup location was present. If the default schedule of every hour is in use, there will be an entry for every hour. This allows the tCreated, tModified and tVisible values in the namespace table to be cross-referenced to the time of a specific backup run.
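
The cross-reference between the tValues and the backupset table can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout and the hourly timestamps are my own assumptions, not something exported from a real Catalog#.edb.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical backupset table: one entry per scheduled run, keyed by ID.
# The timestamps are invented for illustration (hourly schedule).
start = datetime(2015, 1, 20, 18, 0, tzinfo=timezone.utc)
backupset = {i: start + timedelta(hours=i - 1) for i in range(1, 7)}


def describe_lifecycle(t_created, t_modified, t_visible, backupset):
    """Translate namespace tValues into backup-run timestamps.

    A tValue with no matching backupset entry is returned as None.
    """
    return {
        "first_backed_up": backupset.get(t_created),
        "last_modified_by": backupset.get(t_modified),
        "gone_by": backupset.get(t_visible),
    }


# The tValues from the example above: 1, 3 and 6.
lifecycle = describe_lifecycle(1, 3, 6, backupset)
for event, when in lifecycle.items():
    print(event, when.isoformat() if when else "no matching backupset entry")
```

Each tValue is simply a key into the backupset table, so the lookup is a dictionary access; the interpretation of what each value means is the one given in the paragraphs above.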

To understand the values in the file table, an analyst needs to jump between the related tables. For example, in the first entry the ChildID is 20 and the ParentID is 19. Cross-referencing the string table shows that the ParentID refers to the file path C:\Users\Kenneth\Favorites\Links, while the ChildID represents the desktop.ini found in that parent directory. Looking back at the file table, an analyst can then see that every ChildID of 20 represents a desktop.ini file.
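
As a rough sketch of that lookup, assuming the string table has been reduced to an ID-to-value mapping (the variable names and the row layout are hypothetical; only IDs 19 and 20 come from the example above):

```python
# Hypothetical extract of the string table: ID -> plain-text value.
string_table = {
    19: r"C:\Users\Kenneth\Favorites\Links",
    20: "desktop.ini",
}

# Hypothetical rows from the file table (only the two ID columns).
file_rows = [
    {"ParentID": 19, "ChildID": 20},
]


def resolve(row, strings):
    """Return (folder, name) for a row, with placeholders for unknown IDs."""
    folder = strings.get(row["ParentID"], "<unknown ParentID>")
    name = strings.get(row["ChildID"], "<unknown ChildID>")
    return folder, name


for row in file_rows:
    folder, name = resolve(row, string_table)
    print(folder + "\\" + name)
```

Because the name string is shared, any row with a ChildID of 20 resolves to desktop.ini, and only the ParentID tells you which directory it came from.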

The major drawback of using the EseDbViewer and ESEDatabaseView is that when switching among tables the initial view will always be the top of the table. When working with a massive dataset it is very easy to lose track of where you left off, which makes the analysis time consuming.

Using Filehistory_Deconstructor

The Filehistory_Deconstructor tool is designed for one purpose. It's not necessarily a pretty solution, but it gives the analyst a way to make the analysis of the catalog#.edb more efficient. This tool would not have been possible without a little help from @theC0dem0nkey. The FileHistory_Deconstructor.xlsx can be found here.

This tool is built using Microsoft Excel with plenty of macros and behind-the-scenes code. So without further delay, let me show you how you can do this analysis in a more efficient manner.
  1. Open Catalog#.edb in your favorite tool that allows you to extract each table.
    1. Extract the tables.
  2. Open Filehistory_Deconstructor.xlsx.
    1. Make sure to Enable Content.
  3. Open each extracted table csv and copy its contents into the appropriate tab.
    1. A future enhancement will do the import for you.
  4. Once all tabs are populated, return to the “Instruction” tab and click the “Analyse FileHistory Data” button.
  5. Let the macros run their magic and then navigate to the “Results” tab.
  6. You will now be able to quickly identify key values of the catalog#.edb.
    1. The “Results” tab will automatically hide system files such as desktop.ini.
  7. You can now save the file for your investigation.

The “CleanUp” button will clear all values that were originally added for analysis so you have a clean FileHistory_Deconstructor.xlsx for your next analysis.

This is how your Tabs should look if successfully imported:
File Tab

Namespace Tab

String Tab

And finally you can quickly gather all this information in the “Results” tab.

We can now easily see that the file FR12-023.pdf was backed up from the ?UP\Documents directory. If you remember from previous posts, the ?UP stands for the userprofile of the user that this Filehistory is related to. The USN Journal entry for this file is 30852872. Based on the file MAC times, this file was created on September 10, 2014 at 14:13:58 UTC, with a last modified time of September 10, 2014 at 14:13:58 UTC. This file was first seen in the backup process that ran on January 20, 2015 at 18:13:27 UTC, and the #N/A in the Last FileHistory Instance Seen Time indicates that the file is still present in that location.
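
For anyone who would rather script this than use Excel, the same kind of cross-referencing can be approximated in Python. To be clear, this is my own rough sketch and not how the workbook's macros are written: the CSV headers, the sample rows, the file name report.pdf and the function names are all hypothetical, and a real export will have more columns and possibly different names.

```python
import csv
import io

# Stand-ins for exported CSV files; headers and values are hypothetical.
STRING_CSV = "id,string\n1,?UP\\Documents\n2,report.pdf\n3,desktop.ini\n"
NAMESPACE_CSV = (
    "parentId,childId,tCreated,tVisible\n"
    "1,2,1,\n"
    "1,3,1,2\n"
)
BACKUPSET_CSV = "id,timestamp\n1,2015-01-20 18:00:00\n2,2015-01-20 19:00:00\n"


def load(text, key, value):
    """Read CSV text into a {key column: value column} lookup."""
    return {row[key]: row[value] for row in csv.DictReader(io.StringIO(text))}


def build_results(strings, runs, namespace_text, hidden=("desktop.ini",)):
    """Join namespace rows to names and run times, skipping hidden names."""
    results = []
    for row in csv.DictReader(io.StringIO(namespace_text)):
        name = strings.get(row["childId"], "")
        if name in hidden:
            continue  # mirror the Results tab hiding files like desktop.ini
        results.append({
            "folder": strings.get(row["parentId"], ""),
            "name": name,
            "first_seen": runs.get(row["tCreated"], "#N/A"),
            # No matching backupset entry is shown as #N/A.
            "last_seen": runs.get(row["tVisible"], "#N/A"),
        })
    return results


strings = load(STRING_CSV, "id", "string")
runs = load(BACKUPSET_CSV, "id", "timestamp")
results = build_results(strings, runs, NAMESPACE_CSV)
for r in results:
    print(r)
```

To run it against real exports, swap the io.StringIO wrappers for open() calls on the exported csv files and adjust the column names to match what your export tool produces.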


Understanding the Filehistory catalog#.edb artifacts allows an analyst to pinpoint when a file was present in a backup location, provided filehistory was turned on. A value in the Last Instance Seen column does not necessarily mean that the file was deleted; it could have been moved to a location that is not currently in the filehistory library to back up.

Please let me know if you have any questions about these artifacts, if you are aware of any bugs, or if you have a feature request.

Thank you. 
