Friday, February 6, 2015

File History – Deconstructing the Catalog part 3




Overview

In the previous two installments of Deconstructing the Catalog, I discussed the importance of the artifacts left by File History, as well as the schema and table descriptions of the Catalog#.edb. In this section I will cover how to put all of this together to identify which files persisted on a system that was backed up, and how long they were on the system before they were removed from the backup location.

 

Opening the Table


To view the Catalog#.edb, an analyst needs a utility that understands the EDB format. The two tools I have had consistent success with are EseDbViewer from Woanware and ESEDatabaseView from NirSoft. Both tools have their strengths and weaknesses. EseDbViewer has some compatibility issues when installed on Windows 8 and Windows 10 but works perfectly from a thumb drive, and it is the only one of the two that allows exporting all the tables at once. ESEDatabaseView works consistently both from a local installation and from a thumb drive, but the tables must be exported individually. When using these tools, I prefer the EseDbViewer layout over that of ESEDatabaseView.



When looking at the Catalog#.edb database in EseDbViewer, the left frame shows each table found in the database, with the number of entries in that table in parentheses. In the image above, my string table has 91 entries. To identify how many columns a table has, that table needs to be opened in the view panel.



When looking at the Catalog#.edb database in the EseDatabaseView the drop down shows the current table selected, the table ID and the number of columns in the table.  In order to find out the number of entries in the table, that table has to be opened and the results are displayed in the lower left corner.  In the image above my string table has 91 entries.
Each record in the table is a unique folder path or file name. Some files, such as desktop.ini, share the same string entry for the name but have a different file path associated with each instance.

For analysis an examiner usually will focus on the following tables:

Table Name
Analysis Purpose
String
Plain text value of the filepath or name associated with the ID.
Namespace
Shows what was backed up. Contains the file's MAC information, when the file was present during the backup process, and the USN Journal entry for that file.
Backupset
Each ID corresponds to a File History backup that ran or was scheduled to run, along with the timestamp of when this happened.
File
Contains the IDs for where the file is located and for the File History backup process in which the file was present; these values are both found in the backupset table. This table also contains the file size in bytes.




The namespace table includes the ChildID and ParentID, which give the user-friendly name of the file path and file when referenced against the string table, and the file size when referenced against the file table. The fileCreated and fileModified times are based on the original file MAC times, not on when the file was backed up. The tCreated, tModified and tVisible values are referenced against the backupset table to find out when the file was first and last seen during the backup process and when it was modified. For example, an analyst could see a tCreated value of 1, a tModified value of 3 and a tVisible value of 6. Those tValues would indicate that the file was first backed up in the initial File History backup process, last modified between the 2nd and 3rd backup processes, and removed from the backup between the 5th and 6th backup processes.
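The lookup of tValues against the backupset table can be sketched in Python. This is only an illustration and not part of any tool discussed here: the session timestamps are made-up sample data (one session per hour), and `describe_lifetime` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Hypothetical backupset rows: session ID -> time the session was scheduled.
# Sample data only: one session per hour, as with the default schedule.
backupset = {
    session_id: datetime(2015, 1, 20, 12 + session_id, 0, 0, tzinfo=timezone.utc)
    for session_id in range(1, 7)
}


def describe_lifetime(t_created, t_modified, t_visible, sessions):
    """Translate namespace tValues into backup session timestamps.

    A tValue with no matching backupset row resolves to None.
    """
    return {
        "first_backed_up": sessions.get(t_created),
        "last_modified_by": sessions.get(t_modified),
        "gone_by": sessions.get(t_visible),
    }


# The tValues from the example: tCreated 1, tModified 3, tVisible 6.
lifetime = describe_lifetime(1, 3, 6, backupset)
for label, when in lifetime.items():
    print(label, when.isoformat() if when else "n/a")
```

Because each tValue only names a session, the result brackets an event between two sessions rather than giving an exact time.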



In the backupset table, a new entry is created each time a File History backup is scheduled to run, regardless of whether anything was backed up or whether the backup location was present. With the default backup schedule of every hour, there will be an entry for every hour. This allows the tCreated, tModified and tVisible values in the namespace table to be cross-referenced.



To understand the values in the file table, an analyst needs to jump between the related tables. For example, in the first entry the ChildID is 20 and the ParentID is 19. Cross-referencing these with the String table shows that the ParentID references the file path C:\Users\Kenneth\Favorites\Links, while the ChildID represents the desktop.ini found in that parent directory. Looking back at the file table, an analyst can determine that every ChildID of 20 represents a desktop.ini file.
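The same cross-reference can be expressed as a small Python sketch. The two string entries come from the example above; the dictionary layout and the `resolve_path` helper are assumptions made for illustration, not the format produced by either viewer.

```python
# Slice of the string table from the example: ID -> string value.
strings = {
    19: r"C:\Users\Kenneth\Favorites\Links",
    20: "desktop.ini",
}

# One file-table row, keeping only the two columns used here.
file_row = {"ParentID": 19, "ChildID": 20}


def resolve_path(row, string_table):
    """Join the parent directory and child name referenced by a row."""
    parent = string_table[row["ParentID"]]
    child = string_table[row["ChildID"]]
    return parent.rstrip("\\") + "\\" + child


full_path = resolve_path(file_row, strings)
print(full_path)
```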

The major drawback of using EseDbViewer and ESEDatabaseView is that when switching among tables, the initial view is always the top of the table. When working with a massive dataset, it is very easy to lose track of where you left off, which can be time consuming for an analyst.

Using Filehistory_Deconstructor


The Filehistory_Deconstructor tool is designed for one purpose. It is not necessarily a pretty solution, but it gives the analyst a way to make the analysis of the catalog#.edb more efficient. This tool would not have been possible without a little help from @theC0dem0nkey. The FileHistory_Deconstructor.xlsx can be found here.

This tool is built using Microsoft Excel with plenty of macros and behind the scene coding. So without further delay, let me show you how you can do this analysis in a more efficient manner.
  1. Open Catalog#.edb in your favorite tool that allows you to extract each table.
    1. Extract the tables.
  2. Open Filehistory_Deconstructor.xlsx.
    1. Make sure to Enable Content.
  3. Open each extracted table CSV and copy its contents into the appropriate tab.
    1. A future enhancement will import these for you.
  4. Once all tabs are populated, return to the “Instruction” tab and click the “Analyse FileHistory Data” button.
  5. Let the macros run their magic and then navigate to the “Results” tab.
  6. You will now be able to quickly identify key values of the catalog#.edb.
    1. The “Results” tab will automatically hide system files such as desktop.ini.
  7. You can now save the file for your investigation.

   The “CleanUp” button will clear all values that were originally added for analysis so you have a clean FileHistory_Deconstructor.xlsx for your next analysis.

This is how your Tabs should look if successfully imported:
File Tab

Namespace Tab


String Tab


And finally you can quickly gather all this information in the “Results” tab.



We can now easily see that the file FR12-023.pdf was backed up from the ?UP\Documents directory. If you remember from previous posts, ?UP stands for the user profile of the user that this File History is related to. The USN Journal entry for this file is 30852872. Based on the file MAC times, this file was created on September 10, 2014 at 14:13:58 UTC, with a last modified time of September 10, 2014 at 14:13:58 UTC. This file was first seen in the backup process that ran on January 20, 2015 at 18:13:27 UTC, and the #N/A in the Last FileHistory Instance Seen Time indicates that the file is still present in that location.

Conclusion

Understanding the File History catalog#.edb artifacts allows an analyst to pinpoint when a file was present in a backup location, provided File History was turned on. A value in the Last Instance Seen field does not necessarily indicate that the file was deleted; it could have been moved to a location that is not currently in a library that File History backs up.

Please let me know if you have any questions about these artifacts, if you are aware of any bugs, or if you have a feature request.


Thank you. 

Monday, December 15, 2014

File History, Research - Part 2 - Deconstructing the Catalog.

Welcome, reader, to part two of the exciting forensic research based on File History. Part 1 can be found here. In Part 1 I touched on the Config#.xml files, which give the analyst the ability to identify backup locations, libraries that were backed up, frequency of backup, and other artifacts that help build out the story.

In this section of the research I will break down the catalog#.edb database. This section will look at each table within the database, the columns that are used, and the purpose of each column. It will also go into some detail on the relationships between the tables and on which tables provide the most value to the analyst.

1.1.         Catalog#.edb Analysis

The catalog#.edb databases are created when File History is initially turned on. Two databases are created, called catalog1.edb and catalog2.edb. They contain multiple tables that store information on when files are backed up or removed from the host system, file size, MAC attributes per file, and other artifacts of forensic importance.

1.1.1.      Individual Table Analysis

Each catalog#.edb database contains seven tables. Each table serves an important function within the File History process. These tables, with a short description of each, are as follows:

Table Name
Table Description
MySysObjects
This table is the Schema for the catalog#.edb database.
Backupset
This table tracks the backup session number and a timestamp on when the backup process occurred.
File
This table tracks information related to individual files that are backed up. This table will show the last known file path on the host system, the file size in bytes and when it was first backed up and updated in the file history process.
Global
This table holds the global configuration settings for the file history process. This includes information on the last successful backup, target directory size, and other values.
Library
This table tracks information related to individual libraries that are backed up. This table includes which file history session the Library was first and last seen in the backup process.
Namespace
This table tracks information related to the file and library backup location. The namespace table will include reference to the file name, the file path, the file attribute metadata, the file MAC times for created and updated, and the USN Journal value for the file.
Strings
The strings table tracks the individual file paths and file names for the backup process. Strings starting with ?UP are for the user profile associated with the File History process, while strings starting with ?PP are associated with the public profile.

1.1.1.1.            MySysObjects

The MySysObjects table contains the schema for all the tables in the Catalog#.edb. While the schema information has some value, this table holds nothing directly related to user behavior or to the files backed up.

1.1.1.2.            Backupset

The backupset table is used to track when a backup was scheduled, not necessarily when it was run. There are two columns in this table, id and timestamp. The ID references the File History backup session, and the timestamp is used to identify when the File History backup started. These values can be cross-referenced with the tCreated and tVisible keys found in the other tables to identify the sessions in which a file was created and last seen.

Column
Description
ID
Primary key, incremented each time a backup should run. This still increments if no data needs to be backed up.
timestamp
The value for this key is a second integer that is used to identify when the backup occurred.

1.1.1.3.            File

The file table consists of nine columns that contain file mapping and properties for each file backed up using File History. These columns are ID, ParentID, ChildID, state, status, filesize, tCaptured, tQueued and tUpdated.

Column
Description
ID
Primary key for the file table.
ParentID
Foreign key that points to the parent directory string in the strings table.
ChildID
Foreign key that points to the ID column string in the strings table.
State
Reflects if the record needs any actions for backup
Status
Status of the file version
fileSize
Size of the file in bytes
tCaptured
File History backup cycle in which the file was first identified.
tQueued
File History backup cycle in which the file was first queued. If this value is different than the tCaptured value, it means that the backup location was offline at the time File History was run.
tUpdated
File History backup cycle in which the file was first updated. If this value is different than the tCaptured value, it means that the backup location was offline at the time File History was run.

1.1.1.4.            Global

The Global table consists of three columns that contain values related to the behavior of File History: ID, Key and Value. The ID is the unique identifier for the values in the Key column. The Key column contains the friendly name of the value being recorded. The Value column holds an alphanumeric value related to the Key column. The keys named “LastReassociationBSet” and “ReassociationPerformed” are not found in Windows 10. The key named “LastRSet” is only present in the Global table if the File History backup location has changed. The table below lists the key names found in the Global table, along with a description of their values.

Key Name
Value
FirstBackupTime
This value is stored in Windows: 64 bit Hex Value - Little Endian, and represents the first time that File History was run.
LastReassociationBSet (win8 only)
 This value is stored in decimal value and represents the last backup related to re-association. Trailing 0’s can be discarded.
EarliestConsecutiveElFailure

ConsecutiveElFailureCount

LastFullElScan
This value is stored in Windows: 64 bit Hex Value - Little Endian, and represents the last time
ReassociationPerformed (win8 only)
 Unsure of meaning.
LastRSet
This value is stored in decimal value and represents the number of times re-association to a new backup location has been done. Trailing 0’s can be discarded.
LastCatalogBackup
 This value is stored in Windows: 64 bit Hex Value - Little Endian, and represents the last time that the Catalog#.edb was backed up.
LastCatalogBackupTime
Windows: 64 bit Hex Value - Little Endian
VersionInformation
Version number of the File History database. The value displayed in both Windows 8 and Windows 10 is: 020000000200000002000000
LastBSet
This value is stored in decimal value and represents the last backup set that was completed. Trailing 0’s can be discarded.
EventListenerWatermarks
LastBackupTime
This value is stored in Windows: 64 bit Hex Value - Little Endian, and represents the last time that File History successfully backed up the data.
LastTargetPresentTime
This value is stored in Windows: 64 bit Hex Value - Little Endian, and represents the last time the backup destination was connected.
UnprotectedFileRecords
Hex string of the total number of files that have been added to the system since File History was last run. Trailing 0 can be discarded.
PartiallyProtectedFileRecords
 Hex string of total number of files that have been committed to be backed up, but due to file history location being offline these files are stored in local staging area. Trailing 0 can be discarded.
FullyProtectedFileRecords
Hex string of total number of files that have been backed up and therefore protected. Trailing 0 can be discarded.
TotalBytesInStaging
 Hex String related to total bytes currently stored in the local staging area.
TotalBytesOnDefaultTarget
Hex string related to total bytes currently stored on the remote backup location
OldestPresentBSet
 This value is stored in decimal value and represents the oldest backup set that is available. Trailing 0’s can be discarded.
RetentionIncomplete

LastUpdatedTargetCatalog
This value is stored in decimal value and represents the last catalog#.edb that was updated. There should only be a 1 or a 2 for values here. Trailing 0’s can be discarded.
IsDirty
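The “Windows: 64 bit Hex Value - Little Endian” entries in the table above can be decoded with a few lines of Python. This sketch assumes those values are standard Windows FILETIME values (100-nanosecond ticks since January 1, 1601 UTC) and that the “trailing 0’s” are the high-order bytes of a little-endian integer. Both are assumptions rather than something the table confirms, and the function names and sample inputs are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Start of the FILETIME epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def le_hex_to_int(hex_string):
    """Interpret a hex string as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_string), byteorder="little")


def filetime_hex_to_datetime(hex_string):
    """Decode a little-endian 64-bit FILETIME hex string to a UTC datetime."""
    ticks = le_hex_to_int(hex_string)  # 100 ns intervals since 1601-01-01
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Sample input only: the FILETIME value of the Unix epoch, little-endian.
print(filetime_hex_to_datetime("00803ED5DEB19D01"))  # 1970-01-01 00:00:00+00:00

# Sample input only: a counter whose trailing zero bytes carry no value.
print(le_hex_to_int("5B00000000000000"))  # 91
```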


1.1.1.5.            Library

The Library table consists of five columns that define the library configuration. The names of the columns are ChildID, ID, ParentID, tCreated and tVisible. The ChildID, ID and ParentID are numeric identifiers related to the Strings table. These identifiers indicate which libraries are being backed up. By default, File History will back up the following libraries:
·         Contacts
·         Desktop
·         Documents
·         Favorites
·         Music
·         Pictures
·         Videos
The values stored in the tCreated represent the first backup to include the library, and the tVisible represents when the library was last included in the backup process. 

Column
Description
ChildID
Foreign key that points to the child object string in the strings table.
ID
Primary key for this table.
ParentID
Foreign key that points to the parent directory string in the strings table.
tCreated
Foreign key that points to the File History session in the backupset table when the directory was first included in the file history backup process.
tVisible
Foreign key that points to the File History session in the backupset table when the directory was last included in the file history process. A value of 2147483647 indicates that the folder is still being protected by file history.
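The tVisible value of 2147483647 is the largest signed 32-bit integer (0x7FFFFFFF), so it cannot match a real backupset ID in practice. A small sketch of how an analyst's script might treat it follows; the constant and function names are hypothetical.

```python
# 2147483647 is the largest signed 32-bit integer (0x7FFFFFFF).
STILL_PROTECTED = 2**31 - 1


def last_seen_session(t_visible):
    """Return the last backupset ID, or None if the item is still protected."""
    return None if t_visible == STILL_PROTECTED else t_visible


print(last_seen_session(2147483647))  # None
print(last_seen_session(6))           # 6
```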

1.1.1.6.            Namespace

The Namespace table consists of eleven columns that are associated with unique files. These columns indicate file location, created time, USN journal entry, as well as other information.

Column Name
Description
childID
Foreign key that points to the child object string in the strings table. There may be more than one instance of a childID (e.g. desktop.ini)
fileAttrib
File attributes are metadata values stored by the file system on disk and are used by the system[1]. This value is stored in decimal. These values can be the combination of multiple metadata values. The common attributes are:
·         6 – This is a combination of value 2 and value 4. The file or directory is hidden (value 2) and is part of or used exclusively by the operating system (value 4).
·         16 – This identifies a directory
·         32 - A file or directory that is an archive file or directory. Applications typically use this attribute to mark files for backup or removal.
·         38 – This is a combination of Value 6 and 32.
fileCreated
The value for this key is a second integer that is used to identify when the file was first created. This is based on the file MAC time.
fileModified
The value for this key is a second integer that is used to identify when the file was last modified. This is based on the file MAC time.
fileRecordId
Foreign key that points to the associated record in the file table.
id
Primary key for this table.
parentID
Foreign key that points to the parent directory string in the strings table
Status
Reflects the status of the record. This field may include:
1.      Last in line – set when a file is deleted or renamed; indicates that the next namespace record for the same name likely describes a completely different file.
2.      Directory – The record describes a directory
3.      No content change – the record describes a namespace change without a change in the file content
4.      Restored – set on namespace records describing older versions restored on the source
5.      Not active yet - indicates that the version described by the namespace record is not to be shown to the user and that the previous namespace may still be visible
6.      Unknown at this time.
tCreated
Foreign key that points to the backupset table time when the file or folder was backed up. Used to determine the period of time when a version of a file or folder was visible in a namespace.
tVisible
Used to determine the period of time when a version of a file or folder was visible in a namespace. This field may be set to a certain value to indicate that the record is within the current namespace. This indicates that the object associated with the record is considered visible in the namespace.
USN
This is the USN journal entry related to this file and is used to identify what files have changed. [2]
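Because fileAttrib is a sum of individual attribute bits, a decimal value from the Namespace table can be broken back into its parts with a bitwise test. The sketch below uses the standard Windows file attribute bit values for a small subset of attributes; the dictionary and function names are hypothetical.

```python
# Windows file attribute bit values (a subset).
FILE_ATTRIBUTES = {
    1: "READONLY",
    2: "HIDDEN",
    4: "SYSTEM",
    16: "DIRECTORY",
    32: "ARCHIVE",
}


def decode_file_attrib(value):
    """Return the names of the attribute bits set in a decimal fileAttrib."""
    return [name for bit, name in FILE_ATTRIBUTES.items() if value & bit]


# The four common values listed for fileAttrib.
for sample in (6, 16, 32, 38):
    print(sample, decode_file_attrib(sample))
```

A value of 38, for example, decodes to hidden, system and archive, which matches the combination of 6 and 32 described for fileAttrib.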

1.1.1.7.            Strings

The Strings table consists of two columns, called ID and String. The ID column is populated with a unique reference number that begins at 1 and is incremented with every directory and file path that is backed up. The String column is either a file path or a file name, depending on what it is intended to reference. Some strings start with ?UP\ or ?PP\. These prefixes refer to the User Profile path (?UP) and the Public Profile path (?PP).

Column
Description
Version Present
ID
Primary key, but is also used across other tables to indicate Parent or child relationships.
Windows 8, Windows 8.1, Windows 10
String
File path, directory or file name of object.
Windows 8, Windows 8.1, Windows 10
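When reporting, it can help to expand the ?UP and ?PP prefixes into full paths. The sketch below is only an illustration: the profile paths are assumed values that an analyst would replace with the ones for the system under examination, and `expand_prefix` is a hypothetical helper.

```python
# Assumed profile locations; replace with the values for the examined system.
PREFIXES = {
    "?UP": r"C:\Users\Kenneth",  # assumed user profile path
    "?PP": r"C:\Users\Public",   # assumed public profile path
}


def expand_prefix(value, prefixes=PREFIXES):
    """Replace a leading ?UP or ?PP token with the matching profile path."""
    for token, root in prefixes.items():
        if value == token or value.startswith(token + "\\"):
            return root + value[len(token):]
    return value  # strings without a prefix are returned unchanged


print(expand_prefix(r"?UP\Documents"))
print(expand_prefix("desktop.ini"))
```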


[2] http://arstechnica.com/information-technology/2012/07/a-step-back-in-time-with-windows-8s-file-history/



Part 3 will be coming soon. I am currently working on a tool to help streamline the analysis of the catalog#.edb. If you have access to a Windows 8 or Windows 10 system and would be willing to provide me a copy of your catalog#.edb database for testing, you can email me at:
patories @ gmail<.>com