Yesterday we looked at scraping backup information from SQL Server backup files on disk. There is another source of information available: every backup in SQL Server is also recorded in tables in msdb. For the simple examples we’re looking at so far, we only need to consider two of them. These are:

  • msdb.dbo.backupset – Holds the information about every backup
  • msdb.dbo.backupmediafamily – Holds the location of each backup file

There are a number of other tables that contain information needed when dealing with file or filegroup backups, or with backups that have been spanned across multiple media sets.
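As a quick sanity check before relying on the simple two-table join below, you can see whether any of your backups actually span multiple files by counting the backupmediafamily rows per media set. A minimal sketch (the instance name "Server1" is just an example):

```powershell
# Any media set with more than one media family row is a backup
# spanned (striped) across multiple files - the simple join below
# will return one row per file for those backups.
Invoke-Sqlcmd -ServerInstance "Server1" -Query "
select media_set_id, count(*) as family_count
from msdb.dbo.backupmediafamily
group by media_set_id
having count(*) > 1"
```

If this returns no rows, every backup on the instance lives in a single file and the queries in this post will behave as expected.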

The only issue with using this data, as opposed to the data ‘scraped’ from disk, is that most people manage their backup files outside of SQL Server, so there’s no guarantee that a file is still where SQL Server left it. It could have been archived, compressed or just plain deleted. This is the case even if you’re using the ‘Maintenance Cleanup Task’ in a maintenance plan to delete your old backups. Running the ‘History Cleanup Task’ will prune your backup history down to your specifications (SQL Server won’t do this unless told to, and I have seen msdb databases that have ballooned to a ridiculous size with years of backup records!). Because of this you may want to filter the data you retrieve from SQL Server by your retention policy, and checking that any file you reference does in fact exist will avoid problems.
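If you do find yourself with years of stale history in msdb, the same pruning the ‘History Cleanup Task’ performs can be run directly via msdb.dbo.sp_delete_backuphistory. A minimal sketch (the 90-day retention and instance name are just examples; pick a date that matches your own retention policy):

```powershell
# Delete backup and restore history older than 90 days from msdb.
# sp_delete_backuphistory is the procedure the History Cleanup Task calls.
$OldestDate = (Get-Date).AddDays(-90).ToString('yyyy-MM-dd')
Invoke-Sqlcmd -ServerInstance "Server1" -Database "msdb" `
    -Query "EXEC dbo.sp_delete_backuphistory @oldest_date = '$OldestDate'"
```

On an msdb that has never been pruned this can take a while and generate a lot of log activity, so run it in smaller date steps the first time.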

The data returned from ReadBackupHeader and from msdb.dbo.backupset is VERY similar, but the naming conventions differ, and the backup type enumerations (BackupType in the header, type in backupset) need a CASE statement to translate between them. The basic query we’ll be running is:

select
 a.backup_finish_date as BackupFinishDate,
 a.backup_set_uuid as BackupSetGUID,
 a.backup_size as BackupSize,
 a.backup_start_date as BackupStartDate,
 a.begins_log_chain as BeginsLogChain,
 a.checkpoint_lsn as CheckpointLSN,
 a.collation_name as Collation,
 a.compatibility_level as CompatibilityLevel,
 a.compressed_backup_size as CompressBackupSize,
 a.database_backup_lsn as DatabaseBackupLSN,
 a.database_creation_date as DatabaseCreationDate,
 a.database_name as DatabaseName,
 a.database_version as DatabaseVersion,
 a.description as BackupDescription,
 a.differential_base_guid as DifferentialBaseGUID,
 a.differential_base_lsn as DifferentialBaseLSN,
 a.expiration_date as ExpirationDate,
 a.family_guid as FamilyGUID,
 a.first_lsn as FirstLSN,
 a.first_recovery_fork_guid as FirstRecoveryForkID,
 a.flags as Flags,
 a.fork_point_lsn as ForkPointLSN,
 a.has_backup_checksums as HasBackupChecksums,
 a.has_bulk_logged_data as HasBulkLoggedData,
 a.has_incomplete_metadata as HasIncompleteMetaData,
 a.is_copy_only as IsCopyOnly,
 a.is_damaged as IsDamaged,
 a.is_force_offline as IsForceOffline,
 a.is_password_protected,
 a.is_readonly as IsReadOnly,
 a.is_single_user as IsSingleUser,
 a.is_snapshot as IsSnapshot,
 a.last_family_number,
 a.last_lsn as LastLSN,
 a.last_media_number,
 a.machine_name as MachineName,
 a.name as BackupName,
 a.position as Position,
 a.recovery_model as RecoveryModel,
 a.server_name as ServerName,
 a.software_build_version as SoftwareVersionBuild,
 a.software_major_version as SoftwareVersionMajor,
 a.software_minor_version as SoftwareVersionMinor,
 a.sort_order as SortOrder,
 a.unicode_compare_style as UnicodeComparisonStyle,
 a.unicode_locale as UnicodeLocalID,
 a.user_name as UserName,
 case a.type
	when 'D' then 'Database'
	when 'I' then 'Differential database'
	when 'L' then 'Log'
	when 'F' then 'File or filegroup'
	when 'G' then 'Differential file'
	when 'P' then 'Partial'
	when 'Q' then 'Differential partial'
 end as BackupTypeDescription,
 case [type]
	when 'D' then 1
	when 'I' then 5
	when 'L' then 2
	when 'F' then 4
	when 'G' then 6
	when 'P' then 7
	when 'Q' then 8
 end as BackupType,
 b.physical_device_name as 'FilePath'
 from
    msdb.dbo.backupset a inner join msdb.dbo.backupmediafamily b
        on a.media_set_id=b.media_set_id

To fill our data from SQL Server we’ll create two functions:

  1. Get-SQLServerBackups
  2. Test-SQLServerBackupExistence

The first will retrieve a set of backup information from a specified SQL Server instance, optionally filtering the data by a specific date or by only asking for the last x days.

Our second function will take in an instance of our $Backups object and step through it, testing whether each file exists and removing those that don’t.

So with our first function:

function Get-SQLServerBackups
{param([string]$SQLServer,[int]$ExcludeDaysOlder=0,[datetime]$BackupNewerThan='01/01/1900')
push-location
import-module "SQLPS" -DisableNameChecking
pop-location

$sql="select a.backup_finish_date as BackupFinishDate,
 a.backup_set_uuid as BackupSetGUID,
 a.backup_size as BackupSize,
 a.backup_start_date as BackupStartDate,
 a.begins_log_chain as BeginsLogChain,
 a.checkpoint_lsn as CheckpointLSN,
 a.collation_name as Collation,
 a.compatibility_level as CompatibilityLevel,
 a.compressed_backup_size as CompressBackupSize,
 a.database_backup_lsn as DatabaseBackupLSN,
 a.database_creation_date as DatabaseCreationDate,
 a.database_name as DatabaseName,
 a.database_version as DatabaseVersion,
 a.description as BackupDescription,
 a.differential_base_guid as DifferentialBaseGUID,
 a.differential_base_lsn as DifferentialBaseLSN,
 a.expiration_date as ExpirationDate,
 a.family_guid as FamilyGUID,
 a.first_lsn as FirstLSN,
 a.first_recovery_fork_guid as FirstRecoveryForkID,
 a.flags as Flags,
 a.fork_point_lsn as ForkPointLSN,
 a.has_backup_checksums as HasBackupChecksums,
 a.has_bulk_logged_data as HasBulkLoggedData,
 a.has_incomplete_metadata as HasIncompleteMetaData,
 a.is_copy_only as IsCopyOnly,
 a.is_damaged as IsDamaged,
 a.is_force_offline as IsForceOffline,
 a.is_password_protected,
 a.is_readonly as IsReadOnly,
 a.is_single_user as IsSingleUser,
 a.is_snapshot as IsSnapshot,
 a.last_family_number,
 a.last_lsn as LastLSN,
 a.last_media_number,
 a.machine_name as MachineName,
 a.name as BackupName,
 a.position as Position,
 a.recovery_model as RecoveryModel,
 a.server_name as ServerName,
 a.software_build_version as SoftwareVersionBuild,
 a.software_major_version as SoftwareVersionMajor,
 a.software_minor_version as SoftwareVersionMinor,
 a.sort_order as SortOrder,
 a.unicode_compare_style as UnicodeComparisonStyle,
 a.unicode_locale as UnicodeLocalID,
 a.user_name as UserName,
 case a.type
	when 'D' then 'Database'
	when 'I' then 'Differential database'
	when 'L' then 'Log'
	when 'F' then 'File or filegroup'
	when 'G' then 'Differential file'
	when 'P' then 'Partial'
	when 'Q' then 'Differential partial'
	end as BackupTypeDescription,
	case [type]
	when 'D' then 1
	when 'I' then 5
	when 'L' then 2
	when 'F' then 4
	when 'G' then 6
	when 'P' then 7
	when 'Q' then 8
	end as BackupType,
	b.physical_device_name as 'FilePath'
	 from msdb.dbo.backupset a inner join msdb.dbo.backupmediafamily b on a.media_set_id=b.media_set_id
where 1=1"

if ($ExcludeDaysOlder -ne 0){
    $sql = $sql+" and a.backup_start_date>dateadd(dd,-"+$ExcludeDaysOlder.tostring()+",current_timestamp)"
}

if ([datetime]$BackupNewerThan -gt [datetime]'01/01/1900'){
    # Compare as dates, not strings, and format the literal unambiguously for SQL Server
    $sql = $sql+" and a.backup_start_date>'"+([datetime]$BackupNewerThan).ToString('yyyyMMdd HH:mm:ss')+"'"
}

$dt = Invoke-Sqlcmd -ServerInstance $SQLServer -Query $sql

return $dt
}

This is a very simple function; it just looks huge due to the SQL statement in $sql. If the parameters are supplied, we append the appropriate where-clause filter. The query is then sent to the SQL Server using the Invoke-Sqlcmd cmdlet, which returns the data in a DataTable that we then return.

And then our second function:

function Test-SQLServerBackupExistence
{param ($BackupArray)

Set-Location c:\ # move off the SQLSERVER: provider so Test-Path checks the file system

foreach ($row in $BackupArray.filepath | sort -Unique){
    if ((Test-Path $row) -ne $true){
        $BackupArray = $BackupArray | Where-Object {$_.FilePath -ne $row}
     }
}
return $BackupArray
}

This takes in an array as a parameter. By piping the $BackupArray.FilePath property through Sort -Unique we only need to check each file once. If a file doesn’t exist, we remove all ‘rows’ with that FilePath from the array, then return what remains of the array.

The one drawback with this method is that if the same file has been used repeatedly for backups, all of those backups will stay in the array, as we are only checking that the file exists, not what it actually contains.
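A quick way to spot those reused files is to look for paths that appear against more than one backup set. A sketch, assuming $Backups has already been populated by the functions above:

```powershell
# List any backup file recorded against more than one backup set -
# candidates for a closer look with RESTORE HEADERONLY before restoring.
$Backups | Group-Object -Property FilePath |
    Where-Object { $_.Count -gt 1 } |
    Select-Object Name, Count
```

Any path this reports holds multiple backup sets, so passing Test-Path alone tells you nothing about which of those sets are still in the file.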

To load the data we’d call the functions as follows:

$Backups = Get-SQLServerBackups -SQLServer Server1 -ExcludeDaysOlder 7
$Backups = Test-SQLServerBackupExistence $Backups

As we’ve ensured that we’ve kept the same structure for the DataTables returned from the disk scrape and the SQL query, we can combine them easily like so:

$Backups = Get-SQLServerBackups -SQLServer Server1 -ExcludeDaysOlder 7
$Backups = Test-SQLServerBackupExistence $Backups
$BackupFiles = Get-SQLBackupFiles -FilePath c:\psbackups2 -SQLServer "Server2"
foreach($File in $BackupFiles){
    $Backups += Get-BackupContents -FilePath $File -SQLServer "Server2"
}

As we’re using a standard PowerShell DataTable, we can also use standard PowerShell cmdlets to save it to disk and reload it at a later date. This means we can scrape disks overnight during periods of low usage and load the results up during the day, or update our records from SQL Server on a daily basis to speed the process up.

To save the information:

$Backups | Export-CliXML c:\backup-store.xml

To load the information:

$Backups = Import-CliXML c:\backup-store.xml

Tomorrow we’ll be looking at how we can use these new objects to ensure we restore the right files in the right order, in a more robust manner than just relying on file names and last-modified dates.

This post is part of a series posted between 1st September 2013 and 3rd October 2013, an index for the series is available here.