Stuart Moore

Musings of a Data professional

Day 10 – Simple Restores – 31 days of dbatools backup and restores

Simple Restores

Finally we’re going to start using Restore-DbaDatabase after talking about it for a couple of days.

Given all the options that Restore-DbaDatabase offers, we’ll start off simple and build up the complexity of the commands over the next couple of days. We’ll also cover what’s going on under the hood of the command; if you’ve got a basic understanding of that, a lot of the other options make more sense.

At the basic level what does a Restore-DbaDatabase run do during a simple restore:

  1. Connect to a SQL Server Instance
  2. Scan 1 or more backup files
  3. Work out what can be restored from these files
  4. Begin restoring the database(s) to the SQL Server instance
  5. Continue restores until it runs out of backups to restore
  6. Open the database

There’s a fair amount going on under the hood, even for a ‘simple’ restore. Let’s work through those as we build up our first Restore-DbaDatabase command.
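
As a preview of where we’ll end up, a minimal run (the instance name and backup path here are purely illustrative) is just a case of pointing the command at an instance and a folder of backup files:

    Restore-DbaDatabase -SqlInstance server1\instance -Path \\backupserver\backups\db1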

Read More

Day 9 – Limitations of Restore-DbaDatabase – 31 days of backup and restores

Limitations of Restore-DbaDatabase

Like all tools, Restore-DbaDatabase isn’t able to do everything that everyone wants it to. Certainly, at the moment I’d like it to write its own blog posts and fetch me a cold beer, but that doesn’t happen.

None of the below is meant as a complaint about people asking for features. If we can do it, we will, and we’re keen to make this work for as many people in as many situations as possible.

But quite a few requests over the years have been non-starters, for a number of reasons.

Read More

Day 8 – Why is Restore-DbaDatabase like it is? – 31 days of dbatools backup and restores

Why is Restore-DbaDatabase like it is?

At first glance Restore-DbaDatabase looks like a slow, lumbering, complex beast. In reality it’s not that bad.

It’s the result of design decisions I took in wanting a solid, versatile command that could cope with everything that people would want from it.

In this post, we’ll go through the main decisions/points of contention one by one.

Read More

Day 7 – Backing up to Azure – 31 days of dbatools backup and restore

Day 7 – Backing up to Azure

Backing up to Azure? Why would you want to do that if you’re not running Azure SQL Database/Managed Instance?

There’s a couple of reasons, which you may or may not agree with:

  • Reliable storage
  • Off site storage
  • Scalable storage
  • Cheap(ish) storage

All of those are, of course, debatable depending on your circumstances. But if you’re starting from scratch and don’t already have an on-premises storage infrastructure, then it’s quicker to start there and build up. And if you’re running SQL Server inside an Azure VM, do you want to be piping data back down to the home network, or paying for another VM to act as a backup target?

The main reason I like Azure storage when I can use it is number 2 – Offsite storage.

My backups will survive a major incident, say a failed Data Centre or a major fire. OK, those could be mitigated with a secondary Data Centre, but that’s an extra level of expense that your organisation might not be able to justify.

And there are disasters that may affect all storage in your organisation. Ransomware being the one that springs to mind. If your SQL Server backups get encrypted by someone other than yourself, then they are worthless. You have NO way of recovering without paying the money. It has happened!

Companies who think they have good security have found that ransomware can still propagate onto their backup servers, as the software is written to exploit any weakness in NTLM or SMB it can find. However, I’ve yet to hear of one that can break into SQL Server and get hold of a credential, which is what would need to happen for it to even know about SQL Server backups stored in Azure.

And if you’re really paranoid (you’re probably a DBA, so there’s a good chance), you can even use an Azure Function to move all new backups to another Azure Storage account when they finish, so there’s no traceable link anywhere on premises.

Setting up Azure for SQL Server Backups

Now, I’m going to cheat here a little and point you at posts I’ve already written that cover setting up Azure Blob storage for SQL Server backups:

If you want to use the Azure Portal – Creating Azure Blob Storage Account for SQL Server Backup and Restore, using the Azure Portal

If you want to use PowerShell – Creating Azure Blob Storage Account for SQL Server Backup and Restore, using PowerShell

As I say in the above posts, I’d really really recommend you go with Shared Access Signatures. Better security granularity, easier to work with and they’re the future.

Once you’ve created your Azure Blob storage account and the SQL Server credential, you can use Backup-DbaDatabase to backup to Azure.
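
As a rough sketch of that credential step (the instance, container URL and token value below are placeholders, and this assumes the dbatools New-DbaCredential command), creating a Shared Access Signature credential looks something like this; the credential name needs to match the container URL you’ll pass to -AzureBaseUrl:

    # SAS token generated for the container (placeholder value)
    $sasToken = ConvertTo-SecureString 'sv=2019-12-12&ss=b&srt=co&sp=rwdlac...' -AsPlainText -Force
    # The credential name should match the container URL used with -AzureBaseUrl
    New-DbaCredential -SqlInstance server1\instance -Name 'https://dbatoolsaz.blob.core.windows.net/azbackups' -Identity 'SHARED ACCESS SIGNATURE' -SecurePassword $sasToken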

When backing up to Azure you are limited to a subset of parameters; this is purely because Azure Blob storage won’t support things like MaxTransferSize. The allowed parameters are:

  • CopyOnly
  • Type
  • CompressBackup
  • Checksum
  • Verify
  • AzureCredential
  • ReplaceInName
  • CreateFolder
  • OutputScriptOnly
  • EncryptionAlgorithm
  • EncryptionCertificate

When using a Shared Access Signature you can also use 'FileCount'.

Backing up to Azure with Backup-DbaDatabase

There are 2 ways the command can be used to backup to Azure, depending on whether you’re using Access Keys or Shared Access Signatures.

With Access keys you use this format:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -AzureBaseUrl https://dbatoolsaz.blob.core.windows.net/azbackups/ -AzureCredential myAzureCred

Where myAzureCred is the name of the credential on server1\instance that holds the Azure Access keys. When using the older Access Keys style you cannot stripe backups, so can only provide one URL

With a Shared Access Signature you use this format:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -AzureBaseUrl https://dbatoolsaz.blob.core.windows.net/azbackups/

This will work as Backup-DbaDatabase will look for a credential named ‘https://dbatoolsaz.blob.core.windows.net/azbackups/’ on instance ‘server1\instance’

With Shared Access Signature credentials you can stripe files, either within the same container:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -AzureBaseUrl https://dbatoolsaz.blob.core.windows.net/azbackups/ -FileCount 3

or by specifying multiple AzureBaseUrl values:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -AzureBaseUrl https://dbatoolsaz.blob.core.windows.net/azbackups/,https://azdbatools.blob.core.windows.net/azbackups/ 

This assumes you have valid credentials named 'https://dbatoolsaz.blob.core.windows.net/azbackups/' and 'https://azdbatools.blob.core.windows.net/azbackups/' on the SQL instance performing the backup.

Conclusion

Backing up your SQL instances to Azure can be a good option if your organisation is just starting out, and having your backups off site minimises some of the risks in a Disaster Recovery scenario. Plus it’s a great way to migrate databases up to Azure; in fact, that’s how Copy-DbaDatabase works.

For more ideas on how to use dbatools to migrate SQL Server databases to Azure have a look at the book dbatools in a month of lunches

All posts in this series can be found at 31 Days of dbatools Backup and Restores

Day 5 – Performance Tuning Backups – 31 Days of dbatools Backup and Restores

Day 5 – Performance Tuning Backups

Why does backup performance matter?

As your database gets bigger, backups take longer to complete. This makes sense, more data has to be written to disk which is going to take longer.

The reason it becomes a problem is that your organisation will have a Recovery Point Objective (RPO), which is the most data loss they’re willing to accept in the event of a disaster. Now, it might not be officially announced but I’m sure management has a number that’s lower than you think.

To hit your RPO you need to have less time than that between your backups. So let’s say your RPO is 15 minutes. This means you need a backup at LEAST every 15 minutes to hit this target. If your transaction log backups are taking 20 minutes, you’ll never make it.

Obviously we can’t reduce the amount of data being created, your boss won’t thank you for rate limiting customers!

So all we can do is to make things flow out faster. Let’s see how we do that:
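
As a taste of what’s coming, here’s a rough sketch (instance, database, path and values are purely illustrative) of the usual tuning levers on Backup-DbaDatabase: striping the backup across several files and adjusting the buffer and transfer sizes:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -Path \\backupserver\backups -FileCount 4 -MaxTransferSize 4MB -BufferCount 64 -CompressBackup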

Read More

Day 4 – Backup Encryption with Backup-DbaDatabase – 31 days of dbatools backup and restores

Day 4 – Backup Encryption with Backup-DbaDatabase

Now, we all take our db security very seriously (or at least I hope we do?). But a lot of people don’t think this through to their database backups.

A database backup is a full copy of the database; by definition, all of the data is in there.

So if someone gets their hands on it? Then they may as well have dbo on your production database, as they can just restore it on their own SQL Instance and go to town on the data.

That is, unless you’ve implemented some security around your backups
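
As a flavour of what the full post covers, a minimal sketch (instance, database and certificate name are hypothetical, and it assumes a backup certificate already exists on the instance) looks like this:

    Backup-DbaDatabase -SqlInstance server1\instance -Database db1 -Path \\backupserver\backups -EncryptionAlgorithm AES256 -EncryptionCertificate BackupCert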

Read More

Please clean down your SQL Server backup history

From looking around at some of the Issues that get logged for dbatools, it looks like people never clear out their SQL backup history from msdb. This causes slow queries and potentially misleading backup trees to be returned.

In my (not so humble) opinion you really shouldn’t let your backup history fill up with, let’s be honest, useless junk. In this post I’ll go through why I think you should be keeping far less in there than most people do.

It’s not an audit trail

A common reason people give for keeping years’ worth of backup history records is that they need it to show the auditors. I’ve yet to find an auditor who wants a raw SQL Server database of backup history to look at.

The options I’ve used in the past are to use:

  • Export it to Excel or CSV on a regular basis. Easier to hand off to an auditor, compresses well and gets it out of your database
  • Monitoring system records. You are monitoring your backups? In that case, you should have a record showing no failed jobs.

Please check what your auditors are actually asking for, and why. Unless they’re also asking for verification and restore test results, pure backup history is worthless. Because, as a number of people have said over the years (in various ways):

An untested backup is just a wish that may not be granted

Me, probably

Will you ever go that far back?

So you’ve got 2 years worth of SQL Server backup history tucked away in msdb. Now, be honest, are you ever going to use it? Really, you’re going to roll a database back 6 months? Why?

That would be a hell of a lot of data to throw away to /dev/null or have to import all over again. Would there ever be a business case for doing that?

And once you’ve restored it, that leads on to…..

Once it’s restored, it’s a new database

You restore a SQL Server database and you start 2 new things:

  • A new Recovery Fork
  • A new LSN chain

All databases are running along a Recovery Fork (identified by a RecoveryForkGUID), even a new one will have one. You perform a restore, and SQL Server will generate a new Recovery Fork under the covers. Much like Ghostbusters, SQL Server won’t let you cross streams

As covered elsewhere on here, SQL Server uses LSNs (Log Sequence Numbers) to track database actions, and as backups are database actions they are also linked to LSNs. You’ll be taking the database back to an LSN in the past, and restarting the count from there. This is why SQL Server uses the recovery fork. Even if you were lucky enough to have backups starting on the same LSN (and it’s the sort of luck where I’d be asking you for the lottery numbers) you still won’t be able to use the backup.

So if all the backups are still in the backup history they’re useless, and potentially very confusing when you come to do a restore

Can you even restore it?

If you’re storing 2 years of old backup history, are you also storing the old backups as well? If you don’t have the backups to restore, again the history is worthless and might as well be deleted

I’m pretty hot on backups, but even I don’t keep things that long. Currently we’ll keep everything for 2 weeks, then we’ll get rid of Transaction Log and Differential backups. After 4 weeks we’ll only keep the Full backup from the last day of the month, and then at the end of the year we’ll only keep the last Full backup

So again, why keep the history if I can’t use it to restore most of the databases it contains?

Do you need that granularity?

As mentioned in the last reason you may end up with just a few backups kicking around that may need restoring for investigations or legal reasons. Do you need the backup history to restore those?

With tools like Restore-DbaDatabase I’d say no. While generating the restore chain from a folder full of backups may take a while, for a single database backup you’re only looking at seconds to scan the file(s) and be restoring. Probably quicker than trying to find the records in a bloated backup history table 😉
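
For example, something along these lines (the paths are hypothetical) will scan the folder and build the restore with no msdb history involved; add -OutputScriptOnly if you just want to see the T-SQL it would run:

    Restore-DbaDatabase -SqlInstance server1\instance -Path \\archive\backups\LegacyDb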

So what do I suggest

I’d suggest really looking at why you’re storing all that probably useless backup history.

Personally I consider anything more than a month’s worth as being too much; you (and your boss) may think that’s a bit tight. But I can’t ever see us needing to roll back to a point in time over a month ago. Most of the times we do need to, the end of the month will be good enough.

So, to clear down the backup history, schedule sp_delete_backuphistory to run once a week like so:

use msdb
go

declare @oldest DATETIME  = Getdate()-28;
exec sp_delete_backuphistory @oldest_date=@oldest;

This will get rid of ALL backup history over 4 weeks old from ALL databases. This will work on SQL Server from 2005 upwards

Or you can enable the ‘History Cleanup Task’ in your maintenance plans.

Or if you’re using Ola’s great SQL Server maintenance scripts, they’ll install an sp_delete_backuphistory job for you.
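
Or, if you’d rather drive it from PowerShell, dbatools has a command for this as well; a rough sketch (the instance name is a placeholder, and it’s worth checking Get-Help for the exact parameters in your module version):

    # Clear out backup and restore history older than 28 days from msdb
    Remove-DbaDbBackupRestoreHistory -SqlInstance server1\instance -KeepDays 28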

So, go on. Get those deletions automated and make your life easier 🙂

Feedback

Obviously, all of the above is just my opinion and thoughts on best practice I’ve picked up over the years. You may have a need to keep years of backup history, and I’d love to hear what it is.

dbaSecurityScan update

As mentioned before, I’ve started working on dbaSecurityScan, a PowerShell module for tracking changes to a database’s security model and bringing the database back into line.

Making the most of not being able to leave the house for very long I’ve been cracking along on getting the basics in place. Unlike a PowerShell module like dbatools, this one doesn’t break down nicely into small lumps so it’s likely that new features will be added in big lumps until we hit a critical mass.

So, I’ve just pushed up a nice big change that shows some of the core features working properly. In the current version we have:

  • Creating a database security config from an existing database for the following security principals
    • Users
    • Objects
    • Roles
    • Schemas
  • Test the database security config against a database
  • Take the test results and generate a list of actions that need to be undertaken to bring the database back inline
  • Run a ‘WhatIf ‘ run of the fixes to generate a list of actions to be undertaken
  • Apply the fixes to bring a database back in line, with a choice of running all actions, just those that add/grant security measures or just those that drop/revoke security measures

Some code to show how those features currently work:

On the to do list for the next couple of weeks of lockdown are:

  • Add fixing functions for roles and objects
  • Add Pipeline support
  • Add better comment based help
  • Allow comparison of config documents

Please get in touch if you’ve got any ideas for features. If you want to lend a hand then I’m happy to take Pull Requests over at https://dbasecurityscan.io/.

Please also report any bugs/issues over on github as well.

dbaSecurityScan – A new PS module for SQL Server security

While doing some work on my materials for SqlBits training day I started thinking about a few of the problems with managing SQL Server permissions.

How easy is it to audit them? If someone asks you, the DBA, exactly who has access to object A, can you tell them? How do people get access to that object? Is it via a role, a schema or an explicit permission?

Is that information in an easy to read or manipulate manner?

How do you ensure that permissions persist between upgrades? I’ve certainly seen 3rd party upgrades that have reset database level permissions. Do you have a mechanism to check every permission and put them back as they were?

We’re all doing the devops these days. Our database schema is source controlled, and we’re deploying it incrementally in pipelines and testing it. But are we doing that with our database security?

So in the classic open source way, I decided to scratch my own itch by writing something. That something is dbaSecurityScan, a PowerShell module that aims to offer a solution for all of the above.

The core functionality atm allows you to export the current security state of your database, based on these 4 models:

  • Role based
  • User based
  • Schema based
  • Object based

You can also return all of them, or a combination of whatever you need.

At the time of writing, getting the security information and testing it is implemented, and you can try it out like this:

If you’ve got any suggestions for features, or want to lend a hand then please head over to dbaSecurityScan and raise an issue or pull request respectively 🙂

Snipe hunting in the SQL Server Transaction Log

Nice quiet day in the office, busily cracking through the To Do list when suddenly something pops into the ticketing system as a P1. Data has gone ‘missing’ from a corporate system, and there’s nothing in the audit trail about when it went missing or how it went missing. For how, read ‘Whodunnit’!

Now if there’s nothing in the audit trails, and no one is confessing, then that doesn’t leave me with very much to go on. A slight breakthrough is that someone clearly recalls seeing the data at 09:45 on the 3rd of February, but that’s as good as it gets. We’re now on the 6th of February, so that’s a window of just over 3 days in which it could have gone missing.

Now, as any DBA knows, SQL Server comes equipped with a built-in ‘audit’ of any operation that modifies data. It’s just awkward and complex to read and search.

The name of that audit is the Transaction Log. SQL Server won’t do any data modification without it being written into the log, as SQL Server doesn’t count a transaction as committed until it’s been logged, which it needs for recovery after a restart.

The transaction log holds a LOT of information. Searching 3 days worth of it would be like looking for something in a hay stack. And at the moment we don’t even know if that’s a needle, a pin or an apple seed

An aside about reading Transaction Logs

SQL Server comes with 2 built in, but undocumented, functions: fn_dblog and fn_dump_dblog. Undocumented means Microsoft don’t publish documentation and reserve the right to modify them without notice. But they’re a pretty open secret, and there’s plenty of information about them out there.

For the purposes of this post, you just need to know this about them:

  • fn_dblog reads the current transaction log
  • fn_dump_dblog reads transaction log backups

If you’re on a SQL Server version lower than SQL Server 2014 or SQL Server 2012 SP2 then there is a known bug that uses up threads and can cause hanging.

Finding the time of deletion

So the first thing to do is to work out a time window in which the data went missing. This was going to be a tedious process, so let’s automate it. If we’re automating a SQL Server task, then dbatools becomes the obvious answer.

The plan is:

  1. Restore the database to the last time the data was seen
  2. Roll forward in 6 hour increments until the data disappears
  3. Roll forward in 30 minute steps from the beginning of the last 6 hour window until the data disappears
  4. Roll forward in 5 minute increments from the start of the last 30 minute window until the data disappears

So we now have a 5 minute window we need to search in the transaction log backup, which is much better than 72 hours

To speed things up we’ll scan all the backup headers first and save them into an object which we can reuse throughout the process. We’ll also rename the database, and change the filenames on restore, because we are restoring the database onto the instance it came from; the reason we have to do this will become clearer later on.

We also gather some extra information about the missing rows, which I’ll explain in a moment.

# Time to start restoring from - the last time the data was definitely seen
$startTime = Get-Date ('03/02/2020 09:45')

# The size in minutes of each restore window (6 hours, 30 minutes, 5 minutes)
$windowSize = @(360, 30, 5)

# Used to track how far through the windows we are
$counterLoop = 0

# Query to test if the object has been deleted during the window
# We also gather some internal SQL Server page information we'll need later
$sqlQuery = 'select db_id(), sys.fn_PhysLocFormatter(%%physloc%%), * from dbo.Objects where ObjectID=''1'''

# Scan all the backup headers first to speed things up
$backups = Get-DbaBackupInformation -Path c:\RestoredBackups -SqlInstance MyInstance

# Set a variable to tell restore whether to start a new restore, or to continue an existing one
$restoreContinue = $false

while ($counterLoop -lt $windowSize.Count) {
    # Restore (or continue restoring) to the end of the current window, leaving the database in standby so it can be queried
    $backups | Restore-DbaDatabase -SqlInstance MyInstance -DatabaseName RestoreDb -ReplaceDbNameInFile -TrustDbBackupHistory -RestoreTime $startTime.AddMinutes($windowSize[$counterLoop]) -StandbyDirectory c:\Standby -Continue:$restoreContinue

    # Keep the previous result so the page information isn't lost once the data disappears
    $oldResult = $sqlResult

    $sqlResult = Invoke-DbaQuery -SqlInstance MyInstance -Database RestoreDb -Query $sqlQuery

    if ($sqlResult.Count -eq 0) {
        # If we get no results, we're past the point the data disappeared
        # So move on to the next (smaller) window size and start a new restore
        $counterLoop++
        $restoreContinue = $false
    } else {
        # If the data is still there, move the start time forward and continue into the next window
        $startTime = $startTime.AddMinutes($windowSize[$counterLoop])
        $restoreContinue = $true
    }
}
Write-Host "Data disappeared between $startTime and $($startTime.AddMinutes($windowSize[$windowSize.Count - 1]))`n"
Write-Host "Missing data was on the following pages:`n"
$oldResult

Now we know when the data disappeared; next we just need to work out how.

Searching the Transaction Logs

SQL Server Transaction Logs store a LOT of information, so querying them will return more information than you really want. Also, the information in the Transaction Log is there for the benefit of SQL Server, so not all of it is easily readable for us humans.

For instance, when we’re looking for a specific missing row the easiest way is to search for transaction log records that modified the physical location the row existed on. While the actual row data and SQL details are in the log, they’re encoded, so getting to them is a lot of work.

We’re looking for a specific row in a specific table, so we need the following 3 bits of information to identify its physical location:

  • Datafile ID
  • Page ID
  • Slot ID

These can be found for a specific row with the following query using the fn_PhysLocFormatter function which handily returns the physical location of a row in a table:

select db_id(),sys.fn_PhysLocFormatter(%%physloc%%) from dbo.Objects where ObjectID='1'

This will give you a result like:

db_id    File:Page:Slot
5        (1:336:58)

The reason we were saving this from the previous iteration of the loop is because it wouldn’t exist once the data had been deleted.

Now all we need to do is to search the transaction log. However, there’s one more small step before we can do that. In the transaction log the FileID and PageID are stored as fixed length hexadecimal values, and we’ve just pulled them out in decimal format.

I convert them with PowerShell as it’s a little easier for me to remember:

PS C:\> '{0:x4}' -f 1
0001
PS C:\> '{0:x8}' -f 336
00000150

Note how the FileID and PageID are padded out to 4 characters and 8 characters respectively. This query will return all the transactions that affected that slot on that page in that file. Depending on how busy that page is that might be a lot.

select * from fn_dblog(NULL,NULL) where [Page ID]='0001:00000150' and [Slot ID]='58'

Finding out who and when

You’ll instantly find out just how much information there is in a transaction log record. So let’s trim the data down to just the couple of things we want to know:

  • Who issued the command
  • Exactly when it was run

And also limit it to just delete records:

select [Transaction ID],[Begin Time],[Transaction SID] from fn_dblog(NULL,NULL) where [Page ID]='0001:00000150' and [Slot ID]='58' and Operation='LOP_DELETE_ROWS'

If you’re lucky and this returns a single row, you’ll find that only [Transaction ID] is populated, as the other goodies are recorded at the Transaction wrapper level, not the statement level. So we take the Transaction ID and use that:

select [Transaction ID],[Begin Time],suser_sname([Transaction SID]) from fn_dblog(NULL,NULL) where [Transaction ID]='0000:000003ba'

This returns all the rows for the specified transaction in the transaction log. You’ll have 2 rows for the Transaction wrapper, the BEGIN and the COMMIT, these will be the first and last row.

The rows in between will be the actual deletion records. You might have one or more depending on how much data was in the row, or if it was part of a delete that removed multiple rows.

The BEGIN transaction record contains the information you wanted:

  • [Begin Time] gives you the time the delete started
  • [Transaction SID] gives you the Security Identifier of the account that executed the transaction.

We use SUSER_SNAME to convert the SID to a username. This is the reason I mentioned that you want to do this on the same instance as the original database was on: it makes sure the SIDs match up and you get the right name!

Conclusion

As you can see, everything you’d ever want to know about what happened in SQL Server is in the transaction log. But it’s not an easy beast to work with, so it’s much better to look at other auditing methods up front to save you having to do this.
