After the great feedback we got last year for the first Data Platform Discovery day we’re going to be doing another one on September 21st 2021.
We’re still going to have great intro sessions to Data Platform topics help you kickstart your Data Platform career or just so you know what your techies are talking about.
But this year we’re also going to be having Panel discussions between the sessions. We’ll be picking a range of people to discuss Data Platform topics, different ways to start you Data Platform career, and also how people can make the Data Platform community and workplaces better spaces for everyone
Also this year we’re going to be running a single day that goes from 10:00 till 19:00 UTC. This allows us to cover a wide range of Timezones and gives audiences a chance to maybe see speakers they wouldn’t normally get to.
Our Call for Speakers is now open! So if you have some great intro/level 100 sessions you feel would help a someone starting out then we’d love for you to submit. Do you have something you wish someone would have told you when you first started? Then submit it here – https://sessionize.com/data-platform-discovery-day-2021/
If you’d like to take part in the panels, then please read through the descriptions: DPDD 2021 panel descriptions. We’re looking for anyone who can offer insights or commentary on any of these topics. If that sounds like you then please submit a ‘session’ with the title of the panel your interested in, and use the abstract box to tell us a bit about yourself and why you’d be great on our panel
Call for Speakers closes on the 31st August and we will be notifying everyone successful or unsuccessful as soon as we can.
So you might have noticed a break even longer than my usual disappearances. So where have I been and have I done anything interesting?
Like everyone else Covid-19 has had a major impact on my job. I work for a large Higher Education provider. The UK lockdown meant we suddenly had ~4k staff members and ~35k students working elsewhere than our Campus. This caused any number of issues and problems to solve.
It’s certainly been an ‘interesting’ year and I’ve learnt a lot about networking and how the organisation actually operates. Lots of performance issues have become apparent when they aren’t being masked by a good fast network.
That’s eaten up a lot of my time, and meant I’ve not really fancied spending extra time in the ‘office’/backroom as it feels a little too like staying late. Thanks to getting more efficient and some improved workflows I should get more time to do things outside of the job.
At the start of 2021 I moved into a new role. I’m now the Techncial Lead for an Operations team. Still doing a lot of SQL DBA stuff, but the role is now focused more on setting out our future trajectory for a wider range of technologies.
A lot of the role is looking forward and creating strategy, so there’s been some dark corners of our infrastructure I’ve had to poke around into and then come up with plans to bring things up to scratch or convince the owners that we’ve just going to have to shut it down.
I’ve found myself moving back toward the *nix world with the job shuffle. There’s a lot more Linux in use as it makes Azure cheaper, and our developers are moving towards .Net Core so Linux hosting makes sense. So I’ve been spending more time at the bash prompt than the pwsh prompt, sometimes ending up with an unholy marriage to get things done.
There’s been quite a steep learning curve catching up with some more obscure bits that I’d never looked too deeply into before.
Still plenty of SQL Server in the mix, but certainly less than 12 months ago.
My world has become a lot more text based during lockdown. What would have been a quick chat on the way to grab a coffee is now a Teams chat.
At the start of lockdown I was using an Apple Magic Keyboard, but soon found it wasn’t doing my fingers any favours. This led me down the mech keyboard rabbit hole. After a couple of iterations I’m now on a home built Kyria split keyboard using the Colemak Mod-DH layout.
I’ll probably do post on this in a bit as I found thinking a bit more about my main user interface an interesting experience
We got a dog. Say hello to Marx:
He’s an energetic little Cocker Spaniel, he’s got a lot to learn so I’ve been getting into a lot of dog training theory and practice for the last couple of months getting him ready for some adventures once the weather improves
He’s named for this Marx:
Not this Marx:
All that means I’m going to be starting to blog a bit more regularly, but that the topics might change a bit. Not hugely, but there are posts about how to build dog training placeboards and what dog transport crate fits into an F40 BMW 1 series in the pipeline 😉
In the comments here I was asked about using Restore-DbaDatabase when all you have is blobs in an Azure Storage account. This can be done, involves a couple of non dbatools steps.
Restore-DbaDatabase doesn’t natively ‘talk’ Azure, nor do any of the other dbatools commands. The reason for this is that we didn’t want to force dbatools users to have to install the AzureRM/Az powershell modules just to use our module. We’ve gone to a lot of effort to make sure that dbatools is acceptable to Security Admins and that it has a small(ish) footprint, and adding those large modules as prerequisites would have broken that.
In this example I’m going to be using the Az module as that’s the current recommend Azure module, and the one I’ve got to hand.
To follow through this you’re going to need to have already set up the restore target instance to talk to the Azure Blob Storage. Full instructions for that are here:
We start by getting the storage account details, here they’re in a storage account called sqlbackups which lives in a resource group named sqlbackups.
Grab all the blobs in the specified container. We do the filtering in the next step as Get-AzStorageBlob doesn’t have any filtering other than -Prefix which is of almost zero use in this case
Here we’re just filtering for all the blobs touched in the last 24 hours. You can filter for other things, but remember that there are no ‘folders’ below the containers, just /’s in the name. So if you’re trying to filter for something like Ola’s paths for a known database full backup you’ll need to use something like this (using master as an example)
As a nice little lockdown project I decided to up our home network security game a bit. And it was a good excuse to finally buy a Raspberry Pi. I’ve fancied one for years, but always worried it’d end up buried in a drawer not getting used.
I’ve been using Steven Black’s blocking scripts (https://github.com/StevenBlack/hosts)on my Mac for ages now, and they do cut down the amount of annoying ads and other distractions that cover the internet these days. Seemed rude to not have the same protection running on all the hosts in the house. I am just running the lists that block know advertising sites and malicious site. There are also lists to block adults sites, gambling sites and others if you want to restrict what your network can be used for.
But maintaining the lists across multiple machines would be a complete pain. The iOS devices don’t let you do it for a start, and some machines I’m not allowed to play with, like C’s work laptop.
Enter Pi-Hole which is basically a DNS server that implements Steven’s host lists at a network level. A DNS server is what turns Domain Names like stuart-moore.com into the IP Addresses that the internet actually runs on, which in this case would be 220.127.116.11
Pi-Hole works by stepping in between your hosts and the normal DNS, and if a request is made to a ‘bad’ domain then it will answer the request with the address 0.0.0.0 rather than forwarding it on for the real answer. Your browser will ignore this address and it’ll be like your request just fell into a black hole
It has a very simple, almost automated (almost!), install which will get you up and running pretty quickly. Steven’s various host lists can be selected, and you can add you own or from other sources as well
Installation and Hardware
There’s a lot of good PiHole installation documents out there so I’m not going to repeat them here. The one I used was the official one here is the one I followed
Hardware wise I ordered myself a Raspberry Pi 4 2GB kit from The Pi Hut. Came with almost everything I needed, the only extra bits I added were a heatsink and a Cat 5 cable as shamefully I don’t have any lying around the house now it’s all Wifi
2GB seemed plenty enough to run PiHole and some VPN stuff I wanted to use (more details on that in another post), and it saved a few sheckles.
The install went really easily and soon I have my own DNS server running on a fixed IP address. During the install I’d selected to have all external DNS queries relayed to Cloudflare’s DNS servers (18.104.22.168) via an encrypted channel, so no chance of them being sniffed. An extra layer of security there.
Now to make it really useful I had to get that address out to all of the devices on the network. This is where PlusNet comes in
Configuring it with PlusNet
We have a nice simple router that came with out PlusNet fibre. Years ago I used to have custom firmware and things, but having a bit of an SLA with C we went to something simple and easy that I wasn’t likely to break at an inopportune moment.
Unfortunately the PlusNet router doesn’t give you very much control over the networking settings. To make configuring all the devices (and any new ones) easy I wanted to push the new DNS IP address out via DHCP so they’d pick it up automatically.
While the PlusNet router gives you a few options to alter a very small number of configuration values, the DNS server address isn’t one of them. Turns out this has been a know issue with these PlusNet routers for a while, with a long running thread on the PlusNet forums complaining about it
So that option was out, but…..
The solution to the PlusNet problem
Luckily PiHole comes with a solution to this. It has it’s very own DHCP server built in! So all I had to do was to configure that. It’s very simple to do.
Once you installed Pi-Hole it provides you with a nice Web admin interface to configure it with. Under Settings you can find the DHCP opion:
All you need to provide is the range of IPs you want you devices to be on. Keeping it on the 192.168.1.x range makes it easier with the PlusNet router as they’ll all be on the same subnet.
2 things to note in my screen shot:
A 1 hour lease is really short. I’ve set mine like that while I’m experimenting
The PlusNet router gateway is 192.168.254, use that one. Mine is different due to my VPN setup
You can only have 1 DHCP server on a network or the clients get confused over which one to get their configuaration. Now that we have an alternative to the PlusNet DHCP server we need to disable DHCP on the PlusNet router
To disable it login to you PlusNet router by going to http://192.168.1.254 Click on the ‘Advanced Settings’ button Login using the password from the back of your PlusNet router Continue to Advanced Settings Select ‘Home Network’ Select ‘IP Addresses’
You should now be at a screen that look like this:
Tick No next to ‘DHCP Server Enable’. Then click on apply at the bottom right, confirm you’re sure, and you’re all done.
Your little Raspberry Pi and PiHole are now controlling your DNS and your DHCP settings for everything on your home network.
PiHole so far
I’ve only had mine in for a little less than a week, and we appear to be blocking about 8-10% of DNS queries atm. Which is a solid amount of web queries that we’re not having to transfer over the network, so things are feeling a bit speedier than before.
Also seeing fewer pop ups coming through, and some sites are much more readable without the inserted images and videos getting in the way.
Looking at the load averages on my little Pi, it seems to be coping with all this with it’s feet up. So there’s plenty of room to add some more bits and pieces onto it. I’ve already added an outbound VPN connection for our outgoing traffic and there’ll be more coming along as well
Adding the Pi-Hole was a simple and easy way to block out annoying internet adverts and potential malicious sites for every device on our network. If we had children then I’d definitely be using some of the stricter lists to keep an lid on what they could stumble across.
It would have been simpler if the PlusNet router allowed you to alter a few more settings, but disabling it only takes a few clicks through simple admin interfaces to sort out.
I’m very proud to have been selected to be a Microsoft Data Platform MVP for the 3rd year running. Until you’ve finally ‘signed’ the NDA you’re not sure if you’re going to be re-confirmed or if there’s been an admin mistake.
It’s quite a bit of waiting from the end of March to now to find out if you’re activities have been deemed worthy of renewal. No one’s quite sure what you have to do to become and MVP, which is good because it allows Microsoft to reward different skill sets or pick people who support the community in different ways, which just adds to the suspense as you don’t even know what or how much they’re looking for.
I gave myself a little talk the night before, about how I’d spent the year doing all of the things I normally do in the community and enjoying myself whilst doing them. So if I wasn’t reconfirmed then I’d lost absolutely nothing. My technical skills were going to be exactly the same and the talks I present would be just the same in terms of quality and content.
I think if you’re just doing community events to ‘rack up points’, you’re missing quite a big point. If you’re only doing something to try to become (or keep) being an MVP then it’ll probably be noticed, but not in the way you want to be.
Having run events before and after becoming an MVP I can tell you that it’s not the though of ‘MVP points’ that keep you going when things breakdown, it’s the wanting to provide something fun and interesting for people in your community.
All of the MVPs I know and have met, are doing this because it’s fun. This makes working with them or learning from them fun. Someone who’s just ‘grinding’ away trying to level up will not be as much fun to learn from or cooperate with, and people will notice.
So if you’re a fellow recertified MVP, well done for keeping on doing what you’ve been doing so well. Hopefully we’ll start bumping into each other again once life is less virtual
Or if you fancy becoming an MVP one day, then get into things because they’re fun and things may happen along the way. If you don’t become an MVP, then at least you’ll have had a lot of fun along the way and met a lot of great people who are part of the technical community.
So this year I’m just going to keep on doing what I do each year and fingers crossed, this time in 2021 I’ll be looking at MVP number 4. If not, I’ll still have a grin thinking about everything I did and more knowledge than I do today, so nothing to be upset about there.
No, this isn’t an in depth discussion of various *nix filesystem internals. This is about the old fashioned idea of keeping a written journal on a regular basis.
I’ve been journaling for most of 2020 so far, 158 days of entries so you can work out the miss/hit ratio yourself 🙂
I started on the first of January, wanting to capture what I actually do and achieve over a year. And also as a way to record things like mood, sleep and my running. Not because of any great historical legacy, but just because I wanted to see how things correlated and see if I could improve things
I started off as cheaply and easily as I could, with a £1 reporters notebook and the first pen I could find with ink in it.
The first entries were very perfunctory, and usually just a single side of writing. Picking a random day in February I seem to have been concerned about Project ‘X’ at work, the chance of getting rained on during the next day’s run to the office, and if I needed a smaller notebook for my ToDo list so I was more likely to carry it around.
So, no Earth shattering revelations or Zen like wisdom seeping out 38 days in then.
Getting into it
It really has come into it’s own as we’ve been locked down for COVID. Looking back now I can see we started out personal lock down on 14th Mach (we’re so cool, we started early) as that was the first mention of C’s illness. In that way it’s been good to keep track of that happens when time and life seem to go very slow or almost stop.
Not being in the office, and having to rely on the magic that is Teams to talk anything through it’s not always possible to gauge what’s going on or what the end game is, and to simmering over insignificant issues.
Often, just writing something down is enough to see a solution or take some weight off. I’d quite often find it wasn’t as big an issue as I expected, or that it’s something I need to flag up for a more senior answer. Not having anyone looking over my shoulder I’m freer to write down the first answer that comes to mind, and then to negate that in the next sentence as I spot why it was a stupid idea. Gradually inching my way to a solution
The longer I’ve been journaling, the longer my entries have become and the easier I’ve found to just put down anything that comes to mind. When I started there was a bit of hesitancy that you’re ‘oversharing’, but the realisation creeps in that you’re only sharing what you already know with yourself.
Occasionally I’ll throw in a prompt to make me think about something. Nothing to deep, so I stick to the 3 standards that all Journaling sites recommend:
What are you grateful for today
What do you want to achieve tomorrow
What went well today
I tend to throw these in if the entry seems to be a bit of a downer. Just because I’ve been struggling with a SQL Server bug for hours, it doesn’t mean that the day was a complete write off, so this forces me to think back and look for something good in the day. Or if the day has been a hell storm of despair, I can acknowledge that but try to give myself something to look forward to tomorrow.
It’s also a great way to spend 15-20 minutes concentrating on something that doesn’t involve either your work or home life. You might be writing about them, but the action is completely separate. I tend to finish the kitchen chores in the evening and then spend the time before I head up to bed sitting at the kitchen table writing it out. I find it leaves me calmer and a bit more peaceful as I head upstairs.
You don’t need to handwrite it! I’m sure, like myself, and a lot of people reading this don’t write long hand much outside of meetings. I wanted to improve my handwriting this year, so writing things out in long had to make sure I practiced regularly. I also wanted to make sure I wasn’t distracted by any incoming alerts or the ‘Oh, I could fix that quickly’ thoughts. If you want to just write a daily .txt file in your Journal folder, that’s great as well.
To me the real benefit just comes from the doing, not how you do it. My journal is pretty plain, maybe a horizontal line between entries, but that’s about it, but it works for me. If you prefer some drawing, or more colour a la Bullet Journals then that’s great as well
I don’t beat myself up if I miss a night. Real Life crops up a lot, if we’ve been out then it might too late by the time we get back. Or if I’ve been out for a couple of pints my handwriting goes from bad to unreadable pretty quickly. I’ll try not to miss more than one night on the trot though. With Covid we’ve not been anywhere, but I’d take my journal with me in future.
Like all habits, once you’re on the treadmill and have a routine then it becomes second nature. I look forward to writing each evening now, and even treated myself to my first fountain pen since High School 28 years ago, some quality ink and a nice Lamy notebook
Now I have the routine, I’m not sure if I’d want to stop really. It has been good to flick back to see what’s happening or remind myself that things are changing. And there will always been something to write about
Give it a go
So, go find an old notebook and a biro, or fire up your text editor of choice, and give it a go for a couple of weeks, you might find up you enjoy it and keep on going for a lot longer.
I love my todo lists. Writing down everything I need to do stops me worrying that I’ve forgotten something. I even put on todo items to remind me to strop working and take a break or do some exercise.
But, just like every other IT worker, something new will crop up. Your Boss wants some information on a spreadsheet, an urgent all hands Teams call crops up or that misbehaving app on your machine that takes an hour to fix.
There things pop up and throw your entire day out of whack, and your To Do list looks more and more like a wish list than a plan. After a few days of this it can start to get really disheartening as you fall further and further back from where you want to be. I was finding this more and more as I’ve settled into working from home during lock down
So now my To Do list also has a ‘Done It’ section. Anything that comes in and has to be done that day gets added to the list, but prefixed with ‘+’. This means I have an easy way to see what has come in and bumped things off of my list.
If it doesn’t need to be actioned the same day the it goes onto my planner list for actioning at a more appropriate time in the future
At the end of the day, if I have tasks that I haven’t ticked off on my to do list, then I can quickly see if they were pushed off by too many incoming tasks. I also have a better list of what I HAVE done, not just what I had planned to do that day.
After a couple of days I get a better feel for how much overhead I need to leave each day to cope with these unknown tasks that are going to come in. So I can make an allowance in how much I commit to each day to make sure I get things done. If I’m strike lucky and noting unseen come in, then I just pick more task off of my planning list to fill up the rest of the day.
Remember that your to do list should be a good indication of not only what you want to achieve on the day, but also a realistic list of what you CAN achieve that day. If you’ve listed more to do than you can, they you have a wish list not a to do list.
First off, apologies to everyone who who was following the 31 days of dbatools Backup and Restores, I will finish it off! Just need to get back into a bit more of a stable routine so I’ve got time to finish it off.
Like everyone else my life has had a big change with the current COVID lockdown in the UK. I’m pretty lucky that I can work from home and that my employer wants us to keep working so no risk of redundancy or furloughing (yet 🤞)
Not having a proper home office made the first couple of weeks quite painful, as long days at the kitchen table weren’t kind on my back. But with some luck on eBay, I’ve now got myself somewhere more ergonomic to work.
Getting used to this new way of working, and also having to pick up some new skills (today I’ve been debugging a JS app) has meant working on my setup, planning skills and the software I have available to me. So there’s going to be some posts on those topics coming up over the next couple of weeks as they’re easier to fit in around my current schedules.
Hope everyone else is doing well in lock down, and look forward to when we can start getting back to meeting up at Data Platform events.
dbatools likes to support as many users as we possibly. It’s written for DBAs who may have no choice but to run older versions of software or don’t want to have to constantly upgrade everything even if they could.
This means we support PowerShell from 3.0 upwards, and we try to go back as far as we can with SQL Server. Due to some major changes between SQL Server 7 and SQL Server 2000 (I was there, and it was a big change), we can only go back to SQL Server 2000.
The change from SQL Server 2000 to SQL Server 2005 was another massive jump in terms of the Storage Engine and other components. The one that causes dbatools’ Restore-DbaDatabase the most issues was a change to the way headers were read from backup files. It appears in SQL 2000 some extra headers were added to the write, but not to the read action.
This means that despite the information being in a SQL Server 2000 backup, you can’t read it with a SQL Server 2000 instance.
Also, the SQL Server backup history in msdb have a major rewrite at the same time. Working that into Get-DbaDbBackupHistory would make that function even harder to maintain.
Fear not, we have a solution. Get-DbaBackupInformation can read the information from the SQL Server 2000 backup headers, as long as you use a more modern version of SQL Server to read the files. It doesn’t matter which edition or version you use, so feel free to spin up whichever copy of SQL Sever Express -gt 2005 and follow along:
If you think this is going to be a regular occurence, then to speed things up you might want to have a look at some of the strategies I outlined in Day 16’s post Recreating Backup History to have this information to hand without scanning lots of files
As much as we want to support as many versions transparently as we can, there are some limits. But we still to support those who are left on SQL Server 2000 if we can.
Oops, missed a day yesterday. In my defence repairing our payment system is pretty important 🙂
You may not alway want to restore a whole database. A common piecemeal way of restoring a database is just restoring pages that have become corrupted
What is a page? All data in SQL Server is handled in units called pages. A SQL Server page is an 8kb allocation of disk or memory. Memory? Yes, it make things easier if you use the same size for both as you don’t need to convert or split things up, which will only slow things down.
When I say all data in SQL Server is store in page size chunk, I mean pretty much everything, data, indexes, binary data, the SQL Server internal data defining tables, view, stored procedures and so on. The main thing not store on pages are Transaction Log records, they have their own format
Unfortunately sometimes those pages can become corrupted. This could be through drive failures, controller failues or very rarely a bug in the application.
You may even have some corrupted pages in your databases. If the page isn’t read very often then you might not notice it’s corrupted. But at some point you will.
Don’t think, that just because you can can backup and restore a database that it’s guaranteed corruption free, it may still sneak through.
The only way to be sure is to run regular DBCC checks. Really you want to be running it nightly. Unfortunately, a DBCC check involved reading every page of data, so has an impact on your SQL Server instance’s performance. You can can offload the checks to other instances by restoring the database, but this can make tracking down the root cause of the corruption harder (Brent Ozar with a timely post on this very subject)
So what do you do when you find you’ve got corruption?. You first need to find out when the corruption occured, this is why it’s important to check regularly as you want to know ASAP not weeks after the fact.
We’re going to restore just the corrupted pages from our SQL Server backups using Restore-DbaDatabase. We need to go from the full backup BEFORE the corruptions occured, and then roll the pages forward using the transaction logs. If you catch the corruption early there’s less work for SQL Server to do to recover these, finding out a week after your last full backup can make this a lengthy process.
To find just the pages we need to restore we’re going to use another dbatools command, Get-DbaSuspectPage. This looks at the msdb.dbo.suspect_pages table, which keeps a log of all pages in the SQL Server instance which have been marked as suspect. Keeping an eye on this table can help spot corruption, but unless every page in your database gets read regularly it won’t pick up every bit of potential corrupt.
So, your luck’s run out and you’ve found you’ve a lot of corrupted data pages. Hopefully this is just after a full backup.
First we run Get-DbaSuspectPage to find out how bad it is and which pages are suspect:
I’m using Get-DbaDbBackupHistory to get the BackupHistory to speed things up. You can pass in files if you want to, but it will take a bit longer
There’s a couple of new parameters on the Restore-DbaDatabase command which are only used with this type of restore:
This lets Restore-DbaDatabase know that we’re going to be doing page restores by passing in a list of the pages than need restoring. In this case we’re passing in the output from Get-DbaSuspectPage as it’s the easiest way
As part of the recovery phase for a page restore SQL Server needs to take a tail log backup of the database and then restore it. The folder passed in via this parameter is the location that the tail backup will be put in. So make sure SQL Server can read and write to the location and that you have plenty of space to use.
As soon as the Page restore has completed, take a full backup of your database. Now if anything breaks you can start from a known good point to start from. Definitely start looking at why the corruption may have occured, sometimes it’s just plain bad luck other times it points to issues in applications or hardware, if you can find the culprit and remove it now it’ll be so much better.
SQL Sever database corruption happens, not often, but it can happen so it’s best to be prepared to jump on it as soon as it appears. So CHECKDB regularly and keep an on msdb.dbo.suspect_pages. If you start seeing lots of corruption then dig down to find out what’s causing it.