Musings of a Data professional

Stuart Moore

Month: July 2018

Generating a list of full text word breakers for SQL Server

The post “Generating a list of full text word breakers for SQL Server” was originally published for Leaf Node, but as that site is being wound down I’m republishing it here.

Why do you need a list of full text word breakers? SQL Server Full Text works on breaking text down into fragments we’d normally call words, and then working on those fragments. In English we know what most of those are likely to be, ‘ ‘,’.’,’,’ (space, full stop, comma) and some others. But what about other characters? What happens with a ‘? And what about in other collations? This is why a list of full text word breakers is very useful to have

This isn’t going to be a nice handy list of full text word breakers, but it will show how to quickly get such a list.

The basic principle is going to rely on sys.dm_fts_parser. As described on the linked page this takes a string, and then breaks it down into ‘tokens’ based on the defined language, word breaker, thesaurus and stop list. As an example:

select * FROM sys.dm_fts_parser ('"the quick brown fox jumpsover"', 1033, 0, 0)

The typo is deliberate, for once!. And this returns:
Splitting full text strings into tokens in SQL Server

So it’s returned a row for every token it can find. This example was nice and simple as it only contains words and spaces. The type was to show that without a word breaker SQL Server will just return the string without interpreting it. So that was for spaces, now just to confirm for other characters. First of all commas:

select * FROM sys.dm_fts_parser ('"quick,brown"', 1033, 0, 0)

Splitting full text strings with word breakers in SQL Server
So, that returns 2 rows as expected. How about hyphens?:

select * FROM sys.dm_fts_parser ('"quick-brown"', 1033, 0, 0)

 Interesting, we get 3 rows this time. This makes sense when you think that hyphenation isn’t an exact science so some people will use them, and some won’t. So by combining both ‘sides’ of the hyphenated word SQL Server can hopefully match both uses.

So, that’s the basic theory. So this little piece of T-SQL is going to loop through all the 255 ASCII characters. For each one we’re going to use it to join 2 ‘words’, and then run that string through sys.dm_fts_parser. If the function returns more than 1 row we now that it’s found a word breaker, so we then output the character, and the character code as not all the characters are printable. You’ll also notice that code 34 throws an error, that’s because it’s ” which is a reserved character within full Text searches.

declare @i integer
declare @cnt integer
set @i=0
while @i<255
begin
  set @cnt=0
  select @cnt=COUNT(1) FROM sys.dm_fts_parser ('"word1'+CHAR(@i)+'word2"', 1033, 0, 0)
  if @cnt>1
  begin
    print 'this char - '+char(@i)+' - char('+convert(varchar(3),@i)+') is a word breaker'
  end
  set @i=@i+1
end

Which gives a nice long list:

this char -  - char(1) is a word breaker
this char -  - char(2) is a word breaker
this char -  - char(3) is a word breaker
this char -  - char(4) is a word breaker
this char -  - char(5) is a word breaker
this char -  - char(6) is a word breaker
this char -  - char(7) is a word breaker
this char -  - char(8) is a word breaker
this char - 	 - char(9) is a word breaker
this char - 
 - char(10) is a word breaker
this char -  - char(11) is a word breaker
this char -  - char(12) is a word breaker
this char - 
 - char(13) is a word breaker
this char -  - char(14) is a word breaker
this char -  - char(15) is a word breaker
this char -  - char(16) is a word breaker
this char -  - char(17) is a word breaker
this char -  - char(18) is a word breaker
this char -  - char(19) is a word breaker
this char -  - char(20) is a word breaker
this char -  - char(21) is a word breaker
this char -  - char(22) is a word breaker
this char -  - char(23) is a word breaker
this char -  - char(24) is a word breaker
this char -  - char(25) is a word breaker
this char -  - char(26) is a word breaker
this char -  - char(27) is a word breaker
this char -  - char(28) is a word breaker
this char -  - char(29) is a word breaker
this char -  - char(30) is a word breaker
this char -  - char(31) is a word breaker
this char -   - char(32) is a word breaker
this char - ! - char(33) is a word breaker
Msg 7630, Level 15, State 3, Line 7
Syntax error near 'word2' in the full-text search condition '"word1"word2"'.
this char - # - char(35) is a word breaker
this char - $ - char(36) is a word breaker
this char - % - char(37) is a word breaker
this char - & - char(38) is a word breaker
this char - ( - char(40) is a word breaker
this char - ) - char(41) is a word breaker
this char - * - char(42) is a word breaker
this char - + - char(43) is a word breaker
this char - , - char(44) is a word breaker
this char - - - char(45) is a word breaker
this char - . - char(46) is a word breaker
this char - / - char(47) is a word breaker
this char - : - char(58) is a word breaker
this char - ; - char(59) is a word breaker
this char - < - char(60) is a word breaker
this char - = - char(61) is a word breaker
this char - > - char(62) is a word breaker
this char - ? - char(63) is a word breaker
this char - @ - char(64) is a word breaker
this char - [ - char(91) is a word breaker
this char - \ - char(92) is a word breaker
this char - ] - char(93) is a word breaker
this char - ^ - char(94) is a word breaker
this char - { - char(123) is a word breaker
this char - | - char(124) is a word breaker
this char - } - char(125) is a word breaker
this char - ~ - char(126) is a word breaker
this char -  - char(127) is a word breaker
this char -  - char(129) is a word breaker
this char - ‚ - char(130) is a word breaker
this char - „ - char(132) is a word breaker
this char - … - char(133) is a word breaker
this char - † - char(134) is a word breaker
this char - ‡ - char(135) is a word breaker
this char - ‰ - char(137) is a word breaker
this char - ‹ - char(139) is a word breaker
this char -  - char(141) is a word breaker
this char -  - char(143) is a word breaker
this char -  - char(144) is a word breaker
this char - “ - char(147) is a word breaker
this char - ” - char(148) is a word breaker
this char - • - char(149) is a word breaker
this char - – - char(150) is a word breaker
this char - — - char(151) is a word breaker
this char - › - char(155) is a word breaker
this char -  - char(157) is a word breaker
this char -   - char(160) is a word breaker
this char - ¡ - char(161) is a word breaker
this char - ¦ - char(166) is a word breaker
this char - § - char(167) is a word breaker
this char - « - char(171) is a word breaker
this char - ­ - char(173) is a word breaker
this char - ± - char(177) is a word breaker
this char - ¶ - char(182) is a word breaker
this char - · - char(183) is a word breaker
this char - » - char(187) is a word breaker
this char - ¿ - char(191) is a word breaker
this char - × - char(215) is a word breaker
this char - ÷ - char(247) is a word breaker

As you can see there’s a lot of word breakers in there. And you can generate an even larger set if you use unicode, though this will need to loop through a lot more that 255 records, and you’ll need to use nchar() instead of char().

Get your Azure Subscription ID from the Azure portal

Normally getting an Azure Subscription ID isn’t a problem for me. Normally I’m sat at a machine with the PowerShell ARM modules already loaded or the Azure-Cli a docker spin up away. But not this time, just the raw Azure Portal. Despite it trying to hide them from you it’s quite easy to find the Azure Subscription IDs once you know where to look.

Login to the Azure Portal as normal. Then start typing Subscriptions into the search bar:

Click on the subscription service and a nice list of all your available Azure Subscriptions along with their IDs will pop up:

Just copy and paste the Azure Subscription ID you want and you’re done. Hope that’s useful

Rolling Back SQL Server agent job owner changes

So in this post I showed you a technique for updating the owner on all your SQL Agent jobs.

Whilst this isn’t a major change, you do need to have checked that the new ownership account has the correct permissions or some thing (eg; EXECUTE AS or CMDEXEC steps). With this in mind someone asked me how I’d back this out. Either for all jobs or that one specific MUST NOT FAIL job that is failing.

As we have 2 ways to change them, we have 2 ways to back out the changes. Note that both of these require you to have done a small amount of work up front, there is no magic rollback option 🙁

PowerShell and dbatools

This one is a little simpler as we can use a lot of PowerShell plumbing to do the lifting for us. To set this up we use:

Get-DbaAgentJob -SqlInstance server1 | Select-Object Name, OwnerLoginName | ConvertTo-Csv | Out-File .\server1-JobOwners.csv

If you’re not a PowerShell user I’ll just break this down into steps:

  1. We user dbatools’ `Get-DbaAgentJob` cmdlet to get all the Agent jobs on Server1
  2. We pipe the ouput to `Select-Object`, we select just the job Name and it’s Owner’s login name
  3. The output is then converted to a CSV object
  4. Finally we write the date to a file in the current folder

So we’ve created a nice csv list of the original owners of our jobs, great. Now how do we use those to roll back?
To roll back every job:

Get-Content .\server1-JobOwners.csv | ConvertFrom-Csv | Foreach-Object {Set-DbaAgentJob -SqlInstance Server1 -Job $_.Name -OwnerLoginName $_.OwnerLoginName}

To rollback just one job it’s very similar, we just insert a Where-Object step into the pipeline:

Get-Content .\server1-JobOwners.csv | ConvertFrom-Csv | Where-Object -Name -eq 'My Job' | Foreach-Object {Set-DbaAgentJob -SqlInstance Server1 -Job $_.Name -OwnerLoginName $_.OwnerLoginName}

In both cases we read in our CSV, convert it from CSV into a PowerShell Object. For a single job we use Where-Object to filter down to that one job, you could also use like if you wanted to pattern match. The remain records then get piped through a ForEach-Object loop, where we use Set-DbaAgentJob to reset the jobs owner.

T-SQL

The pre setup for this one involves creating and populating a table to store the old values:

CREATE TABLE JobOwnerHistory(
  Job_Id char(36),
  JobName nvarchar(128),
  OwnerLoginName nvarchar(50)
)

INSERT INTO JobOwnerHistory
  SELECT 
    sj.job_id, 
    sj.JobName,
    sp.name as 'OwnerLoginName' 
  FROM 
    msdb..sysjobs sj 
    INNER JOIN sys.server_principals sp on sj.owner_sid=sp.sid

So now, resetting a job’s owner is just a modification of our original script:

DECLARE @job_id char(36)
DECLARE @OwnerLoginName varchar(50)
DECLARE JobCursor CURSOR
FOR
SELECT 
  job_id,
  OwnerLoginName
FROM
  JobOwnerHistory
--WHERE JobName LIKE '*Finance*'
OPEN JobCursor 
FETCH NEXT FROM JobCursor INTO @job_id, @OwnerLoginName
WHILE (@@FETCH_STATUS <> -1)
BEGIN
exec msdb..sp_update_job
    @job_name = @job_id,
    @owner_login_name = @OwnerLoginName
FETCH NEXT FROM JobCursor INTO @job_id, @OwnerLoginName
END 
CLOSE JobCursor 
DEALLOCATE JobCursor 

As written that will reset every job in JobOwnerHistory, if you want to filter down to a subset of tables you’d uncomment and modify the WHERE line

Hope those examples are helpful. If you’ve stumbled across this page and it doesn’t quite fix your problem, please drop a comment and I’ll try to lend a hand.

T-SQL Tuesday 104 – Code they won’t pry out of my hands

T-SQL TuesdayIt’s the 2nd Tuesday so time for a T-SQL Tuesday post. This month’s host Bert Wagner (b | t) posed the following topic for us:

For this month’s T-SQL Tuesday, I want you to write about code you’ve written that you would hate to live without

Off on slight tangent I’m actually going to write about the life of some code you’d never wrench out of my hands, just to show how learning to scratch a simple itch can lead to learning a lot more and getting involved with larger projects. And mainly because we didn’t do source control properly back in those days so the early code is lost to history (thankfully as it wasn’t pretty!)

About 6 years ago I needed to implement some SQL Server restore testing to keep our corporate auditors happy. I ran around with T-Sql for a while, but trying to find a solution that worked with all of our systems was a nightmare. So I dug out the PowerShell documentation and started reading up on SMO (that page looks a lot nicer than it did back in 2013). Soon I had a working piece of code, that with enough baling twine and candle wax could do what we wanted.

By now I’d got lots of code examples, so decided to turn them into 31 days worth of blog posts – 31 Days of SQL Server Backup and Restores with PowerShell to help others work out where to go and save them running into the same loops and holes I’d done. Also wrote a presentation and took it around the UK to Usergroups, SQL Saturdays and conferences.

Now we move onto the bit where I can start showing off some actual code

Roll on forward to 2016 and I start to update my scripts to get them a little more up to date. In the process I decided to transform them from a ragtag collection of scripts into a shiny PowerShell model, and so SqlAutoRestores came into being. The code in github is very much a work in progress. The basics worked but a lot of supporting stuff to cope with other people’s infrastructure was still needed.

Luckily I was asked to help with the dbatools project around then, mainly with the restore code. And the rest is history

So in 5 years my scrappy little bit of code has moved from this:

import-module "SQLPS" -DisableNameChecking
 
$sqlsvr = New-Object -TypeName  Microsoft.SQLServer.Management.Smo.Server("Server1")
$restore = New-Object -TypeName Microsoft.SqlServer.Management.Smo.Restore
$devicetype = [Microsoft.SqlServer.Management.Smo.DeviceType]::File
 
$backupname = "C:\psbackups\psrestore.bak"
 
$restoredevice = New-Object -TypeName Microsoft.SQLServer.Management.Smo.BackupDeviceItem($backupname,$devicetype)
 
$restore.Database = "psrestore"
$restore.ReplaceDatabase = $True
$restore.Devices.add($restoredevice)
$restore.sqlrestore($sqlsvr)

(Basic PS Restore)

to these:
(linking to source as these are a tad larger than earlier files)
Restore-DbaDatabase (697 lines)
Get-DbaBackupInformation (354 lines)
Select-DbaBackupInformation (172 lines)
Test-DbaBackupInformation (209 lines)
Invoke-DbaAdvancedRestore (367 lines)

All of which have been worked on by many hands. But I now have a restore solution that I use every day in my day job to reset development environments and restore check my backups still, and is in use around the world by DBAs who need a solid reliable restore solution (If you’ve done a migration with dbatools, guess what’s underneath that)

It’s probably my most used single piece of code, and not just because it’s running 24/7/365 testing! The muscle memory is now so ingrained that I can beat almost any other restore option to start.

The point of that story is what this T-SQL Tuesday is about, if you shout about some nifty code, or at least let someone know, there’s a good chance other people will want to use it or help you develop it. It won’t happen overnight, this story is about 5.5 years old and there’s still debates about where this code is going to go in the future. And it’s been great to see where this tiny project has eventually lead me with my career and community involvement over the years.

Granting Azure Roles to Users or groups without a SignInName in Azure AD

Whilst sorting some permissions out on a resource group I stumbled across an issue with assigning them to a group. We have a replicate On Premise and Azure Active Directory solution in place. In the On Premise AD not all groups are Distribution groups so don’t have a ‘group1@contoso.com’ sign in. This causes a problem when trying to assign a role in the usual PowerShell way:

New-AzureRmRoleAssignment -SignInName 'group1@contoso.com' -RoleDefinitionName Contributor -ResourceGroupName MyRg

or

New-AzureRmRoleAssignment -SignInName 'group1' -RoleDefinitionName Contributor -ResourceGroupName MyRg

both return a nice red error message:

'New-AzureRmRoleAssignment : The provided information does not map to an AD object id.
At line:1 char:1
+ New-AzureRmRoleAssignment -SignInName group1@contoso.com -RoleD ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : CloseError: (:) [New-AzureRmRoleAssignment], KeyNotFoundException
    + FullyQualifiedErrorId : Microsoft.Azure.Commands.Resources.NewAzureRoleAssignmentCommand'

 

The workaround for this is to use the ObjectID parameter, and to grab the ObjectId like this:

$objectid = (Get-AzureRmADGroup -DisplayName "Our Developers").Id
New-AzureRmRoleAssignment -ObjectId $objectid -RoleDefinitionName Contributor -ResourceGroupName MyRg

Just another one of those little oddities that crop up when you’ve got a long established infrastructure that doesn’t quite fit the modern templates 😉

Now with added MVP!

Massively honoured that Microsoft have awarded me their Data Platform MVP award for 2018.

Looking at the other 19 UK Data Platform MVPs I’ve joining a pretty elevated set of individuals. Many of them were the people I looked up to and got advice from when I first started dabbling into the UK SQL Server community, so it’s a reflection on how much advice and knowledge there is out there that I’ve gotten this. And that I’ve got some living up to do.

The MVP is a acknowledgement of a person’s engagement with the community. I’ve done lots over the years:

But the really good stuff has been getting to meet people at all the events. There’s so much to be learnt over that quick 10 minute coffee break or at the after event shindig. Those are the conversations that have led me down interesting tech or made me consider speaking, and then lead me into organising as well.

So Thank You to everyone who’s talked to me online or at an event, and if you haven’t then feel free to bend my ear at the next event I’m at.

Powered by WordPress & Theme by Anders Norén