Sticky labels

Azure Synapse Analytics Dedicated SQL Pools – that’s a whole load of fun right there. Practically limitless compute and data, yumyumyum…

Apart from all the Data Science things that Azure Synapse enables, those lovely people at Microsoft have added a feature that on the face of it looks pretty innocuous but will, I promise you, prove to be very useful down the line. Especially when it comes to tracing who or what did what and how. That feature is called ‘labels’, and it’s available in Dedicated and Serverless SQL Pools (according to the documentation – I’ve only ever used it in Dedicated)

Underwhelmed ? I thought so. Let’s walk through this together…

We’re all familiar with the standard OPTION keyword that’s available across all supported SQL Server versions (on-prem and cloud) that lets you supply a query hint to the database engine (this example yoinked straight off the internet):

USE AdventureWorks2012;
GO
SELECT ProductID, OrderQty, SUM(LineTotal) AS Total
FROM Sales.SalesOrderDetail
WHERE UnitPrice < $5.00
GROUP BY ProductID, OrderQty
ORDER BY ProductID, OrderQty
OPTION (HASH GROUP, FAST 10);
GO

So far so yawn, but for Azure Synapse and PDW, there’s an additional :

SELECT * FROM sys.tables
OPTION (LABEL = 'My Query Label');

Also a bit yawn. But here’s where it gets interesting. You see, adding a label to a query adds that label to a DMV, so you can:

SELECT * FROM sys.dm_pdw_exec_requests r
WHERE r.[label] = 'My Query Label';

Note, this is sys.dm_pdw_exec_requests. It’s not in the normal sys.dm_exec_requests that we all know and love. Here’s hoping they add it to lower versions of SQL Server soon 🙂

This means that (if there’s a will and a coding standard to adhere to) it is possible to label executions and identify them right in the database engine.

For example, you might want to identify each step of an ELT process that’s done in the database so you can identify performance of each step ? Simples – OPTION (LABEL = 'FunkyELT'); or even OPTION (LABEL = 'FunkyELTStep1'); (followed by a separate lable for each step / statement).

Maybe you want to track who’s doing what ? Again, OPTION (LABEL = 'DaveDidThis');

Or maybe refer back to the Jira tickets that brought the statement into being (or tuned it when it turned out to have a sub-optimal construct). Duh – OPTION (LABEL = 'JiraTicketRef');

Tracing them back is now as easy as:

SELECT * FROM sys.dm_pdw_exec_requests r
WHERE r.[label] = 'YourLabelHere';

OK, OK, so it’s not a silver bullet, killer app, or anything – but as someone who’s always interested in what’s going on under the hood, a small tweak that helps you group stuff or uniquely identify a specific workload is *very* welcome !

Back soon…