Washing up

I’ve always hated washing dishes.

This comes from early childhood when, on Sundays, my sibling and I would be banished to the kitchen after the Family Dinner to perform the washing up, one parent having toiled for what seemed like the whole day to prepare the meal and the other having done nothing more arduous than read the newspaper and be older.

There are 3 specific things I hate about washing up:

  1. When there are things hidden in other things, like cutlery in a pan that has a lid on.
  2. When someone drops something into the bowl while you’re washing another.
  3. When someone adds items to the queue of things to be washed after foraging elsewhere in the house.

I guess I hate these things because they change the planned task. You know how many pots there are to wash, you plan what order you want to wash things in, and you set about executing that plan. Then you find the hidden stuff that means there’s more to wash than originally planned. And then some inconsiderate git (a sibling / parent) nonchalantly adds to your workload without so much as a ‘by-your-leave’, so your plan is now broken, but you still have to do the pots.

Fast forward some <mumble> years, and I’m a DBA working with transactions in SQL Server. And it seems me and SQL Server have something in common.

Imagine a workload submitted to SQL Server. An execution plan will be generated based on statistics and data *at the time the request was submitted*, and SQL Server will attempt to execute that plan.

Now, imagine that one of the elements in the workload is a table variable. SQL Server will assume that the row count of that table is one (OK, that’s no longer strictly true, but the change is recent and it spoils my illustration). Table variables are the closed lunchbox of the washing up – you only know what’s in there when you have to open it to clean it. If someone else has crammed stuff in there, that will have a knock-on effect on your ability to complete the work as planned. Same for SQL Server – processing more rows than planned can be expensive.

Imagine that the workload has a lookup against a large table with wildly out-of-date statistics. When SQL Server comes to that part, the assumptions made when the execution plan was generated will be wrong and there’s a whole lot more lifting-and-shifting to be done than was originally planned.

Also imagine that the workload batch-processes data from a table that is constantly being added to. At the time the plan is decided on, there may be only 50,000 rows in the table, but while the batches are being fetched, a further 200,000 rows are added (yes, I know it’s a contrived and unlikely example) and those new rows will also need to be processed before the workload can complete.

With SQL Server, there are a number of things you can do to avoid this, though:

  1. Use table variables sparingly, and only when you are sure of the size and content. In my experience anything more than 1,000 rows in a table variable could lead to complications, so I’d use a temporary table instead – there’s a quick sketch after this list. *other experiences are available
  2. Keep your table statistics up to date. Whether you do this via UPDATE STATISTICS or whatever method, I don’t mind, but keep them up to date – your batch processes will thank you later.
  3. Be mindful of transaction management. Make sure you know how much data you’re picking up and how to ignore late-arriving data. Explicit transactions or transaction isolation levels such as RCSI might help you out, but as ever ‘It Depends’.
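
To make these a bit more concrete, here’s a minimal sketch of the first two (with the third as a commented-out footnote) – the table, column, and database names are all made up for illustration:

-- Point 1: a temporary table gets real statistics, so the optimiser can see
-- how much washing up there actually is; a table variable of the same shape
-- would be estimated at a handful of rows.
CREATE TABLE #OrdersToProcess
(
    OrderID   int  NOT NULL PRIMARY KEY,
    OrderDate date NOT NULL
);

INSERT INTO #OrdersToProcess (OrderID, OrderDate)
SELECT OrderID, OrderDate
FROM dbo.Orders                 -- hypothetical source table
WHERE OrderDate >= '20210101';

-- Point 2: keep the statistics on the big lookup table current
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Point 3, for completeness: RCSI is a database-level setting, and 'It Depends'
-- ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;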

With washing up, there is only 1 thing you can do – get a dishwasher.

Back soon…

World Azure Synapse Day ’21

So, while there’s been all this stuff going on, a couple of data friends and I started a conference series and a user group both focused on Azure Synapse Analytics. Because, you know.. you can never have too many online conferences, training sessions, and presentations, right ?

We’ve probably all attended online learning sessions recently, been to meetings online, chatted with friends and family over new and sometimes unfamiliar apps like Teams and Zoom (yes, the nerds have been using them for a while, but collab apps just went *wild* recently).

But we (as in me and the aforementioned data friends) wanted to do something different. Something more…. interactive. More…. conversational. Less “I’ll sign up and download the recordings later, maybe”. And so, somewhat conceitedly, we came up with ‘World Azure Synapse Day 2021’, where we can share, chat, learn, and laugh (maybe) to round out the year. A sort of retrospective on the future, if you like.

Full disclosure: we didn’t come up with the format ourselves. A few months back, a company called DataStax did something similar for the worldwide Apache Cassandra community, with short talks from the engineering team, the sales people, and the marketing folks. We’re aiming for something waaaayyyy less corporate, and more about you 🙂 We just borrowed the good bits and left out the sales pitch.

So we’re doing three sessions on the day, timed to hopefully coincide with people’s availability in each of APAC, EMEA, and AMER time-zones. We hope it will feel a bit more like a meeting than a presentation, as there’ll be several short talks and interviews in each. The sessions will *not* be recorded (sorry, you’re going to have to be there to see it !) – it’s really important to us that each session flows and everyone can participate, and recording that doesn’t seem fair. We’re also not even publishing an agenda, just a schedule of who will be talking at each session – we want the speakers to speak for themselves in every way possible – so there’s no pressure on anyone to ‘get it right first time’, and each and every contribution is welcome.

If you want to attend (and we hope you do), please sign up to the MeetUp for Data Toboggan here. There’s also a code of conduct under the heading ‘Be Excellent To Each Other’ that we’d ask you to adhere to, but that’s all you have to do. If you want to have your voice heard, just submit some thoughts through Sessionize, and we’ll be in contact. And don’t hesitate to ask any question you like in the MeetUp group – we’ll make sure you get a response 🙂

Hope to see you at Data Toboggan – World Azure Synapse Day 2021 ! Until then, stay safe.

The Data Toboggan Team

Month / End ?

We’ve all been there. Month End. What a nightmare ! So much data processing ! So much nervousness – ‘Will the servers hold up ? Is there enough storage ?’ – that senior managers have that kinda hang-dog look on their faces and have things crossed that they didn’t even know could be crossed….

You hope, nay you pray that everything works out OK, that no-one gets called out in the middle of the night because ‘OMG it fell over !’, and that the C-Suite gets all their lovely reports on time.

And when it’s all done, everyone breathes a collective sigh of relief, calms down, and forgets about all the anxiety – until next month end.

……

Batch processing has been The Way ever since there was a need to process large amounts of data. It used to be the reason why cheques took a while to clear, and your bank balance took a day or two to update (there are still no BACS runs on certain days, which is why your bank balance doesn’t change at weekends). Processing data in this way is great for large amounts of data. It’s a well-trodden path to gather up all your data at the end of the month and then work out what happened and how that affected your bottom line.

But maybe, just maybe, there’s a second Way.

Reporting, especially financial reporting, is essentially counting things up and (if you’re really clever) doing some other math on those numbers. Plain and simple – it’s just sums of numbers. Things that have been counted and added up.

Once something has happened, it can’t unhappen. You can’t untake a breath. You can’t uneat breakfast. Sure, there sometimes can be another event that reverses one action (eww), but essentially it remains done. The number of heartbeats you have had increases all the time (until something tragically prevents that happening). Yes, the number of beats per minute fluctuates and the number of beats per day is similarly fluid, but the counter is incremental and the variations are just counts with context. In financial transactions, money leaves an account and arrives in someone else’s. Sometimes (not frequently enough) it’s the other way round. In stocktaking, inventory items are taken and used, and replenished from somewhere else. In business, you gain some customers and you lose some customers.

All of these things cannot un-happen. They are all single, unrelated events. One does not cause the other, rather one triggers an action that causes the other.

So why wait until month end to count them and do the complex math over potentially massive data sets and make everyone feel that month-end anxiety ? Maybe, counting and math-ing each day would be less of a strain (in both compute and mental terms), and then your month-end would be just summing all the counts and making pictures, no ?
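
As a sketch of what I mean (the table names here are entirely made up), the daily job does the heavy lifting in small pieces:

-- Each night, roll up just that day's transactions into a small summary table
INSERT INTO dbo.DailySales (SalesDate, TransactionCount, TotalValue)
SELECT CAST(TransactionDate AS date), COUNT(*), SUM(Amount)
FROM dbo.SalesTransactions
WHERE TransactionDate >= CAST(DATEADD(DAY, -1, GETDATE()) AS date)
  AND TransactionDate <  CAST(GETDATE() AS date)
GROUP BY CAST(TransactionDate AS date);

-- ...and month-end is reduced to summing (at most) 31 tiny rows
SELECT SUM(TransactionCount) AS MonthlyTransactions, SUM(TotalValue) AS MonthlyValue
FROM dbo.DailySales
WHERE SalesDate >= DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1);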

We’re not talking Stream Analytics here, just some sensible use of overnight compute resource. Cloud computing considerations come into play here – it may be more cost-effective to have one massive processing event every month than to do 28+ shorter processing events (given capacity-on-demand, it’s a thought) – but on-premises you’re already paying for the servers to be up (with their fixed compute resources and licence fees), so why not look into using them differently ? Smaller, more frequent batches could save you time, money, and anguish.

Maybe then people would relax about month-end because it’s just maths and not data movement and cleansing that could go wrong. That ‘rinse-and-receive’ process happens every day and is easier to fix if it goes wrong (if you design it to be robust and recoverable).

Who knows, maybe the C-Suite reports would be available sooner ? And that’s a win, right ?

Back soon…

Networking

Gatwick is in Sussex (UK) and is internationally notable for its airport. At one time (and probably still now – I just can’t be bothered to check because I have a blog post to write), it was the busiest single-runway international airport in the world.

Around airports, you traditionally get lots of freight businesses, shabby hotels, and the occasional industrial or technology park. In the late 1990s the environs of Gatwick Airport hosted a curious tenant – the Motorola University – where for one week I found myself learning all sorts of interesting things about TCP/IP, WAN operations, RIP… yeah, it was a blast.

One of the things the class did was to build a worldwide WAN (in a lab, obvs, not in real locations) because Motorola made some rather fun WAN/LAN routers at the time and they presumably wanted to drum up some business by training people how to use the kit their company had just bought without asking them. Hey, it *was* the 90s !

Again with the 90s thing, people (ie TelCos) were moving from boring, old, leased lines that smelled faintly of used engine oil to flaky, variable-latency and yet squeaky-clean and fresh Frame Relay networks, presumably for their own ends. The class was to be taught in 2 sections: the first part of the week was using the trendy Frame Relay stuff, and the latter part using the old, fuddy duddy X25 networking that no-one was using any more bleh eww nasty X25.

This is going to be a longer post than I originally thought. Bear with me, though. I hope the payoff’s worth it…

In the days of BNC ethernet, you just wired up and went. Actually, that was *much* trickier than it sounds, but you get the idea – it’s relatively simple. On the course, the class were assigned a router each (I think I had Singapore) to configure with a name and an IP address from the topology we had designed – all class C addresses for us ! (the nerds among us will understand). We plugged in the wire and… nothing.

After Turning It Off And On Again, it was noticed that no protocol had been attached to the interface ! Oopsie ! How we chuckled at our own ineptitude as we dutifully assigned Frame Relay to the inbound and outbound interfaces and sat smugly back. Our routers could now chat TCP/IP all day !

Actually, no they couldn’t. See, Frame Relay was/is an a**hole. Yes, you can send packets out of an interface, but having a catcher at the other end needs an add-on (an Annex, in Frame Relay terms). So to get the routers catching packets that another router spits out, you need to apply Annex A.

Please note: I forget the actual Annex notations. You can look these up if you want, but they all exist and are required, so I’m just going to go alphabet on this one for now.

So, your routers are now sending and receiving packets over Frame Relay (henceforth referred to as ‘WTF’), but do you have a functioning network ? Can you ping a known IP address on your network ?

No.

WTF ?

That’s a real WTF, not the acronym above, btw.

In life, four things are guaranteed : Life, Death, Taxes, and Lost Data. Guess which has occurred ? That’s right, networks are flaky AF sometimes, so things go missing. We should really apply Annex B to WTF which adds a Frame Check Sequence checksum to each packet. This means that incoming packets can be inspected to see if they’re in the same stream, and rejected if a packet comes in out of sequence.

Hang on, that’s not what we meant – if one packet takes the long way round the network we don’t want to bin the entire stream, so we need to install Annex C to WTF, which caches entries until the next one in the sequence is received and then sends them to the application in order. Phew.

Quick question: Do you think we have a functioning network yet ?

If you answered ‘Yes’, then I remind you – WTF. Sorry – “Frame Relay”.

The reason the answer is ‘No’ is, what happens when a packet really *does* go MIA ? It happens a lot more than you might be told by your network admins. With WTF Annex Whatever-we-applied-last-to-get-the-damn-thing-working, there’s no way to request a missing packet from the router that originally sent it, so we need to apply the retransmission Annex to WTF as well – that way we still have a fighting chance of getting this cat gif to where it needs to be.

At this point, things get a little hazy. It was late Friday lunchtime, and people were starting to think about the long drive home. I can’t remember if there’s a further Annex to WTF to enable a buffer on the sender to create a pool from which retransmissions can be requested, and I’m fairly sure there’s an Annex before Annex A to enable acknowledgments, but we’ve come so far and I think I’ve made my point.

And it’s this:

Before you do *anything*, be that on-prem or off-prem, take a *VERY LONG TIME* to understand what you need from your network, what the access paths and controls will be, and make sure you’ve covered everything on paper before you even go near any actual configuration. That includes before buying servers, designing applications, spinning up IaaS, whatever. You can acquire all the compute you want, but if you can’t connect it to anything, it’s just a line on your OpEx.

If you get your network wrong, at best it’s a pain to put right and at worst it’s a company-killing security breach. If you have to jump through hoops, know what those hoops are, document them, adhere to them, and *never* play fast-and-loose with a firewall.

If you need an indication of the criticality of getting networking done right up front, try provisioning a VM in Azure. Networking is part of the bunch of questions, sure, but before you can do anything, you need a vNIC, and everything else hangs off that.

Thanks for reading. Back soon…..

Oh, and in case you were wondering, assign the X25 protocol to the ports, and it just works (not Annex G, which provides the ability to do X25 over WTF – that’s nuts). Just because it’s old, doesn’t mean it’s rubbish..

Sometimes, you just have to write it down.

A while back, I was looking into a very specific problem with SQL Server Database Mirroring. Yes, I’m aware that Availability Groups are now all the rage. I *did* say it was a while back.

Anyway, I keep getting asked about how database mirroring works, and what with remote working being the order of the day, I thought I’d write it up. Or down. Whatever. This very specific problem was that although the Principal had moved, connections from an application had not.

Let us begin….

Assume we only have one application server. Let’s call that AppServerA.

For this scenario, we have a SQL Server database that is protected from hardware failure by High Safety (Automatic Failover) Database Mirroring.

For that to work, we need 3 servers:

One to act as the principal to be the active database. Let’s call that DBServerA.

One to act as a standby in case DBServerA fails for some reason. Let’s call that DBServerB.

One to act as a witness to the database mirroring session to keep track of which server is acting as the principal, and to arbitrate which server should be the principal in the event of either of the other servers failing. Let’s call that DBServerC.

It is important to remember at this point that AppServerA does *not* know about DBServerC – it has no need to.

In the setup for database mirroring, AppServerA needs a connection string to set up the initial connection to whichever server is the current principal. This is done by adding parameters to the connection string that AppServerA has within its locally stored configuration.

The connection string contains “Server=DBServerA,50505; Failover_Partner=DBServerB,50505”. This tells AppServerA to attempt a connection to DBServerA on TCP port 50505, and in the event of that being unsuccessful, AppServerA should attempt a connection to DBServerB on TCP port 50505. As per best practice, you’ll note that a custom port is used. It’s not massively important what the port number is right now, it’s just one that’s available.
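
If you ever need to confirm which TCP port an instance is actually listening on (so you can put it in a connection string like the one above), a quick sketch – it needs VIEW SERVER STATE permission:

-- Which TCP port(s) is this instance serving connections on ?
SELECT DISTINCT local_tcp_port
FROM sys.dm_exec_connections
WHERE net_transport = 'TCP';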

In this way, the *initial* connection to the database should always be directed to the active principal.

In the scenario, AppServerA successfully connects to DBServerA on the first attempt. What happens then is DBServerA *updates* the information for the “Failover_Partner” that AppServerA originally got from its connection string to whatever DBServerA has recorded in its internal tables. In this case, the partner information is stored as DBServerB\INSTANCE01, not as DBServerB,50505.
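
If you want to see this for yourself, a query along these lines (run on the principal) shows the partner recorded as an instance name rather than as a server,port pair:

SELECT DB_NAME(database_id)       AS [Database],
       mirroring_role_desc,           -- PRINCIPAL or MIRROR
       mirroring_state_desc,
       mirroring_partner_instance,    -- e.g. DBServerB\INSTANCE01 - no port number in sight
       mirroring_witness_name
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;     -- only databases that are actually mirrored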

At this point, this is of no consequence, as DBServerA is up and functioning correctly. If DBServerA fails and SQLServer on that server is not contactable, AppServerA drops all its existing connections and attempts new connections using its connection string to use DBServerB as the principal. When DBServerA is available again, it assumes the role of the standby (mirror) in case DBServerB fails.

The application on AppServerA is using connection pooling. Connection pooling saves AppServerA resources (CPU and memory) by re-using a connection to the database if one already exists – the initial connection is *not* dropped and recreated, it is simply reset by the next application call that needs a connection to the database. There is a timeout value for idle connections in the connection pool that will drop unneeded connections. At this point, the original connection information has come from DBServerA and has DBServerB\INSTANCE01 as the failover partner.

Imagine that for some reason, there was an interruption in communication between DBServerA and DBServerB. There will be an event recorded in DBServerB’s logs that indicates that the connection between the two servers had been down for 10 seconds or more. In this situation, DBServerB contacts the witness (DBServerC) to confirm that DBServerC can contact DBServerA.

If the witness confirms that DBServerA is online and contactable, DBServerB remains as the standby and data that needs to be transferred from DBServerA will be queued on DBServerA until the link between it and DBServerB is restored.

If the witness confirms that DBServerA is indeed unavailable, DBServerB assumes the role of principal and data that needs to be sent to the standby is queued until DBServerA comes back online. When DBServerA comes back, it assumes the role of standby until such time as DBServerB becomes unavailable.

The specific problem that I was asked to look into was that although DBServerB became the principal (as designed), connection pooling on AppServerA meant that the connections to DBServerA were not dropped, merely recycled. Connectivity between AppServerA and DBServerA was *not* interrupted in the same way as between DBServerA and DBServerB. It is not known at this point what the threshold for idle connections to be dropped from AppServerA’s connection pool is, but it is assumed to be longer than the 10 seconds that it took for DBServerB to assume the principal role. Did I mention that AppServerA is super-massively chatty ? No ? Well, it is, because Developers.

This is where the distinction between ‘DBServerB,50505’ and ‘DBServerB\INSTANCE01’ becomes important. The former is a direct request to a specific TCP port on DBServerB, the latter is a request for DBServerB to provide the TCP port number for the instance named INSTANCE01. To serve this request, AppServerA issues a request to the SQL Browser service on DBServerB to do the instance-name-to-port-number translation.

The SQL Browser service (in *all* installations of SQL Server, not just those for database mirroring setups) listens on UDP port 1434 for requests.

On many firewalls, port 1434 (all protocols) is not open, so any request to UDP 1434 from AppServerA would get blocked by the firewall and not succeed.

So, given that the initial successful connection to DBServerA replaced the partner information of ‘DBServerB,50505’ with ‘DBServerB\INSTANCE01’, any failure of connection from that point on would result in AppServerA attempting to connect to ‘DBServerB\INSTANCE01’ in the event of DBServerA being unavailable.

This attempt would not succeed because of the firewall rules.

Even when DBServerA becomes available again, it is now functioning as the standby and so cannot serve data requests. If the connection is reused from the connection pool, AppServerA will still be trying to contact DBServerB on UDP port 1434 for the instance-to-port information. Only when DBServerA becomes the principal again will connections from AppServerA succeed – assuming those connections are coming from an application pool that has not been restarted or otherwise cleared.

There are several ways to recover from this condition. You can bounce DBServerA, bounce the SQL Server instance on DBServerA, or recycle the application pool on AppServerA.

I favour the latter, because there’s NOTHING WRONG with any of the SQL Servers.

And for once, it’s *not* DNS…

Absolutely stunning

Virtual Machines. VMs. Guests. Hypervisees.

Everything’s gone virtual, and we *love* it. We get unlimited power over unlimited compute resources, and in IaaS/MPC a nice familiar GUI so all the Windows SysAdmins feel like they’re contributing.

I’ve never been much of a fan of running SQL Server on virtual machines from waaay back – it used to be touch and go whether your server would stay up, and it wasn’t really clear whose fault it was when it went down. Thankfully, things have moved on: virtualisation is a mature discipline, software is now built to run anywhere, and the location and physical existence of a server are not things we need to worry about right now.

So, assuming we want to do things the hard way and set up a managed private cloud on-prem so the Windows SysAdmins don’t have to worry about networking and firewalls, let’s create an estate of VMWare hypervisors and run *lots* of Windows guest VMs on them. Super. All tickety-boo.

We’re going to want backups with that so let’s leverage all that VMWare goodness, like snapshotting disks once they’ve been quiesced so we get a consistent backup. Hell, let’s live life on the edge and NOT quiesce the OS drives – you know, for fun !

We now have Windows servers (don’t care where or how) and we’re going to put SQLServer on some of them. Don’t worry, it’ll be fine. Trust me. We’ve moved on. Tell you what, we won’t back up the disks your MDFs, LDFs and NDFs are on. Happy now ? Good.

You might very well think that all is well in the world. Your servers are serving, your users are using, and your apps are…. apping ? Whatever. And it’s all being backed up and the DR people are happy.

You are fooling yourself.

Let’s take a closer look at what VMWare actually does when it takes a snapshot.

To get a consistent backup while the OS is running, VMWare uses the Windows Volume Shadow Copy service to create the snapshot. IO requests are then queued somewhere (I’m not *that* into the nuts and bolts of this) while the backup is taken, and then the snapshot is removed.

It is in this removal phase that the queue of stuff that’s waiting to be written to the VM is applied WHILE THE MACHINE IS STILL RUNNING. And what does VMWare do to make sure your hyperactive VM takes a breather to allow those writes ? It ‘stuns’ the machine – the digital equivalent of giving your server a quick slap in the face and telling it to calm the F down. It may need to stun the server several times to complete the outstanding writes.

Don’t worry, though. All of this stunning is for very short periods, and there are checks to make sure it won’t have an impact, right ?

Again, you are fooling yourself.

What VMWare does *before* it starts to stun a server is estimate how long it will take to apply the pending updates. If it thinks that will take less than 15 seconds, it will go ahead (note: It normally takes milliseconds). If it thinks the operation will take more than 15 seconds, it waits for 10 seconds and then has another go.

VMWare will do this estimation process 9 times. On the 10th, it just gives up and gives your server a really hard slap (however long it may take to apply the changes) so that it can complete the job and move on. Your server will be stunned for however long it takes. End of. You can read more about it here. The example quoted is 403 seconds, or a little over 6.5 minutes in old money.

As you might imagine, as a SQLServer DBA with high-transaction applications, I’m not 100% happy with my precious being slapped about or otherwise Tasered just because someone went and spent all that time and effort doing an on-prem MPC.

VMWare themselves have admitted that if you’re running busy SQL Servers on their software, you probably shouldn’t use their backup process and use an agent-based solution on the guest instead (Section 3.8.2.5 refers – and yes, the numbering sequence in the document is truly messed up).

Virtual Machines. VMs. Guests. Hypervisees.

Be careful what you wish for….

On withdrawing my session from SQLBits

This morning, I shared this on Twitter.

This post is some more information…. I’m not sure it is information, it’s more about feelings, anxiety, etc. You know, standard 2020 stuff.

The response from the #SQLFamily has been overwhelming, and affirms my long held belief that SQL Server people are among the very best people in the world.

[apologies for the next bit – it’s taken from messages I’ve already sent privately but I’d like to share the text here and expand a bit]

It’s not been the best of years to be honest, and on a personal note 2019 ended badly with a bereavement followed by another in the early part of this year. I’ve been a bit adrift work-wise for a couple of months, and the ever-changing times we find ourselves in meant that I was lurching from one thing I couldn’t control to another thing I had even less chance of dealing with.

It’s weird, all this stuff. Lockdowns, home schooling, home working, and international issues have forced me out of my bubble and put many terrible things front and center. I thought I was doing OK. Sure, there were good days and bad days, but on balance, things weren’t too bad.

I didn’t realise how much it had all got to me until I started to prep for recording my session and I just couldn’t do it. Like, total freeze. I couldn’t even sit at the keyboard and read what I’d done previously. I felt that ‘No one wants to see this. What *on earth* were you thinking??’. I guess part of the anxiety was seeing all these great people lining up great content, and I just *knew* I couldn’t live up to that. I’m not sure I could have given SQLBits the content they deserve.

Another large part of that was that with other things going on in the world, I couldn’t help but feel, well, helpless. So, although my heart was very much in it, my head just wouldn’t put all this useless noise to one side and then it kept telling me that I couldn’t do it even if I tried every time I even thought about it.

It’s not like I’m broken or anything, but I needed to make one change so I can make others. I’m really sorry it had to be SQLBits, and I feel a bit like I’ve let you all down. Trust me, I’ll get over it and I’ll be back presenting as soon as it’s right.

Admitting this on Twitter is the best thing that happened this year. Well, maybe not the actual best, but it’s in the top two. Top three, definitely. After eating the frog this morning and actually formally withdrawing, I’ve admitted to myself that I can’t deal with it right now and it turns out that is OK. It’s also removed that voice that tells me I can’t do other stuff either. This is a Good Thing, because now I can get back to doing what I love.

So, thank you for reading, thank you for having my back, and most importantly thank you for being such a supportive part of my life. I hope to pay that back in the future.

 

How to **** everything up

I originally wrote this back in September 2019 on another blog platform, but thought it worth reproducing here because 2020…

So, apropos of a complete breakdown in service, here’s a few handy tips and action points to make sure your legacy lives on in the most psychologically painful way possible.

They also have the useful side effect of demoralising your team so much that they actively consider leaving a good job and a great employer because they’re unchallenged, unvalued, and unheard.

  1. When choosing a new technology for your product or service, actively consider how badly you messed it up last time, and seek at all times to replicate those ‘winning’ strategies.
  2. Consider how the new technology can be bent to your will.
  3. How will you hobble performance ?
  4. How will you make your new design less resilient ?
  5. What’s the lowest level of function that you can provide and could you push that bar even lower ?
  6. When putting together your team, get the brightest minds and force them into scoping meetings that last for weeks but don’t actually have an output.
  7. Getting money to do things is hard – always try to foster a ‘who will pay for this?’ mindset.
  8. Embrace new ways of working, but only as far as they support and perpetuate your existing business practices – no one likes change.
  9. If change *must* happen, make sure it’s delayed for as long as possible, but announce delays in small increments (or better still, not at all).
  10. If your customer is important, make sure that everything is process-centric – consistent customer service is better than good customer service.
  11. You’re a person outside of work – when you enter the workplace, don’t let on that you’re a functioning member of society.
  12. Never ask a direct question – words mean far more than actions.
  13. If buying a product off-the-shelf, make sure that customisations you implement mean that it can never be upgraded – you made it better, so that’s how it’ll stay. This is especially important if considering SaaS.
  14. Ask your user community what features of technology they’d like to see, then implement policies or processes that mean those goals can never be reached.
  15. In the event of your solution being accidentally viable, create management and authorisation frameworks so labyrinthine that people give up trying to request enhancements or alterations.
  16. Remember, even a delivery of 10% of the promised functions equates to a 100% successful delivery.
  17. Celebrate success with members of staff who had little or no input and who in all likelihood were not even aware of your project.
  18. If you’re making progress, claim that one in front of senior management – your team is all behind you.
  19. If you’re not making progress, micromanage. Two-hourly status calls are a great way to focus the team on solving problems and removing time-sapping blockers.
  20. When preparing presentations, consider your audience. Put as many words on a slide as you can, and ensure each slide contains information on at least three unrelated topics.
  21. When presenting, always stand behind your screen. Make sure to say ‘I’m not just going to read from the slide’ and then read from the slide.
  22. Block-book 15-seater meeting rooms well in advance, then use them for 1-to-1’s every other week.
  23. Bookend each workday with all-team meetings with titles like ‘DIARY-BUSTER’ and ‘ALL MUST ATTEND’.
  24. Send those awkward ‘bomb-shell’ emails out of hours. Copy in as many people as you think might make it look important.
  25. Above all, remember there is no ‘I’ in ‘Team’. There is an ‘I’ in failure, but that’s not an option.

 

This is by no means an exhaustive list, but hopefully they can help you be a better version of you if you’re into that kind of thing. Or you could just have read it, recognised some behaviours, smiled nervously, nodded sagely and moved on.

The decision, as they say, is YOURS !

Hunting high and log

Sometimes, you just lose stuff. There. I said it.

Well, maybe not ‘lose’ as such, but database files can be put anywhere by anyone with permissions, and that’s not always a good thing. Keeping everything nice and Kentucky is a DBA trait, so how do you track down where things are ?

You can use WMI (oof), PowerShell (oofoof) or maybe even SQLServer DMVs (tell me more).

I’ve never really got to grips with PowerShell – it’s on my list, for sure – but there are some occasions where PowerShell might not work for you, for example in restricted environments where you don’t have access to the OS directly. Here’s a way of getting database and mount point space information if you only have ‘public’ access to master…

DMVs are one of my favourite things ever. There’s just so many of them these days and we’re getting more all the time. My *other* favourite thing is CROSS APPLY – also known as the ‘fairy join’ – because I don’t know what it does and yet I can use it to get information that might be hard to join up otherwise !

Let’s put these two to work together to get a list of databases files, where they are, and how much space there is free on the disk where they reside:

--All database mdf mount points and disk free space in MB
SELECT DISTINCT
       db_Name(f.database_id)          AS [Database],
       volume_mount_point,
       (total_bytes / 1024) / 1024     AS [Total MB],
       (available_bytes / 1024) / 1024 AS [Available MB]
FROM sys.master_files AS f
CROSS APPLY sys.dm_os_volume_stats(f.database_id, f.file_id)  -- the 'fairy join' in action
WHERE f.file_id = 1;  -- 1 = the primary data file, the MDF

 

Database volume_mount_point Total MB Available MB
tempdb H:\SQLTemp\ 557733 545029
MyDB1 H:\SQLData\ 12046999 346111
MyTestDatabase H:\SqlData\ 12046999 346111
OtherDB1 H:\SQLLogs\ 334638 142694

 

You’ll see here that I’ve taken the easy route and used file_id = 1. That’s the MDF – the primary database file – and you *must* have one of those. Oh, and I can see that someone created a database with its MDF on a mount point intended for transaction logs. I’ll follow that one up later…

A small tweak, and you’ve got the same information for log files (file_id = 2, the LDF):
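
In case you want the tweak spelling out, only the file_id filter changes – something like:

--All database ldf mount points and disk free space in MB
SELECT DISTINCT
       db_Name(f.database_id)          AS [Database],
       volume_mount_point,
       (total_bytes / 1024) / 1024     AS [Total MB],
       (available_bytes / 1024) / 1024 AS [Available MB]
FROM sys.master_files AS f
CROSS APPLY sys.dm_os_volume_stats(f.database_id, f.file_id)
WHERE f.file_id = 2;  -- 2 = the first (and usually only) log file, the LDF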

 

Database volume_mount_point Total MB Available MB
tempdb H:\SQLLogs\ 334638 142694
MyDB1 H:\SQLLogs\ 334638 142694
MyTestDatabase H:\SQLLogs\ 334638 142694
OtherDB1 H:\SQLLogs\ 334638 142694

 

That’s better – everything’s lined up nicely and where it should be, except maybe tempdb’s transaction log, which appears not to be on a dedicated disk. Another item for the backlog…

You might be wondering why I’ve used the DISTINCT keyword in the script. That’s because if you have multiple NDFs (secondary database files) you only really want the one entry back. I haven’t included a script for NDFs (file_id > 2) because I’m not in control of where *you* put them. Besides, giving you everything takes some of the fun out of it, don’t you think ?

 

Embracing chaos

A choice to be made

A long long time ago, in a city far far away, I made a decision that kinda changed my life in a big way.

Being slightly weird about stuff and wanting to be free of the tyranny of choice, I threw out *all* my socks, and bought 12 pairs of identical black ones.

I was just at one of those stages of life where I thought that I was spending entirely too much time pairing socks after washing them. So, replacing all of them seemed like a good idea. “Any colour, as long as it’s black”, right ? And black goes with pretty much anything, right ? To this day, I only have 2 colours of socks (black and grey), and they are of different styles, so I can’t get them mixed up.

This only kinda changed my life. There was another incident a few years back that changed my perspective quite dramatically.

An epiphany of sorts

It was on a training course – 5 days of Sharepoint 2013 installation and configuration – and, as you might imagine, we tended to have long tangential discussions to relieve the tension and create a human bond in the classroom.

The instructor asked if anyone had any ridiculous rules to live by. I mentioned the sock thing, naturally…

And this is where the game levelled up dramatically. The instructor revealed that *he* had the *same* inclination when it came to socks, but his approach to finding a solution was so different, so radical, that it changed my life.

His theory was that, even with careful planning, socks wear out and need replacing, thereby introducing variation in foot attire that was almost, but not exactly, the same as the original problem – socks that don’t match and need to.

So he bought his socks online, in packs of three, that were *guaranteed* not to match. He was happier for bringing more colour into his life, and was still freed from the tyranny of choice – socks ain’t gonna match, so why waste time worrying about them ?

Life-changing times

Once you accept that there’s always going to be some variation in a given scenario (and also accepting that any variation has the capacity to drive you insane), perhaps embracing chaos is the way to go. It’s quite liberating to realise that there are certain elements you can’t control – and that you can still deal with them by consciously putting them in a ‘no control’ bucket. Let me contextualise that a bit more….

We’re rolling out Microsoft Teams to update our collaboration and promote new ways of working. Nothing to do with Skype For Business being sunsetted, no, nothing to do with that at all…

One question that comes up fairly frequently in the forums is the one about being able to delete chats – something that at time of writing we’re not able to do. We can delete individual comments, but the chat lives on. It just disappears off the ‘recents’ list.

If you’re the sort of person who likes a tidy mailbox and keeps everything in apple-pie order, I can see that this might cause some people ‘issues’ in the same way as socks did for me.

However, the persistence of chats and messages in a searchable log is a Very Good Thing. Hands up if you’ve ever searched emails or other electronic conversations for a half-remembered detail or needed to prove how a decision was made ? If the conversation had been deleted, there would be nothing to go search in and personally I’d spend quite a bit of time wondering if I’d just imagined something…

The fact that I can’t delete chats or messages in Teams is therefore in one of those ‘no control’ buckets – I accept that I can’t change it, but I know I can leverage the features that a lack of control actually brings by being able to search the archive. And that has to be a win.

What’s next ?

To be honest, I don’t know what’s next. There’s a lot of choice in the data world at the moment. So much choice that to some it’s daunting, to others completely overwhelming. Have a read of this blog, and tell me that the infographic doesn’t trigger at least *some* anxiety – how are we supposed to make architectural sense of that ?

Not sure, but I’ll be embracing that ‘chaos’ and looking for what it can deliver for me, my colleagues, and our services.

Thanks for reading.