08 November 2012

From Sys-Admin to DBA

With the acceptance of a new position as a DBA at a new company, I shall be bidding adieu to systems administration in a couple weeks and focusing primarily on my beloved SQL Server. Goodbye, sysadminning, I hardly knew ye...or rather, I knew every bloody inch of ye, from yer head to the blistering pustules on yer arse, as they say. Still, my experience of the past 10 years has been broad and will continue to be quite useful I am sure, so I'm thankful for my time as "the server monkey".

My transition from small business "accidental DBA" to enterprise DBA will of course be fraught with many lessons; let's hope an absolute minimum of them come through trial and error, and that those few stay firmly in test environments, not production. But I'm keen to get started; for now, time to do that happy, laborious task of documenting everything here for my departure.

The PASS Summit is well underway in Seattle at present; hopefully I'll manage to get out there next year. "Summer camp for DBAs", indeed!

24 October 2012

Becoming Slightly More Normal

Today's idea for the blog concerns inching towards normalcy...that is, the first three normal forms of database design. I always get a little fuzzy in the head when I think about the precise official definitions of the three, and examples are generally my best way of comprehending normalization.

So let's start out with a sample database. As this is my blog, and the apparently absent readership has put up no protestations, I shall choose Hindustani classical music as my subject: a database of ragas.

Raga
Name          | Thaat   | Mode    | Performance1       | Performance2
Yaman         | Kalyan  | Lydian  | Ustad Imrat Khan   | Hariprasad Chaurasia
Darbari       | Asavari | Aeolian | Ustad Vilayat Khan | Dagar Brothers
Chandni Kedar | Kalyan  | Lydian  | Ustad Vilayat Khan | Ustad Amir Khan

So we can see from this that we've got a record for each raga, listing the thaat each is classified under, the Western mode that thaat is equivalent to, and a couple of noteworthy example performances. The main problem here, the one keeping us out of first normal form (1NF), is the repeating group of performance columns: we'd be stuck if there were more or fewer than two performances to record for a raga, for instance. So to reach first normal form, we collapse the performance columns into a single column, since they store essentially the same kind of information, and give each performance its own row.

Raga
Name          | Thaat   | Mode    | Performance
Yaman         | Kalyan  | Lydian  | Ustad Imrat Khan
Yaman         | Kalyan  | Lydian  | Hariprasad Chaurasia
Darbari       | Asavari | Aeolian | Ustad Vilayat Khan
Darbari       | Asavari | Aeolian | Dagar Brothers
Chandni Kedar | Kalyan  | Lydian  | Ustad Vilayat Khan
Chandni Kedar | Kalyan  | Lydian  | Ustad Amir Khan

Alright, so now we have it in first normal form. However, we've doubled our row count, and we're storing a lot of duplicate information in the table. Since the relationship between a raga and a performance is one-to-many, we split the performances out into their own table to achieve second normal form (2NF). A foreign key (between Raga.Name and Performance.Raga) connects the data across the tables.
Raga
Name          | Thaat   | Mode
Yaman         | Kalyan  | Lydian
Darbari       | Asavari | Aeolian
Chandni Kedar | Kalyan  | Lydian


Performance
Raga          | Performer
Yaman         | Ustad Imrat Khan
Yaman         | Hariprasad Chaurasia
Darbari       | Ustad Vilayat Khan
Darbari       | Dagar Brothers
Chandni Kedar | Ustad Vilayat Khan
Chandni Kedar | Ustad Amir Khan

Having achieved second normal form (2NF), we note one more point of redundancy. Columns that depend on another non-key column, rather than directly on the primary key (in our case, the raga itself), must be "offshored". Note that Mode is derived entirely from Thaat; it isn't really an attribute of the raga, it's a further description of the thaat. So, one more time:
Raga
Name          | Thaat
Yaman         | Kalyan
Darbari       | Asavari
Chandni Kedar | Kalyan


Performance
Raga          | Performer
Yaman         | Ustad Imrat Khan
Yaman         | Hariprasad Chaurasia
Darbari       | Ustad Vilayat Khan
Darbari       | Dagar Brothers
Chandni Kedar | Ustad Vilayat Khan
Chandni Kedar | Ustad Amir Khan

Thaat
Name    | Mode
Kalyan  | Lydian
Asavari | Aeolian
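
Translated into T-SQL, the 3NF design above might look something like this (a minimal sketch; the data types and key choices are my own assumptions):

-- Thaat: each thaat maps to one Western mode, stored exactly once
CREATE TABLE dbo.Thaat (
    Name varchar(50) NOT NULL PRIMARY KEY,
    Mode varchar(50) NOT NULL
);

-- Raga: each raga belongs to one thaat
CREATE TABLE dbo.Raga (
    Name  varchar(50) NOT NULL PRIMARY KEY,
    Thaat varchar(50) NOT NULL REFERENCES dbo.Thaat (Name)
);

-- Performance: many performances per raga
CREATE TABLE dbo.Performance (
    Raga      varchar(50)  NOT NULL REFERENCES dbo.Raga (Name),
    Performer varchar(100) NOT NULL
);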

Hey presto, third normal form (3NF) in all its grandeur. It's a very clean and proper way to design databases, but working in the real world, I think it's worth remembering something stated by Karen Lopez at the KC SQL User Group last August: "Normalization is a description, not an objective" (paraphrased). There are times when third normal form is ideal, times when denormalized data is the way to go. The classic IT/DBA mantra applies..."It Depends".

23 October 2012

Locks, Blocks, and Deadlocks

The concepts of locking, blocking, and deadlocking are simple on the surface, with quite a bit of complexity lurking beneath in the details. But as a parent of two small children, I see these concepts modelled on a regular basis.

Locking:
First, the query requests a lock on a resource. So my son grabs a toy to play with. He has acquired a lock on that toy, a marker of sorts saying "I'm playing with this right now". I won't painfully try to thrash out the details of isolation levels and shared vs. exclusive locks, as that would cause an already taxed analogy to implode.

Blocking:
Then my daughter looks at her brother's toy and thinks, I'd like to play with that toy.  Too bad...a lock has been issued, and your request to play with that toy is denied.  Blocking may end if my son lays the toy aside, thus releasing the lock on the toy.

Deadlocking:
While my son still holds a "lock" on the toy, blocking my daughter's request for it, my daughter picks up another toy, acquiring a lock on that item. Then my son decides he wants that toy, too, and requests it. We now have both kids each holding a lock on a toy, blocking the other's request for it, and wanting the other's toy. Neither will relinquish the toy they have until they get the other toy and can finish playing. Deadlocking is this sort of stalemate in the database/playroom, where two queries get trapped in an ill-fated interdependency and the database engine/parents have to step in and choose a "deadlock victim" whose query is cancelled, so that at least one of the two can be satisfied.
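
For those who'd rather see this outside the playroom, here's a minimal two-session T-SQL sketch (the table and names are entirely my own invention) that should reliably produce a deadlock:

-- Setup: a throwaway table with two "toys"
CREATE TABLE dbo.Toy (ToyID int PRIMARY KEY, HeldBy varchar(20) NULL);
INSERT INTO dbo.Toy (ToyID, HeldBy) VALUES (1, NULL), (2, NULL);

-- Session 1 (the son):
BEGIN TRAN;
UPDATE dbo.Toy SET HeldBy = 'Son' WHERE ToyID = 1;      -- locks toy 1
-- ...pause here, run session 2's first UPDATE, then continue...
UPDATE dbo.Toy SET HeldBy = 'Son' WHERE ToyID = 2;      -- blocked: toy 2 is taken

-- Session 2 (the daughter), in a second query window:
BEGIN TRAN;
UPDATE dbo.Toy SET HeldBy = 'Daughter' WHERE ToyID = 2; -- locks toy 2
UPDATE dbo.Toy SET HeldBy = 'Daughter' WHERE ToyID = 1; -- deadlock!

-- Within seconds the engine picks a deadlock victim, raises error 1205,
-- and rolls that session's transaction back so the other can finish.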

At this rate, if I keep on torturing analogies like this I shall be receiving letters from Amnesty International.

19 October 2012

Windowing Functions Brainbender

An offhand tweet from Adam Machanic of sp_WhoIsActive fame had me toying with a seemingly simple T-SQL problem:

Imagine you have a table of sales orders. The only fields we'll concern ourselves with are OrderDate and OrderID. OrderDate obviously can have duplicate values, while OrderID should be unique. Now imagine you want to query the table's records with an additional field showing the next date on which a sales order exists. For instance, if you have five orders that took place on September 1st, and the next orders in the system were on September 5th, then the NextDate value for all five of those September 1st orders would be September 5th.
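
To have something concrete to test against, here's a throwaway table and data of my own devising (the original problem arrived with no schema attached):

CREATE TABLE dbo.SalesOrder (OrderID int PRIMARY KEY, OrderDate date NOT NULL);
INSERT INTO dbo.SalesOrder (OrderID, OrderDate) VALUES
 (1, '2012-09-01'), (2, '2012-09-01'), (3, '2012-09-01'),
 (4, '2012-09-01'), (5, '2012-09-01'),
 (6, '2012-09-05'), (7, '2012-09-05'),
 (8, '2012-09-07');
-- Expected output: orders 1-5 get a NextDate of 2012-09-05,
-- 6 and 7 get 2012-09-07, and 8 gets NULL (no later date exists).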

My first (successful) attempt at this looked like this:

SELECT
    SalesOrder.OrderID,
    SalesOrder.OrderDate,
    datequery.NextDate
FROM SalesOrder
INNER JOIN
    (
    -- Pair each distinct date with the date that follows it
    SELECT a.OrdDate,
        LEAD(a.OrdDate) OVER (ORDER BY a.OrdDate) AS NextDate
    FROM (SELECT DISTINCT OrderDate AS OrdDate FROM SalesOrder) AS a
    ) AS datequery
    ON datequery.OrdDate = SalesOrder.OrderDate

Essentially what is happening with this query is that I have a two-level nested query. Working from the inside out: first I select all the distinct OrderDate values into a set, removing the duplicates; then I use the LEAD() windowing function to find the next value from that set for each date; then finally I join that back to a general query of the table, to get the NextDate value for any given row. It works...but it's perhaps not very efficient. So I tried a second approach:

SELECT
 a.OrderID,
 a.OrderDate,
 (
 SELECT (MIN(b.OrderDate))
 FROM SalesOrder AS b
 WHERE b.OrderDate > a.OrderDate
 ) AS NextDate
FROM SalesOrder AS a

In this version, I found a much simpler way of producing the intended results without using windowing functions at all. NextDate comes from a correlated subquery selecting the minimum OrderDate that is greater than the current row's OrderDate.

Lastly, one more version:

WITH offsetCTE (OrderID, offset) AS
    (
    -- Reverse row count within each date group: the last order of a date gets 1
    SELECT c.OrderID,
        ROW_NUMBER() OVER (PARTITION BY c.OrderDate ORDER BY c.OrderID DESC) AS offset
    FROM SalesOrder AS c
    )
SELECT
    a.OrderID,
    a.OrderDate,
    -- Jump ahead exactly far enough to land on the first row of the next date
    LEAD(a.OrderDate, offsetCTE.offset) OVER (ORDER BY a.OrderDate, a.OrderID) AS NextDate
FROM SalesOrder AS a
INNER JOIN offsetCTE ON a.OrderID = offsetCTE.OrderID
ORDER BY a.OrderDate, a.OrderID ASC


In this case, I create a common table expression designed to calculate the number of rows from one row to the next row that contains a different date value.  Using the ROW_NUMBER() function, and partitioning by date (ordering by OrderID descending), I get a reverse row count number for each group of orders per date.  Then I use that CTE in a query where I use the LEAD() function again, but this time, take the offset value I calculated to be the number of rows to skip ahead, the optional second parameter of LEAD(). 

More complicated, admittedly, and with all the permutations I've tried so far, I'm not sure which of these would qualify as the most efficient solution. A fun exercise regardless, and my thanks to Adam for offering some guidance and prompting as I noodled through it.

11 October 2012

The Browser Service

Yesterday a developer friend (of whom Steve Ballmer thinks so highly) was bemoaning his company's recent loss of their Server Monkey, because his app was all of a sudden unable to connect to their SQL Server, but the server was up and seemed to be operational.

Once I learned he was connecting to a named instance, I had him check the SQL Server Browser service. That was one of those questions that showed up on just about every practice test for the MCITP exams, so it was drilled into my head...and sure enough, it was disabled. If you're connecting to named instances (and not providing the port number explicitly), you'll need the Browser service running to make sure your connection lands at the appropriate spot.

So a very simple tip today but one I was pleased to see make it out of the academic side and into practical usage, for me.

the browser service
is essential to connect
to named instances

08 October 2012

Detangling Nested Queries

As a one-time "accidental DBA", I both witnessed and committed a number of bad or inadvisable practices on my SQL Server.  We were a small shop, and early on the furthest thing from my mind was performance or efficiency.  There was one egregious infraction that's worth a mention here...

Views are definitely handy. They can also promote and exacerbate developer laziness. We had a view set up that joined order line and order header records into a massive "SELECT *"-style sales history output, with just about every column. Then someone came along and wrote a view using that view as its source table. Then someone else came along and wrote an SSRS report based on that second view. So here we had a highly focused end query returning a small amount of data from a view, which pulled from another, huge view, which pulled from the source tables. I've done a bit of testing, and I'm not convinced it creates as much of a performance problem as you'd expect; SQL Server appears to do the logical heavy lifting of making it efficient even though the code is not. I'd be interested to know otherwise, though!
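
A stripped-down sketch of the arrangement (every object and column name here is invented):

-- The original tangle: a huge catch-all view...
CREATE VIEW dbo.vSalesHistory AS
SELECT h.OrderID, h.OrderDate, h.Region, l.ItemID, l.Quantity, l.Price
FROM dbo.OrderHeader AS h
INNER JOIN dbo.OrderLine AS l ON l.OrderID = h.OrderID;
GO
-- ...then a second view filtering the first, feeding the report
CREATE VIEW dbo.vSalesHistoryEast AS
SELECT * FROM dbo.vSalesHistory WHERE Region = 'East';
GO

-- The detangled report query: straight to the tables, only the needed columns
SELECT h.OrderID, h.OrderDate, l.ItemID, l.Quantity
FROM dbo.OrderHeader AS h
INNER JOIN dbo.OrderLine AS l ON l.OrderID = h.OrderID
WHERE h.Region = 'East';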

But the main problem is one of manageability. With all these unnecessary dependencies, we end up with a needlessly complex pile of code: two view definitions and a query, where a single query would do. Any change to a view definition can break the views and queries down the chain, in ways that aren't immediately visible. So, one day when the office was quiet enough, I set about detangling these messes and rewriting the report queries to go straight to the tables, pulling only the information they needed (and removing the "SELECT *"...a bad habit that afflicted me in my early days, I confess). Now when someone examined a report query, they could understand exactly what data it was pulling, without having to parse through the chain of view definitions it had been retrieving data from. This is especially useful when an end user questions data quality; in my case, I wasn't the report author, so I didn't have a full understanding of what the query was gathering. If we write report queries as simply as possible and resist the temptation to tack them onto an existing view with a bit of extra filtering, it will help the next guy in the chain, believe me.

"The simplest thing that works", a good mantra for query writing.  Happy query detangling.

02 October 2012

Plotting My Learning Goals

So, one of the things I've a bit of an ardent fondness for is developing (often obscure) skills and expertise. I went through a period, let's call it a hangover, after school where I worked a rather monotonous job and was minimally concerned with sharpening and updating my technical skillset, but in the past couple of years I've moved back towards genuine curiosity and interest in excelling in a technical sphere rather than just treading water.

My core technical interest is SQL Server; many years ago I was handed a SQL Server 2000 (then later, 2005) box and given basic duties...back it up, keep it running...normal small-shop sysadmin duties. Our databases at the time were out-of-the-box MS Dynamics applications, so very little involvement on my side was required. As time went on, management grew increasingly frustrated with the limited capacity of the embedded reporting tools, and I got more and more chances to write and flesh out reports in SQL Server Reporting Services. Since then, my interest has deepened, and the best parts of both that job and my new one have been working with SQL Server in a DBA or reporting capacity.

So, over the past year, with a handful of books (particularly Tom Carpenter's excellent guide) and a lot of curious poking around on both evaluation and, er, -cough- production servers, I shored up my knowledge and passed both tests for the MCITP in SQL Server 2008 certification fairly handily. Does we rest on our laurelses, precioussss? No we does not, gollum, gollum!

So, the next goal is to pass the two upgrade exams for the MCSA in SQL Server 2012. I have a couple of 2012 evals running, and whenever I'm playing with a SQL box I try to work with those in particular, to get up to par with the new features; at work we're still running 2008, as most folks probably are. So I'm still preparing, and thinking about what book to procure, if any, to get ready for the certification exams. I'm planning on the first quarter of 2013, at the earliest.

Another book I've just finished is Don Jones' "Learn PowerShell in a Month of Lunches", which is a great format for an introductory course. I'm no PowerShell guru, but it has helped fill in some of the mental blanks left over from my earlier encounters with it, and I feel much more confident behind a PS prompt than I did before.

From the library I found this Kimball Group book on data warehousing and BI, which was recommended to me by my boss, who went through Ralph Kimball's training and knew him rather well.  I've got precious little desire to work in depth in an OLAP/MDX/data warehousing environment, but perhaps part of that attitude is a fear of the unknown, so this is what I'm reading presently.

Queued up right behind that is an intro to Oracle for SQL Server DBAs.  As Wodehouse's (or was it an improv of Stephen Fry?) Jeeves said, it is well to know what tune the Devil is playing.

And of course, my studies continue aggressively in my lessons with Ustad Imrat Khan-saheb.  He is teaching me a lovely Alhaiya Bilawal, currently.  Here is his surbahar alap: 



28 September 2012

Friday Poetry Slam

My poetic output this week, as inspired by things I've run into at work lately.

"where'd my emails go?"
"you archived to your C: drive.
we don't back that up."

what the hell is this?!
full recovery model
but no log backups!

content in the cloud
fancy phrase but you just mean
stuff on a website

25 September 2012

Let's Break Something

Here's a fun little exercise for the bored DBA.  Break your database.

When I say your database, I don't mean your company's database, or anything anywhere remotely within earshot of a production machine.  But I have some sample databases on an eval version of SQL Server 2012 that would be prime sacrificial lamb material.  The goal is to simulate a worst case scenario and recover from it.  I've done this once before, but it has been long enough that I've forgotten exactly how I did it, so we'll give it a bash.

First, take a backup.

Then, reread that first step and make sure you did it.  I'll wait.

...

Backup taken? Are you sure? OK then, so next I'll need you to open... You really did take a backup, right? Let's just make sure of that. OK, I'll trust you at this point. So next you'll need to open SQL Server Configuration Manager and shut your instance down. Once the server is switched off (you'll find it under SQL Server Services; right-click the relevant SQL Server instance, if there's more than one, and click Stop), you will have unfettered access to the data files, and we can do all sorts of unmentionable atrocities to them.
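
If you're not sure where those files actually live, by the way, this will tell you (run it before stopping the instance; the database name is a placeholder):

-- Lists the physical paths of the sacrificial database's files
SELECT name, physical_name, type_desc
FROM sys.master_files
WHERE database_id = DB_ID('SacrificialDB');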

No time like the present, so browse to the data file of the database THAT YOU HAVE PREVIOUSLY TAKEN A BACKUP OF, AND THAT IS NOT AN IMPORTANT DATABASE IN ANY RESPECT, AND THAT IS NOT ON ANYTHING AKIN TO A PRODUCTION SERVER, right-click it, and open it in a text editor. Highlight some text (any text!) and replace it with something pithy...I have chosen "My hovercraft is full of eels". Be creative; muck up as much of the data as you wish.

Now on to the fruits of our labours...restart the service in Configuration Manager, then launch Management Studio. Look at that...our database has a "Recovery Pending" note affixed. The data is inaccessible. Hmmm! This seems slightly different from how things reacted in 2008 R2, so let's play around and see how it goes.

In the error logs:
Error: 824, Severity: 24, State: 2.
SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:9; actual 0:0). It occurred during a read of page (1:9) in database ID 8 at offset 0x00000000012000 in file '[path to data file]'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

So it knows what we did. Alright, fine, SQL Server, I'll do what you say. DBCC CHECKDB returns "Database [name] cannot be opened due to inaccessible files or insufficient memory or disk space." Dandy! Thank you, SQL Server, for telling me to do what you then tell me I can't do. It's data-loss acceptance time, folks. A quick restore of the full backup and we're back up.

Let's have some more "fun"! This time I repeated the experiment, but went in and changed just one single character to another. Pretty mild change, innit? Maybe SQL Server won't notice? Same error, same severity level, same unrecoverable database. So I restored from backup again, and this time monkeyed with the LDF log file instead of the actual data file. No errors reported and no loss of functionality (at least, apparently!). But my database was in the simple recovery model. Wondering if that had anything to do with it, I switched to the full recovery model and corrupted the log file again. That did it: the reassuring "Recovery Pending".

Error: 9004, Severity: 21, State: 1.
An error occurred while processing the log for database [name].  If possible, restore from backup. If a backup is not available, it might be necessary to rebuild the log.

So, a slightly smaller severity number; a good sign, eh? First I execute ALTER DATABASE [name] SET EMERGENCY. This puts the database in an "emergency" state and turns its tan icon an attractive shade of pale red, indicating that SOMETHIN' IS WRONG RIGHTCHERE!!! I tried running CHECKDB with the REPAIR_ALLOW_DATA_LOSS parameter, and it complained about needing to be in single user mode. So: ALTER DATABASE [name] SET SINGLE_USER, and then DBCC CHECKDB ([name], REPAIR_ALLOW_DATA_LOSS). Here's the relevant result:

Warning: The log for database [name] has been rebuilt. Transactional consistency has been lost. The RESTORE chain was broken, and the server no longer has context on the previous log files, so you will need to know what they were. You should run DBCC CHECKDB to validate physical consistency. The database has been put in dbo-only mode. When you are ready to make the database available for use, you will need to reset database options and delete any extra log files.

OK, so it rebuilt the log for me!  Switched it out of single user mode, and voila, we're back to normal.  Note that (as one might expect) the restore chain is broken, so in "real life" this would be a good time to take a full backup and start fresh with your log backups.
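
Collected in one place, the repair sequence looked like this (the database name is a placeholder, and the final step to let everyone back in is my addition):

-- Flag the damaged database for limited, emergency access
ALTER DATABASE [SacrificialDB] SET EMERGENCY;
-- REPAIR_ALLOW_DATA_LOSS requires single user mode
ALTER DATABASE [SacrificialDB] SET SINGLE_USER;
-- Rebuilds the log if needed; may lose transactions, hence the name
DBCC CHECKDB ([SacrificialDB], REPAIR_ALLOW_DATA_LOSS);
-- When satisfied, return the database to normal access
ALTER DATABASE [SacrificialDB] SET MULTI_USER;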

I've poked at SQL Server in an inappropriate manner enough this morning, so I shall give it a rest for the time being. 

24 September 2012

Nice Fluffy Backups, But Where's the Restore?

I am shirking tradition (as I am wont) (indeed, I am wonter than most) in abandoning my obstinate refusal to provide post titles.  There is a tide in the affairs of men, which, taken at the flood, leads on to fortune, and such tide apparently entails providing handy titles to blog posts (not nearly so laden with euphemism as you might assume).

Call me a sys admin by trade, and a DBA at heart...since 2005 I've been the de facto DBA for the companies I've worked for (which were small enough not to justify a full-time DBA), and I have a maturing interest/passion for SQL Server. This morning, spurred on by an excellent article over at SQL Server Central, I was beset with enough worry and consternation that I decided to test my backups. The whole concept of a backup is built on a healthy lack of trust. You take backups because you cannot trust that your data will survive in its data file containers, threatened as it is by data corruption, user malfeasance, or actual catastrophic happenings in the datacenter. If you trusted your database, you wouldn't bother with backups; there would be no need for them. So you take your regular full backups and constant log backups, like a good DBA ought.

Here's where so many of us, myself included, cast aside our mistrust and skepticism and become a rabble of dumbly trusting, credulous gits. We require confirmation that backups were taken, but we simply trust that they were taken properly and can be restored without corruption. Backup files are just another cog in the machine; any component can fail, and we must plan for every type of failure. Trusting that because SQL Server Agent flashed us the A-OK sign after taking a backup, the safety and security of our data is guaranteed, is not enough.
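
Testing means actually restoring. Here's a starting sketch (the paths, database name, and logical file names are all invented):

-- Take the backup with a checksum, so page corruption is caught at backup time
BACKUP DATABASE [Sales] TO DISK = N'D:\Backups\Sales.bak' WITH CHECKSUM;

-- Verify the backup media is readable and its checksums are intact
RESTORE VERIFYONLY FROM DISK = N'D:\Backups\Sales.bak' WITH CHECKSUM;

-- The real test: restore it under a new name, somewhere harmless
RESTORE DATABASE [Sales_RestoreTest]
FROM DISK = N'D:\Backups\Sales.bak'
WITH MOVE 'Sales'     TO N'D:\Scratch\Sales_Test.mdf',
     MOVE 'Sales_log' TO N'D:\Scratch\Sales_Test.ldf';

VERIFYONLY only proves the file is readable; the scratch restore is the real proof that the backup can actually be restored.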

Couple o' links:

Referenced article: The Importance of Validating Backups

And the Mighty Ola Hallengren's db maintenance scripts:  http://ola.hallengren.com/

Also, managed to snag a new (used) sitar the other day, a 7 year old Shrishti model that was worked over by the inimitable Tony Karasek with a new delrin bridge.  Has a great strong sound on the baaj, and the chikaris have a very clean, closed jawari tone to them.  Was originally kharaj pancham but I've restrung it to gandhar pancham.

And that's it for this week.  Toodle pip!

19 June 2012

Well, I admit it is sheer, unadulterated obstinacy that has me back here to update my blog. I'm fully aware that blogs, unless specialized in some topic that generates general interest, are a dying medium, particularly the personal blog, which has been slowly starved out by the convenience of the MyFacePlusbook or whatever that other website is...

Nonetheless, archaic/outdated/unpopular things have always kept my interest, and a certain conservative impulse to say "up your nose with a fish hook" (or however that expression goes) to the march of progress probably helps goad me to start blogging here more. I would say "in spite" of the utter lack of readership, but it's almost more "because": there is something liberating about being able to write babble with no concern that it need be of interest to others. On with the prattle!!

My last post intimated the dawning of a new hobby, as I put it, with the acquisition of a sitar. "Hobby" might have been an underestimation of what has actually transpired over the past year and a half; I've become somewhat enthralled by (one might say, quite literally, in thrall to) Hindustani music. Understanding Indian music, particularly its treatment and systems of melody, gives one a window into all kinds of music and deepens one's appreciation and understanding of any culture's music. At least, such was the case with me.

Sure enough, starting to write something like this is the surest path to writer's ruddy block. Ah well, it isn't as if my paying customers expect great things from this blog. So that's it from me for now; I'll add to it as I see fit, most likely (knowing my interests of late) on occasional topics of SQL Server or Hindustani classical music.

Splunge, for me, too!