Monday, 11 April 2011

Efficient Age From Date of Birth Calculation

I have been playing around with MS SQL and I was wondering what would be the most efficient method to gather a users age.

Calculations involving dates should always be considered from their most tricky state, if you consider that you might need to take into account UTC time, leap years, time zones it all starts to add up to complexity that you can too easily trip up over. 

So my first thought when someone asks me to calculate the age from a DOB field is to "cheat" and use the built in languages tools. Most languages have some form of datediff function that can be used to calculate the time between two dates.  However is this going to be the most efficient method?

I thought I would test a couple of methods on MS SQL to see the performance changes.  All tests were performed on a low spec laptop on MS SQL Express on a set of 100 random date of births

First method

SELECT floor(datediff(day,[dob],current_timestamp) / 365.25)

Well this is nice and succinct but does suffer a bit from the fact that DATEDIFF rounds up.  I would love to just do DATEDIFF(year... but this will only provide the correct age once per year)

This method gave me the answer in 0.296 secs

Second method

SELECT (convert(int,convert(varchar(8),current_timestamp,112))
 - convert(int,convert(varchar(8),[dob],112)))/ 10000

Well some basic maths relying on Japanese date format gives me the same answer in 0.343 secs

Third method subtract the years from each other and then reduce by 1 if the

 SELECT CASE WHEN convert(int,substring(CONVERT(varchar(8), [dob] , 112),5,4)) >= convert(int,substring(CONVERT(varchar(8), CURRENT_TIMESTAMP , 112),5,4))
THEN convert(int,substring(CONVERT(varchar(8), CURRENT_TIMESTAMP , 112),1,4)) - convert(int,substring(CONVERT(varchar(8), [dob] , 112),1,4)) -1
ELSE convert(int,substring(CONVERT(varchar(8), CURRENT_TIMESTAMP , 112),1,4)) - convert(int,substring(CONVERT(varchar(8), [dob] , 112),1,4))
END
 This method took 0.420 seconds.

Looking at this the first two methods are for all intents and purposes identical to just the selecting the raw dob value, which indicates to me that there is very little efficiency to be gained in this process.  Even my messy CASE version is pretty swift.

Although it is fun to do this in the database it is really much easier to do this at the higher level and intial testing of this suggest to me that the overhead is so minimal that the methods I have outlined are only really useful for quick reporting when you just need the calculation and have an available SQL interface

No comments: