Tuesday, 17 September 2013

Intersystems Caché performance, woe is me...

I admit I am no Intersystems Caché guru, I am looking at 3 years working with the software. I have attended the sparse available training and I have been left to explore this weird universe known as Caché.

Now if you read their website you will read the tag line "the World's fastest, most scalable Object oriented database". More over they make even greater claims of "...it can run SQL faster than relational databases. "

Every training session seemed to pad every sentence with "so fast" and "world's fastest" although I was never sure if they were trying to convince me or themselves of this "fact".

When using SQL databases for reporting data I have never really encountered issues, my SQL has normally been suboptimal fire and forget code, I make the query get the data and if the full page response including the query is less than 1 second it is not worth optimising.

Now while 99% of my MSSQL and MySQL queries have always been so quick I have never needed to revisit them with Intersystems I have had to go back again and again to optimise everything, completely re-writing a ton of code, turning a single line of SQL in to an intricate For loop absolutely minimising memory use. Yet the makers continue to bang on about performance, but to my knowledge only write performance is reasonable, read performance is SLOW.

I thought it would only be fair to benchmark the process to see if it was possible to match the performance of two of the most common SQL databases.

The following query is based on a "real world" query and one of the first that I experienced slow Cache performance. I wrote a summary page which contained several aggregate queries, and to my surprise it was very sluggish. The page in question ran between 10-30 aggregate queries, and the tables containing the data ranged from 100-100,000 rows.

Query
SELECT LastName, count(*) AS Total
FROM Users
ORDER BY DESC Total 

I consider that a reasonably simple, but a reasonable example of a commonly run query.

Test Data
10,000,000 records

I chose this number as it should help highlight real performance differences and helps establish if it truly was more scalable.

Results
MSSQL 0.061 seconds
MySQL first query in 10.4 seconds and subsequent queries in 0.003 seconds.
Caché *as SQL(/w bitmap indexes) first query in 62.83 seconds and subsequent queries in 33.27 seconds.
Caché *as $Order through the index global 48.17 seconds first query and subsequent queries 48.17 seconds

MSSQL unindexed 4.4 seconds

I hope you will agree the performance figures are WAY off SQL performance.

Next test to line up is write performance, as Cache does not need to be "ACID" I suspect it has a chance at winning, but nothing is guaranteed...

1 comment:

Tony said...

We're trying to implement this massive turd in our healthcare company atm. It's worth noting that leadership ignored ALL of the SMEs and went with this solution anyways.

We're three months into it now and it's at least 5 times slower than what we had before. It does an INSANE amount of processing for microscopic output. It uses the old POSIX standard for timezones which they don't include in their considerations in the set up doc. I had to stack trace crappe to find that tender little nugget 0_0. 20,000 system calls to /etc/localtime per minute!

The specs on storage is sub ms which makes sense considering the thing is slower than anything I've had to privilege of working with.

The documentation doesn't always match the product.
For example, changing several memory related settings says you don't have to reboot it when reading the help docs included with the install.
However, the documentation online says you must reboot it.

AD integration is terrible and their support staff told me no one had ever implemented it there. With that being said, I think their support staff rocks :).

It's also worth noting that the language used in their integration is different from EVERYTHING I've ever seen and appears to do the opposite of what's intended when you toggle check boxes.

This Intersystems Cache is comically terrible when it comes to utilizing memory. If you read their propaganda, they say that's better at allocating memory than standard databases... nothing could be further from the truth. We've had to do an INSANE amount of tuning just to get this thing up to unacceptable levels of performance.