Intersystems Caché Performance - Part 3

While my investigations have still left me sad to find that there is no real way to match the read performance of an SQL database I have found out some interesting little pieces of information on Caché.

1. Bitmap indexes always appear to be faster than standard indexes no matter how many unique values there are.

2. Traversing standard indexes manually is the fastest way to return a query first time (second time bitmap index SQL is the fastest).

3. Single Insert performance is good

4. Reversing through an on disk array (e.g. $ORDER(^global,-1) ) is slower than going through it forwards putting the data into an in memory array and then reserving through it! To my surprise all my tests showed that it was at least a 50% performance improvement and as the data grows it can double. The only problem is if the memory array grows too big then you will get a store error as the memory runs out.

I am considering re-writing my queries to only reverse $ORDER in memory and checking the $STORAGE to make sure it is not running out, then writing it to disk if it becomes too large.

It appears to be the case that using SQL with bitmap indexes and potentially using triggers to make sure no values become calculated or require significant levels of joins is the only way to achieve reasonable read performance.

Hopefully this information will be helpful in to you in getting some vague level of performance out of the system.

Comments

Unknown said…
I'm absolutely baffled by Intersystems Caché. My manager tells me that it doesn't use regular indices and that's why it's so slow to pull records. We use an ODBC adapter to gather information from the DB. It takes like 6 hrs to get a dataset with less than 10,000 records. Any ideas on how to improve this?
Andrew Gill said…
I have tried using ODBC to gather records and I have noticed that there are improvements in performance and SQL support in later versions. If possible try to get the database upgraded to the latest version and it might help.

The most important thing to do is look at the properties of the query, if there are any "calculated" fields then these take ages to search, they never cache and basically I cannot see a good reason for ever using them.

The next thing to do is to index any fields that are being searched on. Also how you word your query can effect if it tries to use the index or the raw rows when returning results.

There are two types of index, normal indexes and bitmap indexes. All my testing so far suggests that bitmap indexes are faster and more worthwhile to use.

Finally in all my testing the query speed lags behind other major SQL databases, the insert performance is good, but get the data back out is not so fast.

With the information you have provided I can only offer the generic advise of
- bitmap indexes
- avoid or queries and lots of joins
- avoid calculated fields
- try to use exact matches rather than like queries

Without the code and the data it would be difficult for me to offer more advise. My ODBC testing suggested that it was difficult to get good performance, it is possible that some of the other language specific adaptors for Java or .NET might help depending on what options you have available.

Popular posts from this blog

IE9 Intranet compatibility mode in Intranet websites

Multi-select with shift on HTML table

Intersystems Caché performance, woe is me...