Sql Server Caching

**Sysman** · 13 May 2011, 07:14

I know diddly squat about this area of SQL Server, but I have come across this argument before on non-MS platforms.

Generally speaking, those who assume that stuff that should be cached is getting cached are wrong. The only way to find out for your particular workload mix is some good old fashioned monitoring.

**mudskipper** · 13 May 2011, 19:59

If your app is on a separate box to your sql server, it makes sense to cache on the app server doesn't it?

**eek** · 13 May 2011, 20:12

Originally posted by pacharan View Post

An old chestnut this one. Have the argument in almost every. contract but still don't know the definitive answer.

In a scenario when a web app, say, references data that is fairly static in nature I always propose using some caching strategy to avoid unnecessary round trips to the database.

This always prompts some database nut (there's always one) to challenge my plan by saying that there is no need to implement this caching because Sql Server caches everything and there is no unnecessary disc IO.

In that case, why does a round trip to the DB perform so badly compared to the method that uses caching when analysed through some performance monitoring tool ? Let's assume the DB is on the same box and there are no network issues.

The only thing you save with Database caching is stopping the database accessing the disk drives.

There is still the round trip to the database, the milliseconds it takes for the database to return the static data and the additional code on the website to handle the database query to consider.

The way IIS handles caching may also be a significant advantage but as you don't explicitly state how you are caching the data its an irrelevance to the basic argument.

**MarillionFan** · 13 May 2011, 20:49

Originally posted by pacharan View Post

An old chestnut this one. Have the argument in almost every. contract but still don't know the definitive answer.

In a scenario when a web app, say, references data that is fairly static in nature I always propose using some caching strategy to avoid unnecessary round trips to the database.

This always prompts some database nut (there's always one) to challenge my plan by saying that there is no need to implement this caching because Sql Server caches everything and there is no unnecessary disc IO.

In that case, why does a round trip to the DB perform so badly compared to the method that uses caching when analysed through some performance monitoring tool ? Let's assume the DB is on the same box and there are no network issues.

I was always under the impression that cache is created in SQL Server after the query has run. So for example, select count(*) from database where user= 'Mad as a hatter' will run normally the first time. It will then cache. Run the query again as the result is cached the result is instant. Now run the same where user = 'MF is cool', the first time it needs to run, caches etc.

For BI reporting we tend to set up an automated caching routine of the most common requests so when a user run it's give a fast response.

So effectively the first user takes the hit unless you implement a proactive caching plan. (that's how I understand it to work)

**DimPrawn** · 16 May 2011, 10:19

Originally posted by pacharan View Post

An old chestnut this one. Have the argument in almost every. contract but still don't know the definitive answer.

In a scenario when a web app, say, references data that is fairly static in nature I always propose using some caching strategy to avoid unnecessary round trips to the database.

This always prompts some database nut (there's always one) to challenge my plan by saying that there is no need to implement this caching because Sql Server caches everything and there is no unnecessary disc IO.

In that case, why does a round trip to the DB perform so badly compared to the method that uses caching when analysed through some performance monitoring tool ? Let's assume the DB is on the same box and there are no network issues.

Good question.

Now, if SQL Server is on the same box as you app/webserver, then the overheard of the database access to some data cached by SQL Server is going to be small, but still larger than a local cache in the due to the opening of a TCP or named pipe connection, the creation of the TDS datastream to SQL Server, SQL Server looking up the query plan and then retrieving the cached data, converting it to TDS protocol for the results and it getting into ADO.NET objects etc and reprocessing the data readers to get the data out.

If the SQL Server instance is on another box on the network, then you can add network latency to this too.

So, the answer is, hitting the database always going to be slower and perhaps much slower than accessing a memory cache in the same address space as the IIS process and for a heavily loaded website, this is going to be siginificant.

Also, a cache can be scaled using a distributed cache, whilst a SQL Server database is a bottleneck that scales up rather than out.

This is why products such as http://www.alachisoft.com/ exist, otherwise everyone would just hit the database all the time.

**DimPrawn** · 16 May 2011, 10:27

PS.

All DBA's tell you their database server is brilliant and you don't need anything else (hell why do you need a webserver, a database is much better than a web server).

They tell you all queries are instantaneous, you can process a trillion transactions per second and it won't even use any resources it's that good.

You go along with this and write a heavily loaded web app and the database server is brought to it's knees.

The DBA's then tell you you can't hit the DB like that, don't be a moron, what sort of idiot developer are you.

**Sysman** · 16 May 2011, 11:20

Originally posted by DimPrawn View Post

PS.

All DBA's tell you their database server is brilliant and you don't need anything else (hell why do you need a webserver, a database is much better than a web server).

They tell you all queries are instantaneous, you can process a trillion transactions per second and it won't even use any resources it's that good.

You go along with this and write a heavily loaded web app and the database server is brought to it's knees.

The DBA's then tell you you can't hit the DB like that, don't be a moron, what sort of idiot developer are you.

LOL! That is so true

**fullyautomatix** · 16 May 2011, 11:53

Originally posted by MarillionFan View Post

I was always under the impression that cache is created in SQL Server after the query has run. So for example, select count(*) from database where user= 'Mad as a hatter' will run normally the first time. It will then cache. Run the query again as the result is cached the result is instant. Now run the same where user = 'MF is cool', the first time it needs to run, caches etc.

For BI reporting we tend to set up an automated caching routine of the most common requests so when a user run it's give a fast response.

So effectively the first user takes the hit unless you implement a proactive caching plan. (that's how I understand it to work)

Are you saying the results are cached ? or the query execution plan ?

I am not hot on DBA stuff but I recall that SQL server loads a lot of data pages into RAM for quick access and the size of this cache in RAM depends on how big the RAM is.

**eek** · 16 May 2011, 12:41

Originally posted by fullyautomatix View Post

Are you saying the results are cached ? or the query execution plan ?

I am not hot on DBA stuff but I recall that SQL server loads a lot of data pages into RAM for quick access and the size of this cache in RAM depends on how big the RAM is.

He doesn't say but if he thinks its the result set he is wrong as its only the execution plan (although there is documentation that says otherwise).

SQL server will however use whatever memory it has available to keep recently used data in memory so its easy to get confused. The best documentation I can see regarding the life cycle of a cached query plan states that it becomes invalid if the table is changed or a lot of data is entered so I suspect that the real answer is that the core of the query is stored in memory.

Either way the original argument for using sql server is irrelevant. If you want to cache information you need to do it as close to the system managing the caching as possible. When stackoverflow first started they discovered that simply caching all requested information for 2 seconds was enough to more than half the load on the single server they used for everything at the time.

Sql Server Caching