Troubleshooting and Resolving Latch: Cache Buffers Chains

I received  complaint
from one of my client that their application is not
able to connect to database and it is hanging!!! On checking the database found that the maximum session and processes limit had exceeded. So we tried to
release few sessions from the database. However, these sessions were not getting
released. 
When checked why the sessions are not getting
released found that these session were being blocked. The event these sessions
had was latch: cache buffers chains. 


Latch : Cache buffer chains

The latch: cache buffers chains is an Oracle metric used to
protect a buffer list in the buffer cache. These latches are used when
searching for, adding, or removing a buffer from the buffer cache. Contention
on this latch usually means that there is a block that is greatly contended for
(known as a hot block). 

Common Causes

Cache buffers chains latch contention is due to some blocks being visited and re-visited way too much by a query execution. This
usually happens due to nested loops joins or FILTER loops retrieving many rows
from their outer (driving) row sources and then visiting the inner row-source
again for each row from driving row source. 
Findings:
I checked the current latch status in the database
I took out the sql’s and their objects id that were causing this contention.
I also checked the latch address for these object id’s.

From the object id’s I was able to find the objects and the block numbers that were the hot blocks and causing this contention:


Execution Plan Check

I checked the execution plan of the above sql id’s, and found that the queries are performing lots of nested loop operations and filter loop joins. 
Below is the execution plan for one the sql.
On checking the status of the status of the statistics for the tables and indexes found that they had gone stale. 
After gathering statistics the query execution plan changed and the contention was reduced. 
Resolution in my case

This issue was resolved after gathering statistics for the tables involved in the query which changed the execution plan of the query by using hash joins instead of nested loops.  

The Latch: cache buffers chains can be resolved by one of the following

1. Fix the execution plan of the query (By getting hash join instead of the nested loop)
  1. If statistics are stale gather them to get proper execution plan.
  2. Check with application team to tune the query.
  3. Force has join hint.
2. If the the hot blocks are due to index root block or branch blocks this can be resolved by the application design changes as given below:

  1. Consider partitioning the table and using local indexes. This might allow you to spread the heat amongst multiple indexes (you
    will probably want to use a hash partition to ensure an even spread of load
    amongst the partitions).
  2. Consider
    converting the table to a hash cluster keyed on the columns of the index.
      This allows the index to be bypassed
    completely and may also result in some other performance improvements. However,
    hash clusters are suitable only for tables of relatively static size and
    determining an optimal setting for the SIZE and HASHKEYS storage parameters are
    essential.

I have used the latchprofx.sql and snapper.sql scripts to troubleshoot the above issue. These scripts can be obtained from http://tech.e2sn.com/oracle-scripts-and-tools.

Leave a Comment