Understanding 9i Real Application Clusters Cache Fusion Recovery (Doc ID 144152.1)

PURPOSE
-------

The purpose of this document is to explain the benefits and functionality of 
fusion recovery in an Oracle Real Application Clusters Environment. 

 
SCOPE & APPLICATION
-------------------

This document is intended for Oracle Real Application Clusters database 
administrators who want to understand how fusion recovery works
and how it can increase the availability of their clustered database.


Crash/Instance Recovery for Cache Fusion
----------------------------------------

Because past images may exist in remote buffer caches, instance or crash recovery 
is handled differently in a RAC environment than in previous versions.  The first 
major difference is that thread recovery of the failed instance(s) is done by a 
surviving instance's SMON process instead of a foreground process.  The second is 
that during bounded instance and crash recovery (which introduces a two-pass log 
read during thread recovery) SMON uses BWRs (block written records) to eliminate 
blocks that were already written to disk from the recovery set.  This enhancement 
should shorten recovery time when past images exist.  If an instance fails, the 
sequence is:

        1. The instance, or instances, dies.
        2. Failure is detected by cluster manager or CGS.
        3. Reconfiguration occurs, all locks owned by the departing instance are 
           remastered (see Note 139435.1 for more info), and the first pass read of 
           the threads of the failed instances is done by SMON.
        4. SMON claims the locks needed to recover the blocks found by the first 
           pass read.
        5. The locks are obtained, the second pass read of the redo threads of the 
           failed instances is performed, and blocks become available as they are 
           recovered.

After an instance dies and the failure is detected, the SMON process of a surviving
instance starts the first pass log read of the failed instance's redo thread.  
SMON merges the redo threads in SCN order so that changes are applied in the order 
in which they were made.  SMON also finds BWRs (block written records) in the redo 
stream and removes entries that are no longer needed for recovery because they refer 
to past images of blocks already written to disk.  The final product of the first 
pass log read is a recovery set that contains only blocks modified by the failed 
instance with no subsequent BWR to indicate that the blocks were later written.  Each entry
in the recovery list is ordered by first-dirty SCN to specify the order to acquire
instance recovery locks.  The recovering SMON process will then inform each lock 
element's master node for each block in the recovery list that it will be taking
ownership of the block and lock for recovery.  This is handled differently depending
on ownership of the lock element as described below:
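
Before the individual cases, the first pass read described above can be sketched in 
Python.  This is only an illustration of the idea, not Oracle's implementation: the 
RedoRecord class, its fields and the build_recovery_set function are hypothetical 
names, and the sketch assumes that a BWR at a given point in the merged stream 
cancels every earlier change to the same block.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    scn: int    # SCN of the record
    block: int  # block identifier (hypothetical flat numbering)
    kind: str   # "CHANGE" for a change record, "BWR" for a block written record


def build_recovery_set(threads):
    """Return block ids needing recovery, ordered by first-dirty SCN.

    threads: one list of RedoRecord per failed instance, each in SCN order.
    """
    first_dirty = {}  # block id -> SCN of the first change not yet on disk
    # Merge all redo threads into one stream ordered by SCN.
    for rec in heapq.merge(*threads, key=lambda r: r.scn):
        if rec.kind == "BWR":
            # The block reached disk, so earlier changes need no recovery.
            first_dirty.pop(rec.block, None)
        else:
            # Remember only the first change after the last write.
            first_dirty.setdefault(rec.block, rec.scn)
    return sorted(first_dirty, key=first_dirty.get)


threads = [
    [RedoRecord(10, 1, "CHANGE"), RedoRecord(30, 1, "BWR"),
     RedoRecord(40, 2, "CHANGE")],
    [RedoRecord(20, 3, "CHANGE"), RedoRecord(50, 1, "CHANGE")],
]
print(build_recovery_set(threads))  # [3, 2, 1]
```

Block 1 is written at SCN 30 and modified again at SCN 50, so it stays in the set 
but its first-dirty SCN is 50, which places it last in the lock acquisition order.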


Case 1: LE not open (or in NL0 mode) on recovering instance, no other instances own 
lock element:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |               |     |               |    |                |
    ----------------      -----------------     -----------------  

Action: Acquire the lock element in XL0 mode, read the block from disk, and apply the 
redo changes; DBWR will write out the recovery buffer when recovery is complete:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      XL0      |     |               |    |                |
    ----------------      -----------------     -----------------  
            |
 keep block in recovery list


Case 2: LE not open (or in NL0 mode) on recovering instance, other instance has LE 
in SL0 or XL0 mode:  

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |               |     |      XL0      |    |                |
    ----------------      -----------------     -----------------  

Action: No recovery needed because a current copy of the buffer already exists on 
another instance; remove the block entry from the recovery set.  

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |               |     |      XL0      |    |                |
    ----------------      -----------------     -----------------  
            |
 remove block from recovery list


Case 3: LE not open (or in NL0 mode) on recovering instance, other instance has LE 
in SG# or XG#:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |               |     |      XG0      |    |                |
    ----------------      -----------------     -----------------  

Action: Initiate a write of the current block.  No recovery is needed because a 
current copy of the buffer already exists on another instance; remove the block entry 
from the recovery set.  Write completion will release the recovery buffer and lock as 
usual:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      NG1      |     |               |    |                | 
    ----------------      -----------------     -----------------  
            |                        |
            |                     write block to disk
 remove block from recovery list    


Case 4: LE not open (or in NL0 mode) on recovering instance, other instance has LE 
in NG1.  

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |               |     |      NG1      |    |                |
    ----------------      -----------------     -----------------  

Action: Get consistent read image of latest past image based on SCN, apply redo
changes and write out recovery buffer when complete.

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    | acquires XG0  |     |      NG1      |    |                | 
    ----------------      -----------------     -----------------  
            |                         |
            |                  send CR block to recovering instance     
 keep block in recovery list    


Case 5: LE open in recovering instance in SL0 or XL0, other instance has no lock.

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      XL0      |     |               |    |                |
    ----------------      -----------------     -----------------  

Action: No recovery needed because a current copy of the buffer already exists on 
the recovering instance; remove the block entry from the recovery set.  

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      XL0      |     |               |    |                |
    ----------------      -----------------     -----------------  
            |
 remove block from recovery list


Case 6: LE open in recovering instance in SG# or XG#, other instance doesn't matter:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      XG0      |     |      NG1      |    |                |
    ----------------      -----------------     -----------------  

Action: Initiate write of current block, no recovery needed on recovering instance. 
Release recovery buffer and decrement past image count when block write completes.

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      XG0      |     |      NG1      |    |                |
    ----------------      -----------------     -----------------  
            |
 write block to disk
 remove block from recovery list


Case 7: LE open in recovering instance in NG1 mode, other instance has LE in SG# or
XG# mode.

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      NG1      |     |      XG0      |    |                | 
    ----------------      -----------------     -----------------  

Action: Initiate write of current block on remote instance, no recovery needed on 
recovering instance.  Release recovery buffer and decrement past image count when
block write completes:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      NG1      |     |      XG0      |    |                | 
    ----------------      -----------------     -----------------  
            |                        |
            |                     write block to disk
 remove block from recovery list    


Case 8: LE open in recovering instance in NG1 mode, other instance has LE in NG# 
mode: 

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    |      NG1      |     |      NG0      |    |                | 
    ----------------      -----------------     -----------------  

Action: Get consistent read copy of block from highest past image based on SCN.  
Apply redo changes and write out recovery buffer when complete:

    ----------------      -----------------     -----------------  
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    | 
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  | 
    | acquires XG1  |     |      NG0      |    |                | 
    ----------------      -----------------     -----------------  
            |                         |
            |                  send CR block to recovering instance     
 keep block in recovery list    
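
The eight cases above amount to a lookup on the lock mode held by the recovering 
instance and the mode held by another open instance.  The following Python sketch 
restates that lookup.  The function names and the use of mode strings such as "XL0" 
or "NG1" (with None for "LE not open") are assumptions made for illustration; 
combinations that the cases above do not describe raise an error rather than guess.

```python
def unopened(mode):
    """True when the LE is not open on the instance, or is held in NL0."""
    return mode is None or mode == "NL0"


def holds_current(mode):
    """True for S or X modes (SL0, XL0, SG#, XG#)."""
    return mode is not None and mode[0] in "SX"


def is_g_mode(mode):
    """True when the second letter of the mode is G (SG#, XG#, NG#)."""
    return mode is not None and mode[1] == "G"


def past_image_only(mode):
    """True for NG# modes, which the cases above treat as past image holders."""
    return mode is not None and mode[0] == "N" and is_g_mode(mode)


def classify(recovering, other):
    """Return (case number, keep block in recovery list?)."""
    if unopened(recovering):
        if unopened(other):
            return 1, True    # acquire XL0, read from disk, apply redo
        if holds_current(other) and not is_g_mode(other):
            return 2, False   # current copy exists on the other instance
        if holds_current(other):
            return 3, False   # other instance writes the current block
        if past_image_only(other):
            return 4, True    # start from the latest past image, apply redo
    elif holds_current(recovering):
        if is_g_mode(recovering):
            return 6, False   # recovering instance writes the current block
        if unopened(other):
            return 5, False   # current copy already on recovering instance
    elif past_image_only(recovering):
        if holds_current(other) and is_g_mode(other):
            return 7, False   # remote instance writes the current block
        if past_image_only(other):
            return 8, True    # start from the highest past image, apply redo
    raise ValueError(f"combination not described: {recovering!r}, {other!r}")
```

Only cases 1, 4 and 8 keep the block in the recovery list; in every other case a 
current copy of the block survives in some buffer cache and is simply written out 
or left in place.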


After the above operation the recovering instance should have locks on every block
in the recovery set.  Other instances will not be able to acquire these locks until 
the recovery operation is completed.  When blocks are cached for recovery, instance 
recovery buffers cannot be replaced or aged out except by another recovery buffer 
request.  At this point the second pass log read and redo application can begin.  
When the second pass log read begins, the redo threads of the failed instances are 
again merged by SCN and the redo is applied to the datafiles.  
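
A minimal Python sketch of the second pass follows.  It is illustrative only: the 
second_pass function and the (scn, block_id) record shape are hypothetical, a 
block's state is reduced to the SCN of its image, and the sketch assumes that redo 
no newer than the starting image (the disk block or a past image) is skipped.

```python
import heapq


def second_pass(threads, recovery_set, block_scn):
    """Apply merged redo to the blocks in the recovery set.

    threads:      one list of (scn, block_id) change records per failed
                  instance, each already in SCN order
    recovery_set: block ids locked by the recovering instance
    block_scn:    dict mapping block id to the SCN of its starting image
    """
    # Tuples compare by SCN first, so the merge yields one SCN-ordered stream.
    for scn, block_id in heapq.merge(*threads):
        if block_id not in recovery_set:
            continue  # block was dropped from the set during lock claiming
        if scn > block_scn[block_id]:
            # Stand-in for applying the change to the recovery buffer.
            block_scn[block_id] = scn
    return block_scn


threads = [[(10, 1), (40, 2)], [(20, 3), (50, 1)]]
result = second_pass(threads, {1, 2}, {1: 30, 2: 0, 3: 0})
print(result)  # {1: 50, 2: 40, 3: 0}
```

Block 3 is left untouched because it is not in the recovery set, and the SCN 10 
change to block 1 is skipped because the starting image is already at SCN 30.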

Instance Recovery Failure Scenarios:

        o If recovery fails without the death of the recovering instance, instance 
          recovery will be restarted.

        o If the recovering instance dies, a surviving instance (if one exists) will
          acquire the instance recovery enqueue and start recovery.  Crash recovery
          will be necessary if all instances are down.

        o If a non-recovering instance fails, SMON will abort recovery, release the
          IR enqueue, and the next live instance will re-attempt instance recovery.

        o If there are I/O errors, the file is taken offline and instance recovery
          is restarted.  If the file is the system datafile the recovering instance
          will crash; eventually all instances in the cluster will go down and 
          media recovery will be required.

        o If block corruption is encountered during redo application, online block
          recovery will attempt to clean up the block so that instance recovery
          can proceed.  


Online Block Recovery for Cache Fusion
--------------------------------------

When a data buffer becomes corrupt in an instance's cache, the instance will 
initiate online block recovery.  Block recovery will occur if either a foreground 
process dies while applying changes or an error is generated during redo application.  
In the first case, PMON initiates block recovery and in the second case the 
foreground process initiates block recovery.  Online block recovery consists of 
finding the block's predecessor and applying redo changes from the online logs of the 
thread in which corruption occurred.  The predecessor of a fusion block is its most 
recent past image.  If there is no past image then the block on disk is the 
predecessor.  For non-fusion blocks, the disk copy is always the predecessor.

If the LE of the block needing recovery is held in XL0 status then the predecessor
will be located on disk.

If the LE of the block needing recovery is held in XG# status then the predecessor
will exist in another instance's buffer cache.  The instance holding the past image 
(PI) of the block with the highest SCN will send a consistent read copy of the block 
to the recovering instance.
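
The predecessor rules above can be sketched in Python as follows.  The 
find_predecessor function, its arguments and the instance names are hypothetical 
and serve only to illustrate the selection logic described in this section.

```python
def find_predecessor(le_mode, past_images, disk_scn):
    """Pick the image that online block recovery starts from.

    le_mode:     lock mode held on the block, e.g. "XL0" or "XG1"
    past_images: dict mapping instance name to the SCN of the PI it holds
    disk_scn:    SCN of the block image on disk
    Returns (source, holding instance or None, SCN of the chosen image).
    """
    if le_mode.startswith("XG") and past_images:
        # Fusion block: the most recent past image is the predecessor.
        holder = max(past_images, key=past_images.get)
        return ("past_image", holder, past_images[holder])
    # XL0, or no past image exists: the disk copy is the predecessor.
    return ("disk", None, disk_scn)


print(find_predecessor("XL0", {}, 100))
print(find_predecessor("XG2", {"inst2": 120, "inst3": 150}, 100))
```

In the second call inst3 holds the past image with the highest SCN, so it is the 
instance that would ship a consistent read copy to the recovering instance.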


Media Recovery for Cache Fusion
-------------------------------

Cache fusion does not impact the existing mechanism for media recovery.


RELATED DOCUMENTS
-----------------

Note 139436.1 - Understanding 9i Real Application Clusters Cache Fusion
Note 139435.1 - Fast Reconfiguration in 9i Real Application Clusters
