Exadata Storage Maintenance

    By: Andrew Meade on May 15, 2018

    Managing an Exadata server is a great way to move from being a regular DBA to being a great DMA (Database Machine Administrator) and to get into the nitty-gritty details of storage administration. This tip covers some Exadata storage maintenance jobs, how to manage them, and which logs to check.

    I support a very I/O-intensive data warehouse that builds every night. Its run time is fairly consistent, except when it is impacted by two Exadata storage maintenance jobs: the Exadata Battery Learn Cycle and Exadata Hard Disk Scrubbing. Note that Exadata Hard Disk Scrubbing is different from ASM Disk Scrubbing.

    Exadata Battery Learn Cycle

    The Exadata Battery Learn Cycle runs once per quarter to perform a discharge and charge of the controller battery. During the maintenance, the Flash Cache Mode changes from Write-Back to Write-Through. Write-Back Flash Cache provides the ability to both read and write I/O directly to flash disks. This is safe in case of power loss as the battery backup will allow time for the writes in the Flash Cache to be committed to the Hard Disk. In Write-Through mode, all write I/O is written directly to the Hard Disk, which is significantly slower than writing to Flash Cache first.

    Logs and schedule

    Under high I/O, the effects of Write-Through mode show up as errors in the database alert log:

    ORA-27626: Exadata error: 2201 (IO cancelled due to slow/hung disk)
    NOTE: ASM has redirected some slow reads to mirror sides to improve performance.

    You can connect to the cell nodes and run the list alerthistory command at the CellCLI prompt:

    CellCLI> list alerthistory
    15_1 2016-10-17T04:00:31-07:00 info "The HDD disk controller battery is performing a learn cycle. Battery Serial Number : 1234 Battery Type : ibbu08 Battery Temperature : 27 C Full Charge Capacity : 1349 mAh Relative Charge : 98% Ambient Temperature : 18 C"
    15_2 2016-10-17T05:13:51-07:00 clear "All disk drives are in WriteBack caching mode. Battery Serial Number : 1234 Battery Type : ibbu08 Battery Temperature : 29 C Full Charge Capacity : 1348 mAh Relative Charge : 71% Ambient Temperature : 18 C"
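
    The two alerts above bracket the time the cell spent outside Write-Back mode. As a minimal sketch, the following Python pairs the opening alert with its "clear" and prints the elapsed time. It assumes that alert names of the form N_1, N_2 belong to the same alert sequence N and that each line has the layout shown above; the function name and the abbreviated sample messages are illustrative only.

```python
from datetime import datetime

# Abbreviated copy of the alert history lines shown above.
SAMPLE = '''\
15_1 2016-10-17T04:00:31-07:00 info "The HDD disk controller battery is performing a learn cycle."
15_2 2016-10-17T05:13:51-07:00 clear "All disk drives are in WriteBack caching mode."
'''

def learn_cycle_windows(text):
    """Map each alert sequence to the time between its first alert and its 'clear'."""
    started = {}
    windows = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # not an alert line
        name, stamp, severity = parts[0], parts[1], parts[2]
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # second field is not a timestamp
        sequence = name.split("_")[0]  # assumption: 15_1 and 15_2 share sequence 15
        if severity == "clear":
            if sequence in started:
                windows[sequence] = when - started.pop(sequence)
        else:
            started.setdefault(sequence, when)
    return windows

for sequence, window in learn_cycle_windows(SAMPLE).items():
    print(sequence, window)
```

    For the sample above, the window is 1 hour, 13 minutes and 20 seconds, which is the period to compare against any slow I/O reported in the database alert log.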

    By default, the BBU learn cycle runs at 2 a.m. on the 17th of every third month (January, April, July and October). The schedule can be viewed and modified at the CellCLI prompt by running:

    CellCLI> list cell attributes bbuLearnCycleTime
    2017-04-17T02:00:00-07:00

    CellCLI> alter cell bbuLearnCycleTime='2017-04-17T02:00:00-07:00';
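
    When planning batch work around the learn cycle, it can help to list the upcoming quarterly dates. The sketch below simply adds three months at a time to a bbuLearnCycleTime value; it is a planning aid based on the quarterly schedule described above, not a query against the cell, and it assumes a day of month that exists in every month (such as the 17th) and an unchanged UTC offset. The function name is hypothetical.

```python
from datetime import datetime

def next_learn_cycles(start_iso, count=4, months=3):
    """Return the next `count` learn cycle times, `months` apart, after start_iso."""
    start = datetime.fromisoformat(start_iso)
    upcoming = []
    for step in range(1, count + 1):
        total = (start.month - 1) + step * months  # zero-based month count
        year = start.year + total // 12
        month = total % 12 + 1
        # Assumption: the day of month (e.g. the 17th) is valid in every month.
        upcoming.append(start.replace(year=year, month=month))
    return upcoming

for when in next_learn_cycles("2017-04-17T02:00:00-07:00"):
    print(when.isoformat())
```

    Comparing these dates with the nightly build calendar shows in advance which nights are likely to run in Write-Through mode.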

    Exadata Disk Scrubbing

    A subtler Exadata Maintenance job is the bi-weekly Disk Scrub. This job does not appear in the CellCLI alert history. It only appears in the $CELLTRACE/alert.log.

    Disk Scrubbing is designed to periodically validate the integrity of the mirrored ASM extents and thus eliminate latent corruption. The scrubbing is supposed to run only when average I/O utilization is under 25 percent. However, it can still cause spikes in utilization and latency and adversely affect database I/O. Oracle documentation says that a 4TB high capacity hard disk can take 8-12 hours to scrub, but I have seen it run more than 24 hours. Normally, this isn’t noticeable because the scrub runs quietly in the background. However, if you have a high I/O workload, the additional 10-15 percent latency is noticeable.

    Logs and schedule

    The $CELLTRACE/alert.log on the cell nodes reports the timing and results.

    Wed Jan 11 16:00:07 2017
    Begin scrubbing CellDisk:CD_11_xxxxceladm01.
    Begin scrubbing CellDisk:CD_10_xxxxceladm01.

    Thu Jan 12 15:12:37 2017
    Finished scrubbing CellDisk:CD_10_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
    Thu Jan 12 15:42:02 2017
    Finished scrubbing CellDisk:CD_11_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
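
    Because the scrub shows up only in this log, a small script can make its duration easier to track. The following Python sketch pairs each "Begin scrubbing" line with its "Finished scrubbing" line and reports the elapsed time per cell disk. It assumes the log layout shown above, where a timestamp line precedes the messages it applies to; the function name is hypothetical.

```python
import re
from datetime import datetime

# Copy of the alert.log excerpt shown above.
SAMPLE = """\
Wed Jan 11 16:00:07 2017
Begin scrubbing CellDisk:CD_11_xxxxceladm01.
Begin scrubbing CellDisk:CD_10_xxxxceladm01.

Thu Jan 12 15:12:37 2017
Finished scrubbing CellDisk:CD_10_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
Thu Jan 12 15:42:02 2017
Finished scrubbing CellDisk:CD_11_xxxxceladm01, scrubbed blocks (1MB):3780032, found bad blocks:0
"""

BEGIN = re.compile(r"Begin scrubbing CellDisk:(\S+?)\.?$")
FINISH = re.compile(
    r"Finished scrubbing CellDisk:([^,\s]+), "
    r"scrubbed blocks \(1MB\):(\d+), found bad blocks:(\d+)"
)

def scrub_report(text):
    """Return {cell_disk: (duration, scrubbed_blocks, bad_blocks)}."""
    current = None  # most recent timestamp line seen
    began = {}
    report = {}
    for raw in text.splitlines():
        line = raw.strip()
        try:
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
            continue
        except ValueError:
            pass  # not a timestamp line
        match = BEGIN.match(line)
        if match and current:
            began[match.group(1)] = current
            continue
        match = FINISH.match(line)
        if match and current and match.group(1) in began:
            disk = match.group(1)
            duration = current - began.pop(disk)
            report[disk] = (duration, int(match.group(2)), int(match.group(3)))
    return report

for disk, (duration, blocks, bad) in sorted(scrub_report(SAMPLE).items()):
    print(disk, duration, blocks, bad)
```

    For the excerpt above, both disks took a little over 23 hours, well beyond the 8-12 hours quoted in the documentation.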

    You can connect to the cell nodes and view or alter the start time and interval at the CellCLI prompt:

    CellCLI> alter cell hardDiskScrubStartTime='2017-01-21T08:00:00-08:00';
    CellCLI> list cell attributes name,hardDiskScrubInterval
    biweekly

    ASM Disk Scrubbing

    ASM Disk Scrubbing performs a similar task to Exadata Disk Scrubbing. It searches the ASM blocks and repairs logical corruption using the mirror disks. The big difference is that ASM Disk Scrubbing is run manually at the disk group or file level and can be seen in the V$ASM_OPERATION view and the alert_+ASM.log.

    Logs and views

    The alert_+ASM.log on the database node reports the command and duration.

    Mon Feb 06 09:03:58 2017
    SQL> alter diskgroup DBFS_DG scrub power low
    Mon Feb 06 09:03:58 2017
    NOTE: Start scrubbing diskgroup DBFS_DG
    Mon Feb 06 09:03:58 2017
    SUCCESS: alter diskgroup DBFS_DG scrub power low

    Summary

    These storage maintenance tasks are not exclusive to Exadata; similar jobs are common to all storage vendors. A great DBA will be aware of the storage maintenance schedule and plan it around other high-I/O activity, such as RMAN backups or batch jobs, to keep the database running smoothly at peak performance.

     


     

    Andrew Meade has been an Oracle DBA for 15 years, working in both the financial and higher education sectors. He has presented at Oracle OpenWorld and COLLABORATE. Meade is an Oracle Certified Professional for 9i, 10g and 11g.

    Released: May 15, 2018, 7:50 am

    Independent Oracle Users Group