Frequently Asked Questions: RAID v1.0

This FAQ is maintained by Leo Langevin (llangevi@mcs.com) and can be found
at ftp://ftp.mcs.com/mcsnet.users/llangevi/VSE/text/RAID.FAQ

Thanks to the EMC2 Corporation for providing some of the data that follows.

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

1.0  Some of the terminology used in this FAQ
2.0  What is RAID, and how did it develop?
3.0  What are all of those RAID levels?
     3.1  RAID 0
     3.2  RAID 1
     3.3  RAID 2
     3.4  RAID 3
     3.5  RAID 4
     3.6  RAID 5
     3.7  RAID 6
4.0  What RAID devices are available for the mainframe?

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

1.0 Some of the terminology used in this FAQ
--------------------------------------------

* Disk Array - a collection of disks presented as one or more virtual
  disks to the host.

* Fault Tolerant - no single point of failure that would result in a loss
  of data availability.

* RAID - [R]edundant [A]rray of [I]nexpensive [D]isks

* SLED - [S]ingle [L]arge [E]xpensive [D]isk

* MTBF - [M]ean [T]ime [B]etween [F]ailures

* MTBDU - [M]ean [T]ime [B]efore [D]ata [U]navailability

* Synchronized Disks - multiple disks are used to write a single file, so
  several records may reside on one pack while other records are stored
  on another pack. This is a function of the disk controller.

* Non-Synchronized Disks - when a file is written, it goes to a single
  device only. Under VSE operating systems, a file defined with multiple
  EXTENT statements may still be using non-synchronized disks; spreading
  data across multiple disks is the function of the disk controller, not
  the operating system.

* Write Penalty - with non-synchronized packs there is usually a delay,
  because multiple concurrent requests often compete for the same
  physical device, forcing a single-threaded serving of requests. The
  write penalty is the delay that occurs when attempting to write to a
  pack that is not yet available.
  Oftentimes, OEM vendors overcome this problem by providing some sort of
  assist (such as a "fast write" feature): the data is written to cache,
  and the actual write is delayed until the device becomes available,
  without delaying the requesting program.

2.0 What is RAID, and how did it develop?
-----------------------------------------

The concept of RAID originated in a 1987 paper published by the
University of California at Berkeley, titled "A Case for Redundant Arrays
of Inexpensive Disks." The paper proposed replacing a single large disk
with multiple smaller disks because of:

   - Lower cost
   - Equivalent performance
   - Equivalent data availability

RAID was originally used in small-system environments, where performance
was lower than that of SLEDs and disk capacity was smaller. In the past
few years, however, there have been rapid advances in small-disk
technology, while SLED technology advancement has slowed and, in some
areas, stopped. For example, some disk drives have/had the following
measurements:

               Capacity   Seek Time   Latency    MTBF Hours   Cost/MB
   -------------------------------------------------------------------
   SLED  1986   1.89 GB     16 ms      8.3 ms        25,000    $20-23
   Small 1986    100 MB     30 ms     20.0 ms           400     $6-10
   SLED  1994   2.83 GB     15 ms      7.1 ms    3,000,000+     $4-10
   Small 1994      9 GB     12 ms      5.6 ms      500,000+     $4-10

Future directions are new serial I/O architectures, such as IBM's Serial
Storage Architecture (SSA) and Seagate's Fibre Channel. In late 1995 to
early 1996, expect to see the following fault-tolerant, high-performance
offerings:

   - IBM SSA at 80 MB/sec.
   - Seagate Fibre Channel at 100 MB/sec.
   - RAID functions built at the HDA level
   - EMC Fibre Channel/SSA RAID at the HDA level (available today)

As a side note, MTBF is measured differently for SLEDs and for small
disks, and the figures may differ from reality: small-disk numbers come
from OEM vendors' recommended replacement times rather than from
documented failure rates.

3.0 What are all of those RAID levels?
--------------------------------------

There are different levels of RAID, usually referred to as RAID 0,
RAID 1, RAID 2, and so forth. Each level of RAID was introduced to let
many small disks and actuators provide the capacity and performance that
SLED users were accustomed to enjoying. With lots of small disks and lots
of actuators, performance improved. The problem, however, was that the
MTBF of an array was very low: the more disks in an array, the more
likely a disk would fail, causing an array failure. Finally, a solution
to this problem was found - by using check/parity disks, an area could be
discovered to be in error while it was being written, and a different
area could be used in its place, providing a greater level of reliability
and a higher MTBF.

3.1 RAID 0
----------

DEFINITION: This level is also known as disk striping. Multiple disks are
used to improve performance, but there is no logic to protect or recover
data. Synchronized disks are used.

BENEFITS: There is a high degree of performance for large blocks of data
I/O, since the load is spread across the actuators.

PROBLEMS: There is a low degree of performance for transaction
processing, because too many requests go to the same actuator to process
small amounts of data.

3.2 RAID 1
----------

DEFINITION: This is known as mirroring: data is written to two different
disks at the same time, and can be read from either disk, based on device
availability. Non-synchronized disks are used.

BENEFITS: This provides the highest degree of availability. If a drive
goes bad, its mirrored copy is still available. It also provides the
highest degree of performance, since if one actuator is in use, the other
can be used.

PROBLEMS: The cost - twice as many disk drives must be purchased.
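The RAID 0 striping described in 3.1 can be sketched as a simple address
mapping. This is a minimal, hypothetical illustration (the function name
and parameters are inventions for this sketch, not any vendor's
implementation):

```python
def stripe_map(logical_block, num_disks, stripe_size):
    """Map a logical block number to (disk, offset) under RAID 0 striping."""
    stripe = logical_block // stripe_size     # which stripe the block falls in
    disk = stripe % num_disks                 # stripes rotate across the disks
    offset = (stripe // num_disks) * stripe_size + logical_block % stripe_size
    return disk, offset

# With 4 disks and a 4-block stripe, blocks 0-3 land on disk 0,
# blocks 4-7 on disk 1, 8-11 on disk 2, 12-15 on disk 3, then
# the pattern wraps back to disk 0.
```

A large sequential transfer touches every actuator in turn, which is the
RAID 0 benefit; many small transfers that happen to map to the same disk
queue up behind one actuator, which is the RAID 0 problem.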
If you have 100 gigabytes of DASD, you would need 200 gigabytes
available, half of it owned by the system and the other half used by the
application programs.

3.3 RAID 2
----------

DEFINITION: This level uses multiple disks as did RAID 0, but a small
percentage of those disks are set aside as "check disks," where a special
Hamming error-correction code is used to set bits. Synchronized disks are
used.

BENEFITS: Data can be read at high transfer rates because it is spread
across multiple actuators. Reads would not normally need to access the
check-disk data.

PROBLEMS: Because the check data resides on a small number of disks, and
because every record being written must access these check disks, a
bottleneck is introduced, and the rate of writing data can be very low.

3.4 RAID 3
----------

DEFINITION: This level uses multiple disks as did RAID 2, but only a
single parity disk is necessary to maintain data integrity. Synchronized
disks are used.

BENEFITS: Data can be read at high transfer rates because it is spread
across multiple actuators. Reads would not normally need to access the
parity disk. Fewer disks are required than with RAID 1 or RAID 2.

PROBLEMS: Because the parity data resides on a single disk, and because
every record being written must access that parity disk, there is an even
greater bottleneck than with RAID 2, and the rate of writing data will be
very low. Also, if the parity disk goes bad, all data protection is lost.

3.5 RAID 4
----------

DEFINITION: This level uses multiple disks as did RAID 2, but only a
single parity disk is necessary to maintain data integrity. Unlike
RAID 3, non-synchronized disks are used.

BENEFITS: Reads do not need to access the parity disk, and because the
disks are non-synchronized, independent requests can be serviced by
different actuators in parallel, giving a high I/O rate for reads.
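The parity scheme behind RAID 3 and RAID 4, and the write penalty it
creates, can be sketched with XOR. This is a minimal sketch assuming
byte-string "blocks"; the helper names are hypothetical:

```python
from functools import reduce

def parity(blocks):
    """Parity block = bytewise XOR of the data blocks in a stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def update_parity(old_parity, old_data, new_data):
    """Small-write update: XOR out the old data's contribution, XOR in the
    new. This read-modify-write of the parity disk is the write penalty -
    every data write also costs a read and a write of the parity."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

stripe = [b"\x0f\x00", b"\xf0\x00", b"\x00\xff"]
p = parity(stripe)                               # b"\xff\xff"
p2 = update_parity(p, stripe[0], b"\x00\x00")    # rewrite first block
assert p2 == parity([b"\x00\x00", stripe[1], stripe[2]])
```

Because every `update_parity` call must touch the one parity disk, writes
serialize behind it - which is exactly the bottleneck the text describes.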
PROBLEMS: Non-synchronized disks are slow when reading parity or data,
because of possible actuator contention. If the parity disk goes bad, all
data protection is lost. A write penalty occurs at this level of RAID:
every write must update both the data and the parity, removing the old
data's contribution to the parity and generating a new parity.

3.6 RAID 5
----------

DEFINITION: This level uses multiple disks as did RAID 4, but instead of
a single parity disk maintaining data integrity, an area of each disk is
carved out to store parity information. As with RAID 4, non-synchronized
disks are used.

BENEFITS: Since no single disk holds all of the parity, the parity-disk
bottleneck is removed, and there is a high I/O rate for writing data. The
loss of a single disk destroys only the parity stored on that disk, not
all of the parity.

PROBLEMS: Non-synchronized disks are slow when reading parity or data,
because of possible actuator contention. A write penalty occurs at this
level of RAID: every write must update both the data and the parity,
removing the old data's contribution to the parity and generating a new
parity.

3.7 RAID 6
----------

DEFINITION: This level uses multiple disks as did RAID 5, but instead of
each disk having its parity information maintained on its own pack, the
parity information is spread across the parity areas of every other pack
except the pack the data is located on. Non-synchronized disks are used.

BENEFITS: Since no single disk holds all of the parity, the parity-disk
bottleneck is removed, and there is a high I/O rate for writing data.
Data can be rebuilt if a single disk goes bad by comparing the parity
data on all of the other disks and applying "who's missing" logic. This
means that when a blank disk replaces a bad disk, the missing data can be
rebuilt.
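The "who's missing" rebuild described above boils down to XOR-ing the
surviving members of a stripe: XOR of all remaining blocks (data plus
parity) reproduces the lost block. A minimal sketch, with hypothetical
names and byte-string blocks:

```python
from functools import reduce

def rebuild(surviving_blocks):
    """Reconstruct the failed disk's block in a stripe: the bytewise XOR
    of all surviving blocks (data and parity) yields the missing one."""
    return bytes(reduce(lambda a, b: a ^ b, col)
                 for col in zip(*surviving_blocks))

data = [b"\x01", b"\x02", b"\x04"]
par = bytes(a ^ b ^ c for a, b, c in zip(*data))   # parity = b"\x07"

# The disk holding b"\x02" fails; XOR the survivors to get it back:
assert rebuild([data[0], data[2], par]) == b"\x02"
```

This works for the loss of any single disk, which is why a replacement
blank pack can be repopulated entirely from the remaining members.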
PROBLEMS: Non-synchronized disks are slow when reading data and parity
records, because of possible actuator contention. The amount of parity
information is doubled across the disks, and therefore a write penalty
occurs for this as well: writing new data means generating or rewriting
each pair of parity records and removing the old data's contribution to
each parity record.

4.0 What RAID devices are available for the mainframe?
------------------------------------------------------

(The following was contributed by the EMC2 Corporation.)

IBM - RAMAC (all RAID implementations). Maximum of 180 GB total DASD.
      3.5" HDAs at 2 GB are used. 3+1 RAID 5+ arrays. Drawer-cache
      assisted.

STK - ICEBERG. 5.25" HDAs at 1.6 GB are used. 13+2 RAID 6+ arrays.
      Log-file-structure assisted.

EMC - SYMMETRIX. 3.5" HDAs. Uses EMC's exclusive MOSAIC-2000
      architecture to implement Fibre Channel/SSA RAID functionality at
      the HDA level using technology that is available today.

============================================================
Leo Langevin
mailto:llangevi@mcs.com