
Providing help and advice with TeleForm and your data capture system

 

Server Storage



When configuring the main hard drives for your data capture server there are many different options to consider.

The first and most crucial consideration is fault tolerance. As it is impractical in most cases to run full backups of TeleForm servers (TeleForm backup FAQ), it is essential to plan for the most common causes of data loss.

Modern hard drives rotate at anything from 5,400RPM for desktop ATA (also known as IDE) hard drives up to 15,000RPM for server SCSI hard drives. This means that in every hard drive there are several metal platters spinning at high speed 24 hours a day, 7 days a week, for years on end with little interruption. To put this into perspective, imagine putting your foot to the floor on the accelerator pedal of any modern car in neutral and holding it there for three years. This will cause the engine to rotate at approximately 6,000RPM, which is similar to many low end hard drives. The question is not so much “if” it will break, more “when” it will break. Hard drives are no different: although manufacturers have engineered them to last reliably for the few years they are typically in service, they are still the one component in a server most likely to fail.

When a drive breaks it usually suffers complete failure within seconds or minutes, which means that all data stored on the drive is lost. Even if you have an up-to-date backup, it will typically take hours or even days to repair the server, re-install the operating system and restore the backup. For this reason, we consider a fault tolerant configuration of hard drives an essential part of any server.

Fault Tolerance?

It is possible to configure hard drives in a server (or even a workstation) so that if a drive fails the system is able to continue operation unhindered. This is commonly referred to as a RAID configuration.

RAID 1

The most basic type of RAID is RAID 1, in which two hard drives of equal capacity are mirrored so that each drive contains exactly the same information. In a RAID 1 configuration, if either drive fails the other one will continue to work, allowing the server to keep running. The drawback to RAID 1 is that you have to buy two drives, at double the cost, and lose 50% of the total storage capacity.

When a drive fails you simply replace it as soon as you can source another drive. With most modern hot swap systems you don’t even need to re-start the server. The server will then re-create the mirror on the new drive and the system is fault tolerant once more.

RAID 5

The much more complex RAID 5 requires three or more drives. It works by splitting the data across all but one of the drives and then writing parity information to the last drive. If for example you have three drives, it will write the data to drives one and two and then write the parity information to drive three. The parity information is written to a different drive each time, so the next set of data will be split across drives two and three with the parity information sent to drive one.
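The rotation described above can be sketched in a few lines of Python. This is only an illustration of the three drive example in the text: the function name and the exact rotation order are assumptions, and real RAID controllers may place the parity differently.

```python
def raid5_layout(num_drives, num_stripes):
    """Return, for each stripe, the role ('data' or 'parity') of every drive."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    layout = []
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last drive and moves
        # on by one drive for each new stripe, wrapping round to drive one.
        parity_drive = (num_drives - 1 + stripe) % num_drives
        layout.append(
            ["parity" if drive == parity_drive else "data"
             for drive in range(num_drives)]
        )
    return layout


for stripe, roles in enumerate(raid5_layout(3, 3)):
    print(f"stripe {stripe}: {roles}")
```

With three drives, the first stripe puts the parity on drive three and the second stripe puts it on drive one, matching the example in the paragraph above.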

The parity information is basically the key to calculating a missing part of data. For example with 1 + 3 = 4 it is possible to calculate any of the numbers if only one is missing. If you consider each number (or chunk of data) to be stored on one of the three drives in a RAID 5 array the server can rebuild the data if any of the three drives fails.

Three disk RAID 5

          Disk 1   Disk 2   Parity Disk   Missing Data
  Data    1        Failed   4             3

Six disk RAID 5

          Disk 1   Disk 2   Disk 3   Disk 4   Disk 5   Parity Disk   Missing Data
  Data    1        5        2        Failed   9        24            7
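The arithmetic in the two examples above can be checked with a short Python sketch. The first function follows the addition analogy used in this article. The second uses XOR, which is how RAID 5 parity is normally calculated in practice. The function names are illustrative only.

```python
from functools import reduce
from operator import xor


def rebuild_by_sum(surviving_data, parity):
    """Addition analogy: the missing value is the parity minus the rest."""
    return parity - sum(surviving_data)


def xor_parity(chunks):
    """XOR all chunks together to produce the parity value."""
    return reduce(xor, chunks, 0)


def rebuild_by_xor(surviving_chunks, parity):
    """XOR the survivors with the parity to recover the missing chunk."""
    return xor_parity(surviving_chunks) ^ parity


# The three disk and six disk examples, using the addition analogy.
print(rebuild_by_sum([1], 4))
print(rebuild_by_sum([1, 5, 2, 9], 24))

# The same six disk data using XOR parity; the fourth disk has failed.
data = [1, 5, 2, 7, 9]
parity = xor_parity(data)
survivors = data[:3] + data[4:]
print(rebuild_by_xor(survivors, parity))
```

In both schemes the rebuild only works while a single disk is missing. With two values unknown there is no longer enough information to recover either of them.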


One main advantage of RAID 5 is that you lose a much smaller amount of space to the fault tolerance. RAID 1 loses 50%, whereas RAID 5 loses at most 33% (with three drives) and less as more drives are added. You can also create RAID 5 arrays of much larger capacity than would be possible with one drive, simply by adding more drives to the array.

The drawback to RAID 5 is that there is an overhead for the server to calculate the parity information every time data is written to the drives. Due to the way the data is split, there is also a performance loss, especially for large files, when reading the data back from the array.

RAID 1 vs RAID 5

A few years ago the largest drives available were only 36GB in size. RAID 5 was ideal for creating large volumes, as you could use four 36GB drives to create a 108GB RAID 5 array. Now that 500GB drives are commonplace, RAID 5 is much less useful. Two 500GB drives in a RAID 1 array would be cheaper, simpler to manage and faster than three 250GB drives in a RAID 5 array.
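The capacity figures quoted in this article can be reproduced with a small Python sketch. It assumes all drives in an array are the same size and ignores formatting overhead. The function names are hypothetical.

```python
def usable_capacity_gb(level, num_drives, drive_gb):
    """Usable capacity of an array of identical drives, in GB."""
    if level == 0:
        if num_drives < 2:
            raise ValueError("RAID 0 needs at least two drives")
        return num_drives * drive_gb
    if level == 1:
        if num_drives != 2:
            raise ValueError("this sketch treats RAID 1 as a two drive mirror")
        return drive_gb
    if level == 5:
        if num_drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space is used for parity.
        return (num_drives - 1) * drive_gb
    if level == 10:
        if num_drives < 4 or num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, four or more")
        # Every drive is mirrored, so half the raw space is usable.
        return (num_drives // 2) * drive_gb
    raise ValueError(f"unsupported RAID level: {level}")


def lost_fraction(level, num_drives, drive_gb):
    """Fraction of the raw capacity given up to fault tolerance."""
    raw = num_drives * drive_gb
    return 1 - usable_capacity_gb(level, num_drives, drive_gb) / raw


print(usable_capacity_gb(5, 4, 36))    # four 36GB drives in RAID 5
print(usable_capacity_gb(1, 2, 500))   # two 500GB drives in RAID 1
print(usable_capacity_gb(5, 3, 250))   # three 250GB drives in RAID 5
print(f"{lost_fraction(5, 3, 250):.0%} lost with three drive RAID 5")
```

Two 500GB drives in RAID 1 and three 250GB drives in RAID 5 both give 500GB of usable space, which is why the comparison above comes down to cost, simplicity and speed.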

Pushing the Performance Boundaries with RAID 0

In order to increase performance, it is possible to use RAID 0 which requires two disks and splits the data across them. Half the data is written to disk one, whilst simultaneously the second half of the data is being written to disk two. This is also commonly referred to as striping. The theoretical speed is double that of the component drives and although the increase in speed might not reach the theoretical maximum it is still a significant performance boost.
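As a rough illustration of striping, the Python sketch below splits a block of data into fixed size chunks and deals them out to the disks in turn, then reads them back in the same order. The chunk size of four bytes and the function names are assumptions chosen to keep the example small. Real controllers use much larger chunks and work at the hardware level.

```python
def stripe(data, num_disks=2, chunk_size=4):
    """Deal fixed size chunks of data out to the disks in turn."""
    disks = [bytearray() for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        disk = (start // chunk_size) % num_disks
        disks[disk].extend(data[start:start + chunk_size])
    return [bytes(d) for d in disks]


def unstripe(disks, chunk_size=4):
    """Reassemble the original data by reading chunks in the same order."""
    out = bytearray()
    chunk_number = 0
    while True:
        disk = chunk_number % len(disks)
        offset = (chunk_number // len(disks)) * chunk_size
        chunk = disks[disk][offset:offset + chunk_size]
        if not chunk:
            break
        out.extend(chunk)
        chunk_number += 1
    return bytes(out)


disks = stripe(b"ABCDEFGHIJ")
print(disks)
print(unstripe(disks))
```

Each disk holds only part of the data, so neither disk on its own is enough to rebuild the original.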

To verify this we ran some tests using 10,000RPM Western Digital Raptor SATA drives and 8GB of data made up of varying file sizes to simulate real world usage:

 

RAID 1 MB/Sec   RAID 0 MB/Sec   Speed Increase
40.8            64.3            58%
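The speed increase is simply the RAID 0 throughput relative to the RAID 1 throughput. A one line Python check of the measured figures (the function name is illustrative):

```python
def speed_increase_percent(baseline_mb_s, new_mb_s):
    """Percentage increase of new_mb_s over baseline_mb_s."""
    return (new_mb_s / baseline_mb_s - 1) * 100


print(f"{speed_increase_percent(40.8, 64.3):.0f}%")
```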


Unlike RAID 1 and RAID 5, RAID 0 doesn’t waste any space. Its huge drawback is that it provides no fault tolerance and it roughly doubles the chance of losing your data, because the failure of either drive destroys the whole volume. It is spreading your eggs across two baskets, but if either is dropped you lose the whole lot! Whilst this might be acceptable for a high performance workstation, it is not acceptable in a server.
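To see why the risk roughly doubles, the Python sketch below compares the chance of losing a volume on a single drive, a two drive RAID 0 stripe and a two drive RAID 1 mirror. The 3% failure chance is a made-up figure for illustration only, not a measured value, and the sketch assumes drives fail independently and ignores the time taken to replace a failed drive.

```python
def raid0_loss_probability(p_drive, num_drives=2):
    """RAID 0 is lost if any one of its drives fails."""
    return 1 - (1 - p_drive) ** num_drives


def raid1_loss_probability(p_drive):
    """A two drive mirror is lost only if both drives fail."""
    return p_drive ** 2


p = 0.03  # hypothetical chance of one drive failing in a given period
print(f"single drive: {p:.2%}")
print(f"RAID 0:       {raid0_loss_probability(p):.2%}")
print(f"RAID 1:       {raid1_loss_probability(p):.2%}")
```

For small failure chances the RAID 0 figure is very close to twice that of a single drive, while the mirror is far lower than either.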

RAID 10 to the Rescue

By coupling RAID 1 and RAID 0 you can create a volume that is fault tolerant AND faster than its component drives. This is called RAID 10 (or RAID 1+0). To do this you need four or more drives and, as with RAID 1, you lose 50% of your storage capacity. You start by creating two RAID 1 mirrored volumes and then use RAID 0 to stripe across them. Of the configurations described here, this gives the best combination of fault tolerance and speed.
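The fault tolerance of a RAID 10 volume depends on which drives fail. The Python sketch below assumes drives are mirrored in adjacent pairs (0 and 1, 2 and 3, and so on), which is an illustrative numbering rather than anything a particular controller guarantees. The volume survives as long as every mirrored pair still has one working drive.

```python
from itertools import combinations


def raid10_survives(failed_drives, num_drives=4):
    """True if every mirrored pair still has at least one working drive."""
    if num_drives < 4 or num_drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, four or more")
    failed = set(failed_drives)
    # Assumed pairing: drives 0+1 form one mirror, 2+3 the next, and so on.
    pairs = [(d, d + 1) for d in range(0, num_drives, 2)]
    return all(not (a in failed and b in failed) for a, b in pairs)


# Any single drive failure is survivable.
print(all(raid10_survives([d]) for d in range(4)))

# Check every possible two drive failure in a four drive array.
for failed in combinations(range(4), 2):
    print(failed, "survives" if raid10_survives(failed) else "volume lost")
```

With four drives, four of the six possible two drive failures are survivable. The volume is lost only when both drives of the same mirror fail, which is why a failed drive should still be replaced promptly.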

Glossary


ATA - Advanced Technology Attachment. The most common type of hard drive interface, commonly used today in workstations and entry level servers.


IDE -  Integrated Drive Electronics. Another name for the ATA interface.

SCSI - Small Computer System Interface. A very high performance interface usually used in high end servers.

RAID - Redundant Array of Inexpensive Disks. This provides a way of grouping two or more disks in varying configurations, most of which provide fault tolerance should any one of the drives fail.

Hot Swap -  The ability to remove and replace a component while the server is still switched on.

 

SATA - Serial ATA.  A new, higher performance, hot swappable drive interface now common on workstations and low end servers.

Note to editors: Please feel free to reproduce any of these documents but we do request that you credit ePartner Consulting Ltd and put a link back to www.epc.co.uk on any web site that they are used on. Thank you.