HBA/RAID controller


HBA, RAID and SAS Expander

  • An HBA is simply a controller that provides additional SAS/SATA ports; how the drives are managed is left entirely to the OS.
  • A RAID controller is like an HBA, but has onboard functionality to create an array, which is then presented to the OS as a single drive.
  • SAS expanders can be used to increase the number of drives your HBA or RAID controller card can address. SAS expanders work with both RAID controllers and SAS HBAs.

Native and Maximum supported disks

RAID controller specifications distinguish between two numbers: Native Supported Disks and Maximum Supported Disks.

  • Native supported disk # is the number of disks that can be connected directly to the RAID controller using breakout cables.
  • Maximum supported disk # is the number of disks that can be attached when using port expanders such as the Intel RES2SV240.

Controller Interface[1]

The RAID controller has an interface that connects to the storage drives and an interface that connects to the CPU. The drive interfaces are Serial Attached SCSI (SAS), Serial Advanced Technology Attachment (SATA), and Non-Volatile Memory Express (NVMe). NVMe is a communication protocol designed for flash storage that uses Peripheral Component Interconnect Express (PCIe) for connectivity.

Storage drive side

  Interface    Communication band   Theoretical throughput   Effective throughput (90%)
  SATA 6G      6 Gbps               572 MiB/s                515 MiB/s
  SAS 12G      12 Gbps              1,144 MiB/s              1,030 MiB/s
  NVMe Gen3    8 Gbps x4            3,756 MiB/s              3,380 MiB/s
  NVMe Gen4    16 Gbps x4           7,512 MiB/s              6,760 MiB/s

CPU side

  Interface       Number of lanes   Communication band   Theoretical throughput   Effective throughput (90%)
  DMI Gen3 x4     x4                8 Gbps               3,756 MiB/s              3,380 MiB/s
  PCIe Gen3 x8    x8                8 Gbps               7,512 MiB/s              6,760 MiB/s
  PCIe Gen3 x16   x16               8 Gbps               15,024 MiB/s             13,520 MiB/s
  PCIe Gen4 x8    x8                16 Gbps              15,024 MiB/s             13,520 MiB/s
  PCIe Gen4 x16   x16               16 Gbps              30,048 MiB/s             27,040 MiB/s

*The theoretical throughput is calculated by subtracting the coding overhead from the raw link rate: 1.54% for the 128b/130b coding used by PCIe Gen3/Gen4 links (SATA 6G and SAS 12G use 8b/10b coding, a 20% overhead). The actual achievable throughput can be estimated by multiplying this value by 0.90.
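
As a sanity check, the PCIe figures above can be reproduced from the link rate and coding overhead. A minimal sketch for the NVMe Gen3 / PCIe Gen3 x4 row (the 8 Gbps per-lane rate, 128b/130b coding, and 0.90 efficiency factor all come from the table and footnote above):

# Reproduce the PCIe Gen3 x4 row of the table
$ awk 'BEGIN {
    bits = 8e9 * 4 * (128/130)              # payload bit rate after 128b/130b coding
    theoretical = bits / 8 / (1024*1024)    # bytes/s -> MiB/s
    printf "theoretical: %.0f MiB/s, effective: %.0f MiB/s\n", theoretical, 0.90 * theoretical
}'
# -> theoretical: 3756 MiB/s, effective: 3380 MiB/s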

Expected Performance by RAID type

  RAID level   Type              Random read*   Random write*   Sequential read   Sequential write
  RAID0        Stripe            1              1               N x SR            N x SW
  RAID1        Mirror            1              2               N x SR            N x SW / 2
  RAID10       Mirror + Stripe   1              2               N x SR            N x SW / 2
  RAID5        Single Parity     1              4               N x SR            (N-1) x SW
  RAID6        Double Parity     1              6               N x SR            (N-2) x SW

*Back-end I/Os issued per host I/O (the RAID write penalty).

Features[2]:

  • RAID0 - Ideal for applications that require high bandwidth but do not require fault tolerance.
  • RAID1 - Ideal for any application that requires fault tolerance and minimal capacity.
  • RAID10 - Ideal for fault tolerance.
  • RAID5 - Ideal for fault tolerance with limited overhead.
  • RAID50 - Can sustain one drive failure per RAID 5 drive group and still maintain data integrity.
  • RAID6 - Ideal for fault tolerance with limited overhead. Not well suited to write-heavy tasks, since two sets of parity data are updated for each write operation.
  • RAID60 - Can sustain two drive failures per RAID 6 drive group and still maintain data integrity.

Where N: number of drives, SR: single-drive read performance, SW: single-drive write performance.
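
To make the scaling rules concrete, the sketch below plugs hypothetical per-drive figures (N=8 drives, SR=250 MiB/s, SW=230 MiB/s, chosen purely for illustration) into the sequential-throughput formulas from the table:

$ awk -v N=8 -v SR=250 -v SW=230 'BEGIN {
    printf "RAID0:  read %4d MiB/s, write %4d MiB/s\n", N*SR, N*SW      # stripe: all drives contribute
    printf "RAID10: read %4d MiB/s, write %4d MiB/s\n", N*SR, N*SW/2    # mirroring halves write throughput
    printf "RAID5:  read %4d MiB/s, write %4d MiB/s\n", N*SR, (N-1)*SW  # one drive's worth of parity
    printf "RAID6:  read %4d MiB/s, write %4d MiB/s\n", N*SR, (N-2)*SW  # two drives' worth of parity
}'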

RAID Configuration Strategies

You cannot configure a virtual drive that optimizes all three factors (fault tolerance, performance, and capacity), but it is easy to choose a virtual drive configuration that maximizes one factor at the expense of the others.

  Purpose: Virtual drive availability (fault tolerance)
    Strategy: RAID 1 (mirroring) provides excellent fault tolerance, but requires a redundant drive.
    Spare drives: When a drive fails, a hot spare automatically takes its place and the data on the failed drive is rebuilt onto the hot spare. Hot spares can be used with RAID levels 1, 5, 6, 10, 50, and 60.

  Purpose: Virtual drive performance
    Strategies:
    • RAID 0 (striping) offers excellent performance.
    • RAID 00 (striping in a spanned drive group) offers excellent performance.
    • RAID 5 provides high data throughput, especially for large files.
    • RAID 6 works best with data that requires high reliability, high request rates, and high data transfer. It provides high data throughput, data redundancy, and very good performance. However, RAID 6 is not well suited to tasks requiring a lot of writes.

  Purpose: Virtual drive capacity
    Strategy: RAID 0 provides maximum storage capacity for a given set of drives.
RAID table[3]

Performance Index - Throughput / Transaction

  • Throughput [MiB/s] - amount of data transferred per second (in mebibytes)
  • Transaction [IO/s] - number of I/O operations processed per second
  • Latency [ms] - average response time (in milliseconds)

"Data Throughput” value used to load profiles with sequential access pattern, while “Transaction Rate”

value used to load profiles with random access pattern. Throughput and transaction are in direct

proportion to each other and can be calculated mutually using the follow

  • Data throughput [KiB/s] = Transaction rate [IO/s] x Block size
  • Transaction rate [IO/s] = Data throughput [KiB/s] / Block size [KiB]
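
For example, a minimal sketch of the conversion (the 20,000 IO/s rate and 4 KiB block size are assumed values, not measurements):

$ awk -v iops=20000 -v block_kib=4 'BEGIN {
    kib_s = iops * block_kib    # data throughput in KiB/s
    printf "%d IO/s x %d KiB = %d KiB/s (%.1f MiB/s)\n", iops, block_kib, kib_s, kib_s / 1024
}'
# -> 20000 IO/s x 4 KiB = 80000 KiB/s (78.1 MiB/s)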

Controller information on Linux

# Find controller information
$ sudo lspci -vv | grep -i raid

# Check whether a RAID has been configured and inspect the output
$ cat /proc/scsi/scsi
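
Two further checks can help map drives to controllers (a sketch; lsscsi is a separate package that may need to be installed first):

# List SCSI devices together with the host they hang off
$ lsscsi

# One hostN entry per SCSI/SAS host (HBA or RAID controller port)
$ ls /sys/class/scsi_host/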

BBU vs CacheVault vs CacheCade[4]

Battery Backup Unit (BBU)

The BBU’s job is to preserve the cached data that hasn’t been synced to disk yet, usually for up to 72 hours without power. When the machine powers back up, the controller writes the preserved cache contents to the disk.

If your “write cache” option is set to “write through” or “off”, then you should be fine without a BBU on your RAID card. The downside of having the write cache turned off is that RAID performance will be sub-optimal. Many RAID cards will enable the write cache only if a BBU is installed. This is very important for users who need to protect their database from certain types of corruption or who require high data integrity.

CacheVault Cache Protection

CacheVault does exactly what the BBU does, but it is a newer technology that enables additional features. What makes CacheVault superior is its ability to move the cached data from DRAM to NAND flash, where it can be stored for up to 3 years. When the server turns back on, the data is moved from NAND back to DRAM and then written to the disks. Operations then continue as if nothing ever happened.[5]
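
On Broadcom/LSI controllers, the state of either protection module can usually be queried with storcli (a sketch; the controller index /c0 is an assumption, and subcommand availability depends on the storcli version and hardware):

# Battery Backup Unit status
$ sudo storcli /c0/bbu show all

# CacheVault status
$ sudo storcli /c0/cv show all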

CacheCade SSD Cache

While the BBU and CacheVault are both physical add-on modules, CacheCade is RAID controller software that enables an SSD read/write cache for the array. It allows you to augment existing HDD arrays with an SSD-based flash cache.

CacheCade creates a front-side flash cache for the “hottest” data in the array. Reading from and writing to the SSD cache is much faster than reading from and writing to the HDD array. CacheCade content remains intact across reboots.

References