HBA/RAID controller
HBA, RAID and SAS Expander
- An HBA is simply a controller that provides additional SAS/SATA ports; how the attached drives are managed is left to the OS.
- A RAID controller is like an HBA, but it has onboard functionality to create an array, which is then presented to the OS as a single drive (see the quick check below).
- SAS expanders can be used to increase the number of drives that can be attached to an HBA or RAID controller card; they work with both RAID controllers and SAS HBAs.
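A quick way to see the difference from the OS side is to list the block devices: behind a plain HBA each physical disk shows up individually, while a RAID controller exposes one virtual drive per array. This is only a minimal sketch; it assumes a Linux host with util-linux (lsblk) and the lsscsi package installed.
# List block devices with their SCSI address, transport, size and model
$ lsblk -o NAME,HCTL,TRAN,SIZE,MODEL
# List SCSI devices/hosts as the kernel sees them
$ lsscsi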
Native and Maximum supported disks
RAID controllers distinguish between natively supported disks and maximum supported disks.
- The natively supported disk count is the number of disks that can be connected directly to the RAID controller using breakout cables.
- The maximum supported disk count is the number of disks that can be reached when SAS port expanders such as the Intel RES2SV240 are used.
Controller Interface[1]
The RAID controller has an interface that connects to the storage drives and an interface that connects to the CPU. The drive interfaces are Serial Attached SCSI (SAS), Serial Advanced Technology Attachment (SATA), and Non-Volatile Memory Express (NVMe). NVMe is a communication protocol designed for flash storage that uses Peripheral Component Interconnect Express (PCIe) for connectivity.
Storage drive side interface | Communication band | Theoretical throughput | Effective throughput (90%) |
---|---|---|---|
SATA 6G | 6 Gbps | 572 MiB/s | 515 MiB/s |
SAS 12G | 12 Gbps | 1,144 MiB/s | 1,030 MiB/s |
NVMe Gen3 | 8 Gbps x4 | 3,756 MiB/s | 3,380 MiB/s |
NVMe Gen4 | 16 Gbps x4 | 7,512 MiB/s | 6,760 MiB/s |
CPU side interface | Number of lanes | Communication band | Theoretical throughput | Effective throughput (90%) |
---|---|---|---|---|
DMI Gen3 | x4 | 8 Gbps x4 | 3,756 MiB/s | 3,380 MiB/s |
PCIe Gen3 | x8 | 8 Gbps x8 | 7,512 MiB/s | 6,760 MiB/s |
PCIe Gen3 | x16 | 8 Gbps x16 | 15,024 MiB/s | 13,520 MiB/s |
PCIe Gen4 | x8 | 16 Gbps x8 | 15,024 MiB/s | 13,520 MiB/s |
PCIe Gen4 | x16 | 16 Gbps x16 | 30,048 MiB/s | 27,040 MiB/s |
*The theoretical throughput for the PCIe-based interfaces (NVMe, DMI, PCIe) is obtained by subtracting the 1.54% redundancy of 128b/130b coding from the raw link rate; SATA 6G and SAS 12G use 8b/10b coding, which carries a 20% overhead. The effective throughput is estimated by multiplying the theoretical value by 0.90, as in the worked example below.
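As a sanity check, the NVMe Gen3 (PCIe Gen3 x4) row can be reproduced with simple shell arithmetic. The figures below are just this arithmetic, not a measurement.
# 8 Gbps per lane, minus 128b/130b coding overhead, times 4 lanes, converted to MiB/s
$ echo "8 * 10^9 / 8 * 128 / 130 * 4 / 1048576" | bc
3756
# Effective throughput estimate at 90%
$ echo "3756 * 90 / 100" | bc
3380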
Expected Performance by RAID type
RAID level | Type | Random read (IO penalty) | Random write (IO penalty) | Sequential read transfer performance | Sequential write transfer performance | Features[2] |
---|---|---|---|---|---|---|
RAID0 | Stripe | 1 | 1 | N x SR | N x SW | Ideal for applications that require high bandwidth but do not require fault tolerance |
RAID1 | Mirror | 1 | 2 | N x SR | N x SW / 2 | Ideal for any application that requires fault tolerance and minimal capacity |
RAID10 | Mirror + Stripe | 1 | 2 | N x SR | N x SW / 2 | Ideal for fault tolerance |
RAID5 | Single parity | 1 | 4 | N x SR | (N-1) x SW | Ideal for fault tolerance with limited overhead |
RAID50 | Single parity + Stripe | | | | | Can sustain one drive failure per RAID 5 drive group and still maintain data integrity |
RAID6 | Double parity | 1 | 6 | N x SR | (N-2) x SW | Ideal for fault tolerance with limited overhead; not well suited to write-heavy workloads, since two sets of parity data must be updated for each write operation |
RAID60 | Double parity + Stripe | | | | | Can sustain two drive failures per RAID 6 drive group and still maintain data integrity |
Where N: number of drives, SR: single-drive read performance, SW: single-drive write performance
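The random-write penalty column translates directly into usable IOPS. The sketch below is a rule-of-thumb calculation with assumed numbers (8 drives at roughly 180 random-write IOPS each in a RAID 5 set), not a figure from the references.
# Usable random-write IOPS ~= (number of drives x per-drive IOPS) / write penalty
$ echo "8 * 180 / 4" | bc
360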
RAID Configuration Strategies
You cannot configure a virtual drive that optimizes all three factors (availability, performance, and capacity), but it is easy to choose a configuration that maximizes one factor at the expense of the others.
Purpose | Strategies | Spare drives |
---|---|---|
Virtual drive availability (fault tolerance) | RAID 1 (mirroring) provides excellent fault tolerance, but requires a redundant drive | When a drive fails, a hot spare automatically takes its place and the data on the failed drive is rebuilt on the hot spare. Hot spares can be used with RAID levels 1, 5, 6, 10, 50, and 60 (see the storcli example below) |
Virtual drive performance | RAID 0 (striping) provides the best performance, but offers no fault tolerance | |
Virtual drive capacity | RAID 0 provides maximum storage capacity for a given set of drives | |
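A hot spare is usually assigned through the controller's management tool. The following is only a hedged sketch assuming a Broadcom/LSI controller managed with the storcli utility; controller 0, enclosure 32 and slot 2 are placeholder values for the drive to dedicate as a global hot spare.
# Assign the drive in enclosure 32, slot 2 on controller 0 as a global hot spare
$ sudo storcli /c0/e32/s2 add hotsparedrive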
Performance Index - Throughput / Transaction
- Throughput [MiB/s] - Amount of data transferred per second (in mebibytes)
- Transaction rate [IO/s] - Number of IO operations processed per second
- Latency [ms] - Average response time (in milliseconds)
"Data Throughput” value used to load profiles with sequential access pattern, while “Transaction Rate”
value used to load profiles with random access pattern. Throughput and transaction are in direct
proportion to each other and can be calculated mutually using the follow
- Data throughput [KiB/s] = Transaction rate [IO/s] x Block size
- Transaction rate [IO/s] = Data throughput [KiB/s] / Block size [KiB]
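For example, with an assumed sequential workload of 400 MiB/s at a 128 KiB block size, the corresponding transaction rate works out as follows (the numbers are illustrative only):
# 400 MiB/s = 409,600 KiB/s divided by a 128 KiB block size
$ echo "400 * 1024 / 128" | bc
3200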
Controller information on Linux
# Find controller information
$ sudo lspci -vv | grep -i raid
# Check whether RAID has been configured and inspect the attached devices
$ cat /proc/scsi/scsi
BBU vs CacheVault vs CacheCade[4]
Battery Backup Unit (BBU)
The BBU's job is to preserve cached data that has not yet been synced to disk, usually for up to 72 hours without power. When the machine powers back up, the controller flushes the preserved cache contents to disk.
If your "write cache" option is set to "write through" or "off", then you should be fine without a BBU on your RAID card. The downside of having the write cache turned off is that RAID performance will be sub-optimal. Many RAID cards will only enable the write-back cache when a BBU is installed. This protection is very important for users who need to guard their databases against certain types of corruption or who require high data integrity.
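On Linux, the BBU state can usually be queried through the controller's CLI. This is a hedged example assuming a Broadcom/LSI MegaRAID controller with storcli (or the legacy MegaCli, whose binary name varies by distribution) installed; controller 0 is a placeholder.
# Battery Backup Unit status via storcli
$ sudo storcli /c0/bbu show all
# Equivalent query with the legacy MegaCli tool
$ sudo MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL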
CacheVault Cache Protection
CacheVault (CV) does the same job as a BBU, but it is a newer technology that enables additional features. What makes CacheVault superior is that the cached data is moved from DRAM to NAND flash, where it can be retained for up to 3 years. When the server turns back on, the data is moved from NAND back to DRAM and then written to the disks, and operations continue as if nothing ever happened.[5]
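If the controller carries a CacheVault module instead of a BBU, its status can be checked in a similar way. Again, this is only a sketch assuming storcli on a Broadcom/LSI controller, with controller 0 as a placeholder.
# CacheVault module status via storcli
$ sudo storcli /c0/cv show all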
CacheCade SSD Cache
While BBU and CacheVault are both physical module add-ons, CacheCade is a RAID controller software feature that enables an SSD read/write cache for the array. It lets you accelerate existing HDD arrays with an SSD-based flash cache.
CacheCade creates a front-side flash cache for the "hottest" data. Reads and writes hit the SSD, which is much more efficient than reading from or writing to the HDD array. CacheCade content remains intact across reboots.
References
- ↑ https://sp.ts.fujitsu.com/dmsp/Publications/public/wp-raid-controller-performance-2021-ww-en.pdf
- ↑ https://www.cisco.com/c/dam/en/us/td/docs/unified_computing/ucs/3rd-party/lsi/mrsas/userguide/LSI_MR_SAS_SW_UG.pdf
- ↑ https://www.pitsdatarecovery.net/raid-redundancy-over-performance/
- ↑ https://www.hostdime.com/blog/bbu-vs-cachevault-vs-cachecade/
- ↑ https://www.youtube.com/watch?v=mMFcLQWPX5g&t=2s