SSD

From HPCWIKI


Solid-state drives (SSDs) come with a variety of connectors, connection protocols, underlying technologies and form factors. The primary types are 2.5" SATA, M.2 (SATA and NVMe), PCIe add-in-card NVMe, and U.2 (formerly SFF-8639) or U.3 SSDs, each with distinct advantages and disadvantages.

Sata sas nvme u.2.png
Type | Connector | Protocol | Technology | Form Factor | Etc. | Connector Bandwidth
M.2 SATA SSD | M.2 | SATA | SATA | M.2 | 22 or 30 mm wide; 2280, 1630, 3030 | 0.6 GB/s
M.2 NVMe SSD | M.2 | PCIe | NVMe | M.2 | - | 8 GB/s
2.5" SATA SSD | SATA | SATA | SATA | 2.5" | - | 0.6 GB/s
2.5" U.2 SSD | U.2 (SFF-8639) | PCIe/SAS/SATA | NVMe | 2.5" | SFF-8639 | 8 GB/s
PCIe Add-in-Card (AIC) SSD | PCIe | PCIe | NVMe | PCIe AIC (Add-in Card) | - | 8 GB/s
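
The bandwidth figures in the table can be roughly reproduced from link rates and line-encoding overhead. The sketch below assumes the 0.6 GB/s figure refers to SATA III (6 Gb/s, 8b/10b encoding) and that the 8 GB/s figure refers to a PCIe 4.0 x4 link (16 GT/s per lane, 128b/130b encoding). The table itself does not state the PCIe generation or lane count, so those are assumptions, and the function names are illustrative.

```python
def sata3_bandwidth_gb_s() -> float:
    """Usable SATA III bandwidth in GB/s (assumes 6 Gb/s line rate, 8b/10b encoding)."""
    line_rate_gbit = 6.0
    # 8b/10b encoding carries 8 data bits in every 10 line bits.
    data_gbit = line_rate_gbit * 8 / 10
    return data_gbit / 8  # bits -> bytes


def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int) -> float:
    """Usable PCIe bandwidth in GB/s for Gen3 or later (128b/130b encoding)."""
    # 128b/130b encoding carries 128 data bits in every 130 line bits.
    data_gbit = gt_per_s * lanes * 128 / 130
    return data_gbit / 8  # bits -> bytes


if __name__ == "__main__":
    print(f"SATA III:    {sata3_bandwidth_gb_s():.2f} GB/s")
    print(f"PCIe 3.0 x4: {pcie_bandwidth_gb_s(8.0, 4):.2f} GB/s")
    print(f"PCIe 4.0 x4: {pcie_bandwidth_gb_s(16.0, 4):.2f} GB/s")
```

Under these assumptions a PCIe 4.0 x4 link comes out just under 8 GB/s, and a PCIe 3.0 x4 link at about half of that. Protocol overhead above the line encoding is ignored here.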


Tri-Mode

Tri-mode[1]

According to the OCP Tri-mode presentation, SAS and NVMe usage is forecast to increase over the coming years, while SATA usage is forecast to decrease.


U.3 is a ‘Tri-mode’ standard that builds on the U.2 spec and uses the same SFF-8639 connector. It combines SAS, SATA, and NVMe support in a single controller, so U.3 requires only one backplane, one mid-plane, and one controller to support all of these drive types in the same slot.

Tri-mode controllers alone are not enough; the disk backplanes also come into play.

For proper Tri-Mode operation, the system needs to have:

• One Backplane

• One connector

• Fewer high-speed lanes to the backplane

• One Mid-plane

• Tri-mode Expander

• One HBA / RAID Controller

Tri-Mode controllers should be PCIe 4.0, which means that the slots and disks in the server should be 24G-capable.

On a PCIe 3.0 server, you can still use all disk types at the same time with tri-mode controllers and a U.2 and/or U.3 disk backplane, but performance will be limited to PCIe 3.0 and U.2 speeds.

Storage Today vs Tri-mode

Tri-Mode technology brings a wealth of options and flexibility by allowing SAS devices, Serial ATA (SATA) II and SATA III devices, and PCIe (NVMe) devices to be used within the same storage infrastructure.

A Tri-Mode controller such as the Broadcom 9600 Series provides[2]:

  • SAS Serial SCSI Protocol (SSP), which enables communication with other SAS devices
  •  SATA III, which enables communication with other SATA II and SATA III devices
  •  Serial Management Protocol (SMP), which communicates the topology management information directly with an attached SAS expander device
  •  Serial Tunneling Protocol (STP), which enables communication with a SATA III device through an attached expander
  • NVMe, which accesses storage media that is attached by a PCIe bus. Both NVMe SGL and PRP drives are supported. The controller firmware does not use the SGL capabilities of the drives. I/Os are issued based on the PRP format.


Storage Today[3]
Tri-mode storage[4]


NVMe Specification 2.0 Key Features[5]

NVMe roadmap.png

The NVMe 2.0 family of specifications was released on June 3, 2021.[6]


NVMe 2.0 is a significant improvement over NVMe 1.4.

NVMe technology is the leading interface for SSDs, with overall worldwide enterprise SSD capacity expected to grow at a 43% compound annual growth rate into 2024.


Zoned Namespaces (ZNS)

This feature provides an interface that allows the NVMe SSD and the host to collaborate on data placement. Data can be aligned to the physical media of the SSD, improving overall performance and increasing the capacity that can be exposed to the host. In addition, ZNS significantly reduces write amplification, which can extend the lifespan of NVMe SSDs.
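
As a rough illustration of what "collaborating on data placement" means, the toy model below captures the core ZNS rule: a zone is written sequentially at its write pointer and must be reset before it can be rewritten. This is a minimal in-memory sketch, not a driver. The `Zone` class and its method names are hypothetical and do not correspond to any real NVMe API.

```python
class Zone:
    """Toy model of one ZNS zone: sequential writes only, reset to reuse."""

    def __init__(self, capacity_blocks: int) -> None:
        self.capacity = capacity_blocks
        self.write_pointer = 0  # next block offset that may be written
        self.blocks = [None] * capacity_blocks

    def append(self, data: bytes) -> int:
        """Write at the current write pointer and return the offset used."""
        if self.write_pointer >= self.capacity:
            raise RuntimeError("zone is full; reset it before writing again")
        offset = self.write_pointer
        self.blocks[offset] = data
        self.write_pointer += 1
        return offset

    def write_at(self, offset: int, data: bytes) -> int:
        """Reject any write that does not land exactly on the write pointer."""
        if offset != self.write_pointer:
            raise ValueError("writes within a zone must be sequential")
        return self.append(data)

    def reset(self) -> None:
        """Discard the zone's contents and rewind the write pointer."""
        self.blocks = [None] * self.capacity
        self.write_pointer = 0


if __name__ == "__main__":
    zone = Zone(capacity_blocks=4)
    print(zone.append(b"a"), zone.append(b"b"))  # offsets handed out in order
    zone.reset()
    print(zone.write_pointer)
```

Because the host only ever appends and resets whole zones, the drive has far less reason to relocate data internally, which is where the reduction in write amplification comes from.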


NVMe-KV

NVMe-KV allows access to the data in an NVMe SSD namespace using a key rather than a logical block address. A KV SSD lets users access key-value data without the costly and time-consuming overhead of additional translation tables between keys and logical blocks [4].

NVMe Endurance Group Management

With this feature, the NVM subsystem has more flexibility to isolate the I/O performance effects and wear-leveling operations of different users on shared drives or arrays.

SSD Wear Out

Solid-state drives today are almost universally comprised of NAND flash, which wears out with use. Each flash memory cell can only be written so many times before it becomes unreliable. Generally, reads do not wear out NAND flash.

Measuring wear

On Linux, you can read the wear counters of a SATA SSD or NVMe drive with smartctl.

To check the health of an SSD
On Ubuntu, Mint, or other Debian-based distributions, install smartmontools first
# apt-get install smartmontools

The Media_Wearout_Indicator attribute is what you are looking for on drives that report it (attribute names vary by vendor). A value of 100 means the SSD has 100% of its life left; 
a lower number means less life left.
# smartctl -a /dev/sda | grep Media_Wearout_Indicator

To show all of your SSD's information
# smartctl -a /dev/sda

For (at least some) NVMe drives, you can run
# smartctl -a /dev/nvme0

You can then look for a line like:
Percentage Used:                    5%

Here, unlike Media_Wearout_Indicator, lower numbers are better, and 100% means the drive is "worn out".
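
If you want to collect this value from a script, one option is to parse the text output of smartctl. The sketch below assumes the NVMe output contains a "Percentage Used:" line in the format shown above. The helper names and the sample text are illustrative, and other drives or smartctl versions may format the report differently.

```python
import re
import subprocess

# Illustrative excerpt of "smartctl -a" output for an NVMe drive (not real data).
SAMPLE_OUTPUT = """\
Critical Warning:                   0x00
Temperature:                        35 Celsius
Percentage Used:                    5%
"""


def parse_percentage_used(smartctl_text: str):
    """Return the 'Percentage Used' value as an int, or None if it is absent."""
    match = re.search(r"^Percentage Used:\s*(\d+)%", smartctl_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_percentage_used(device: str = "/dev/nvme0"):
    """Run smartctl (usually needs root) and parse its output."""
    # smartctl uses its exit status as a bitmask, so a non-zero status
    # is not treated as an error here.
    result = subprocess.run(
        ["smartctl", "-a", device], capture_output=True, text=True
    )
    return parse_percentage_used(result.stdout)


if __name__ == "__main__":
    print(parse_percentage_used(SAMPLE_OUTPUT))
```

Parsing the sample text keeps the example runnable on a machine without smartmontools; `read_percentage_used` is only called when you have a real device to query.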

Quantifying flash endurance

Measuring wear is one thing, but how can we predict the longevity of an SSD?

Flash “endurance” is commonly measured in two ways:

  • Drive Writes Per Day (DWPD)
  • Terabytes Written (TBW)

Both approaches are based on the manufacturer’s warranty period for the drive, its so-called “lifetime”.

Drive Writes Per Day (DWPD)

Drive Writes Per Day (DWPD) measures how many times you could overwrite the drive’s entire size each day of its life. For example, suppose your drive is 200 GB and its warranty period is 5 years. If its DWPD is 1, that means you can write 200 GB (its size, one time) into it every single day for the next five years.


If you multiply that out, that’s 200 GB per day × 365 days/year × 5 years = 365 TB of cumulative writes before you may need to replace it.


If its DWPD was 10 instead of 1, that would mean you can write 10 × 200 GB = 2 TB (its size, ten times) into it every day. Correspondingly, that’s 3,650 TB = 3.65 PB of cumulative writes over 5 years.
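
The multiplication above can be written as a small helper. This is a sketch using the same conventions as the example (365-day years, 1 TB = 1000 GB); the function name is illustrative.

```python
def dwpd_to_tbw(dwpd: float, capacity_gb: float, warranty_years: float) -> float:
    """Cumulative writes in TB implied by a DWPD rating over the warranty period."""
    days = 365 * warranty_years
    total_gb = dwpd * capacity_gb * days
    return total_gb / 1000  # GB -> TB


if __name__ == "__main__":
    # The 200 GB, 5-year drive from the text at 1 DWPD and at 10 DWPD.
    print(dwpd_to_tbw(1, 200, 5))
    print(dwpd_to_tbw(10, 200, 5))
```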

Terabytes Written (TBW)

Terabytes Written (TBW) directly measures how much you can write cumulatively into the drive over its lifetime. Essentially, it just includes the multiplication we did above in the measurement itself.

For example, if your drive is rated for 365 TBW, that means you can write 365 TB into it before you may need to replace it.

If its warranty period is 5 years, that works out to 365 TB ÷ (5 years × 365 days/year) = 200 GB of writes per day. If your drive was 200 GB in size, that’s equivalent to 1 DWPD. Correspondingly, if your drive was rated for 3.65 PBW = 3,650 TBW, that works out to 2 TB of writes per day, or 10 DWPD.

As you can see, if you know the drive’s size and warranty period, you can always get from DWPD to TBW or vice-versa with some simple multiplications or divisions. The two measurements are really very similar.

What’s the difference?

The only real difference is that DWPD depends on the drive’s size whereas TBW does not.
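
To make the size dependence concrete, the sketch below converts the same TBW rating into DWPD for drives of different capacities. It uses the same conventions as the examples above (365-day years, 1 TB = 1000 GB). The function name and the 400 GB drive are illustrative assumptions.

```python
def tbw_to_dwpd(tbw: float, capacity_gb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a TBW rating over the warranty period."""
    days = 365 * warranty_years
    gb_per_day = tbw * 1000 / days  # TB -> GB, spread over the warranty period
    return gb_per_day / capacity_gb


if __name__ == "__main__":
    # The same 365 TBW rating on a 200 GB drive and on a hypothetical 400 GB drive.
    for capacity_gb in (200, 400):
        print(capacity_gb, "GB:", tbw_to_dwpd(365, capacity_gb, 5), "DWPD")
```

The same 365 TBW works out to 1 DWPD on the 200 GB drive but only 0.5 DWPD on a drive twice the size, even though the total amount of data that can be written is identical.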

EDSFF

EDSFF stands for Enterprise and Data Center Standard Form Factor (previously known as the Enterprise and Data Center SSD Form Factor). It is a family of SSD form factors for use in data centers[7].

Samsung's PM983 - NGSFF (also known as M.3 or NF1) form factor competes with EDSFF[8].

EDSFF Device Form Factor[9]
Variation | Height | Length | Thickness
E3.S | 76 mm | 112.75 mm | 7.5 mm
E3.S 2T | 76 mm | 112.75 mm | 16.8 mm
E3.L | 76 mm | 142.2 mm | 7.5 mm
E3.L 2T | 76 mm | 142.2 mm | 16.8 mm

Samsung PM9A3 vs. Samsung PM983

Samsung PM9A3 specification

The PM9A3 offers newer NAND and a new controller (V6 TLC and the 8-channel Elpis, respectively) compared to the PM983's V5 TLC NAND and 8-channel Phoenix controller[10].

PCIe 4.0 SSD

On April 26, 2022, in San Jose, Calif., Solidigm introduced a new series of SSDs, the D7-P5520 and the D7-P5620, aimed at high performance with zero tolerance for data errors. The D7-P5520 is designed for read-intensive and light mixed workloads, and the D7-P5620 is designed for mixed workloads[11].

HotHardware's benchmarks show performance that is competitive with other drives on the market[12].

Reference