AER (Advanced Error Reporting)

From HPCWIKI
Revision as of 16:35, 2 April 2023 by Admin
NVMe AER Issues

There are many community reports of AER errors appearing at boot on various systems. [1]

Kernel boot error

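The pci=noaer kernel boot parameter disables PCIe Advanced Error Reporting, so AER errors are no longer reported. It silences the reports; it does not fix whatever causes them. Without it, every error report is written to the kernel log, and each error raises a time-consuming interrupt request (IRQ) that the CPU has to service. A rapid stream of error reports can therefore flood the log, and with it the drive, clogging NVMe bandwidth and slowing or even halting bootup.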
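To confirm whether a running system was actually booted with this option, the kernel command line can be inspected. The following is a minimal sketch, not a definitive tool: the helper names pci_options and aer_disabled are hypothetical, and it assumes the usual Linux convention that pci= takes a comma-separated list of options (for example pci=nomsi,noaer).

```python
def pci_options(cmdline):
    """Collect the options given to every pci= parameter on a kernel command line.

    Hypothetical helper for illustration. Assumes pci= carries a
    comma-separated option list, e.g. "pci=nomsi,noaer".
    """
    options = []
    for token in cmdline.split():
        if token.startswith("pci="):
            # Drop the "pci=" prefix and split the remaining option list.
            options.extend(opt for opt in token[len("pci="):].split(",") if opt)
    return options


def aer_disabled(cmdline):
    """Return True if the command line asks the kernel to disable AER."""
    return "noaer" in pci_options(cmdline)
```

On a Linux system the string to check would typically come from reading /proc/cmdline, for example aer_disabled(open("/proc/cmdline").read()).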

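The nvme 0000:xx:xx.x prefix on an AER message identifies the reporting device: nvme is the driver bound to it, and 0000:xx:xx.x is its PCI address (domain:bus:device.function). In other words, the error is attributed to the NVMe M.2 device's connection to the PCIe bus.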
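When many such messages pile up, it can help to pull the driver name and PCI address out of each line, so the address can be matched against lspci output. The sketch below assumes the prefix has the form "driver dddd:bb:dd.f" with hexadecimal fields. PciSource and parse_source are hypothetical names, and the sample line in the usage note is illustrative, not taken from a real log.

```python
import re
from typing import NamedTuple, Optional

# Driver name followed by a PCI address in domain:bus:device.function form.
_PREFIX = re.compile(
    r"(?P<driver>[\w-]+) "
    r"(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<device>[0-9a-fA-F]{2})\.(?P<function>[0-7])"
)


class PciSource(NamedTuple):
    """Hypothetical container for the device named in a log line."""
    driver: str
    domain: int
    bus: int
    device: int
    function: int


def parse_source(line: str) -> Optional[PciSource]:
    """Return the driver and PCI address found in a log line, or None."""
    m = _PREFIX.search(line)
    if m is None:
        return None
    return PciSource(
        m["driver"],
        int(m["domain"], 16),
        int(m["bus"], 16),
        int(m["device"], 16),
        int(m["function"]),
    )
```

For an illustrative line such as "nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected", parse_source returns driver "nvme" with domain 0, bus 1, device 0, function 0.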

So the NVMe drive itself may be healthy, but there could be trouble brewing elsewhere in the PCIe subsystem.

Reference