My rig with a BIOSTAR TB250-BTC board was constantly logging PCIe Bus Error messages to /var/log/kern.log and /var/log/syslog. About twenty GB worth of log files!

Beyond the logged errors, I couldn’t attach more than four GPUs until I applied the fix below. FYI, I am using five ZOTAC GeForce GTX 1060 AMP Edition (model: ZT-P10600B-10M) cards.

The solution: you need to enable “Miner Mode” in the BIOS Settings for the board.

  1. During boot hold the delete key until you enter the motherboard setup.
  2. Once in, navigate to: Chipset => Miner Mode => Set to [Enabled]

For reference, here’s the error that was filling my logs:

pcieport 0000:00:1c.7:   device [8086:a297] error status/mask=00000001/00002000
pcieport 0000:00:1c.7:    [ 0] Receiver Error         (First)
pcieport 0000:00:1c.7: AER: Corrected error received: id=00e7
pcieport 0000:00:1c.7: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e7(Receiver ID)
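
If you want a rough idea of which PCIe port is generating the noise, the log lines above can be tallied with a short script. This is only a minimal sketch: the regular expression is my own, written against the exact line format shown above, and other kernel versions may format AER messages differently. The function name count_aer_errors and the SAMPLE data are made up for illustration.

```python
import re
from collections import Counter

# Pattern written against the "PCIe Bus Error" line shown above.
# Assumption: other kernels may word this line differently.
AER_LINE = re.compile(
    r"pcieport (?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCIe Bus Error: severity=(?P<severity>[^,]+), type=(?P<type>[^,]+),"
)


def count_aer_errors(lines):
    """Count PCIe Bus Error lines per (device, severity, type)."""
    counts = Counter()
    for line in lines:
        match = AER_LINE.search(line)
        if match:
            key = (match["device"], match["severity"], match["type"])
            counts[key] += 1
    return counts


# The four lines from my log, used here as sample input.
SAMPLE = [
    "pcieport 0000:00:1c.7:   device [8086:a297] error status/mask=00000001/00002000",
    "pcieport 0000:00:1c.7:    [ 0] Receiver Error         (First)",
    "pcieport 0000:00:1c.7: AER: Corrected error received: id=00e7",
    "pcieport 0000:00:1c.7: PCIe Bus Error: severity=Corrected, "
    "type=Physical Layer, id=00e7(Receiver ID)",
]

if __name__ == "__main__":
    for (device, severity, err_type), n in count_aer_errors(SAMPLE).most_common():
        print(f"{n:>8}  {device}  {severity}  {err_type}")
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. count_aer_errors(open("/var/log/kern.log", errors="replace")). Reading that file may require root depending on your setup.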

Researching online led me down a couple of paths that are NOT needed. They revolved around adding PCI kernel parameters to /etc/default/grub. Some red-herring suggestions were:

  • GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf"
  • GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"

Lesson for the future: after building a rig, it is worth checking whether errors are being written continuously to the /var/log/ directory. You may not notice until you either run out of disk space or the error finally manifests in a way that forces you to investigate. In my case, it was adding a fifth GPU.
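
One way to make that check routine is a small script that lists unusually large files under /var/log/. The following is a sketch, not something I ran on the rig at the time: the function name large_log_files and the 1 GiB threshold are arbitrary choices for illustration, so pick whatever limit makes sense for your disk.

```python
import os


def large_log_files(log_dir, threshold_bytes):
    """Return (path, size) pairs for files at or above the threshold, largest first."""
    results = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished (log rotation) or is unreadable; skip it.
                continue
            if size >= threshold_bytes:
                results.append((path, size))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


if __name__ == "__main__":
    one_gib = 1024 ** 3  # hypothetical threshold, adjust to taste
    for path, size in large_log_files("/var/log", one_gib):
        print(f"{size / one_gib:6.1f} GiB  {path}")
```

Run from cron or by hand after a build, an empty result means nothing is ballooning yet. Subdirectories you cannot read are silently skipped, so run it as root if you want full coverage.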