Fixing Blue Screens and crashes with a brand new RTX card by checking your RAM

So you just got a brand new NVidia RTX 3000 or 4000 series GPU, you already have a powerful power supply (750W+), and your PC shows one of the following symptoms:

  • Blue Screens of Death;
  • Artifacting in games;
  • Random game and app crashes;
  • Computer freezes;
  • Black screen crashes.

Well it’s most likely not your CPU, motherboard or power supply dying.

What is XMP?

XMP stands for eXtreme Memory Profile. It’s an Intel thing; other names for it include DOCP (on some Asus AM4 motherboards) and RAMP (Ryzen Accelerated Memory Profile). To make things simple I’ll call this XMP regardless of Intel or AMD, since they all do the same job.

When you build your computer (or have it built) you will almost certainly benefit from enabling XMP in your UEFI if your RAM supports it. In my case I have some Corsair Vengeance RAM, so I’m using the profile to get the advertised 3600MHz on the kit I bought.

You have enabled XMP on your RAM

The crashes are most likely caused by having the XMP profile enabled. One quick way to check is to run memtest86 (it’s free): set it up on a flash drive and boot from it. Run a long test and you should start seeing errors pile up.

As seen on SuperUser, red is bad

Depending on the amount of RAM you have you will see errors appear within an hour or even witness a crash.

Once you see errors feel free to cancel the test, reboot into the UEFI and disable the XMP profile (refer to your motherboard manual for instructions). Once this is done you will need to run another test but this time let it run for a couple of hours longer than previously.

Testing your system’s stability is very important and only memtest86 can do it effectively. DO NOT TRUST WINDOWS’ RAM CHECKING TOOL.

If everything goes well and no other error is detected, then it means we have fixed the crashes.
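To make the logic of the two test runs explicit, here is a minimal sketch in Python. The function name `diagnose` and the error counts are made up for illustration; you would read the real counts off the memtest86 screen yourself. The "errors persist without XMP" branch is not something I ran into, so treat that message as an assumption.

```python
def diagnose(errors_with_xmp, errors_without_xmp):
    """Interpret two memtest86 runs: one with XMP enabled, one with it disabled."""
    if errors_with_xmp < 0 or errors_without_xmp < 0:
        raise ValueError("error counts cannot be negative")
    if errors_without_xmp > 0:
        # Assumption: this case is outside what this post covers.
        return "errors persist without XMP: look beyond the XMP profile"
    if errors_with_xmp > 0:
        return "XMP profile is unstable: disable it or raise the RAM voltage"
    return "no errors in either run: memory is probably not the cause"


# Hypothetical counts: 42 errors with XMP on, none with XMP off.
print(diagnose(errors_with_xmp=42, errors_without_xmp=0))
```

The second run only counts if it was at least as long as the first one, otherwise "no errors" doesn't tell you much.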

Fix #1: No XMP profile

The cold, hard and sad truth is that you can no longer run the XMP profile. Just disable it to fix the problem and be done with it.

This might not be the answer you were looking for but this is the easiest way.

Memtest86 doesn’t lie: when it can’t find any errors without XMP enabled over several hours, you can consider the problem fixed and everything should run smoothly from there.

But… If you are willing to take risks there’s a second fix…

Fix #2: Increase RAM voltage

Disclaimer: You are responsible for whatever you do to your PC, everything is on you no matter what. Any damage caused by following these instructions is on you.
Overclocking and overvolting are not covered by your warranty and if something breaks it’s on you.

To get started you’ll have to enable your XMP profile once more, then you will have an input field for the RAM voltage. In my case the default setting was showing 1.35V.

I decided to increase the voltage to 1.37V and run memtest86. I found no errors within the first hours, which was good but not a promise. I let the test run for 16 hours total and I still didn’t have any errors.

In my case 1.37V is good enough. Your case is different, so you’ll have to try 1.36V first, then increase it in small steps until it’s stable or your hardware can’t take any more.
Be careful, because you can damage your hardware and this kind of damage isn’t covered by any warranty.
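If you want to write down your plan before going into the UEFI, the stepping approach can be sketched like this. The helper `voltage_steps` is hypothetical, the 0.01V step simply matches the size of the bumps I used, and the 1.40V ceiling is a placeholder, not a recommendation. Look up the limits for your own RAM and CPU before picking one.

```python
from decimal import Decimal


def voltage_steps(start, step, ceiling):
    """Yield RAM voltages to try, lowest first, never exceeding `ceiling`.

    Decimal keeps values like 1.36 exact instead of floating-point
    approximations.
    """
    start, step, ceiling = Decimal(start), Decimal(step), Decimal(ceiling)
    if step <= 0:
        raise ValueError("step must be positive")
    voltage = start
    while voltage <= ceiling:
        yield voltage
        voltage += step


# Hypothetical values: start at 1.36V as suggested above, step by 0.01V,
# and stop at a placeholder ceiling of 1.40V (an assumption, not advice).
plan = [str(v) for v in voltage_steps("1.36", "0.01", "1.40")]
print(plan)
```

Run a full memtest86 pass at each step and stop at the first voltage that stays error-free; there’s no reason to go higher than you need.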

What caused the crashes actually?

I can only speculate, but I think it’s related to the transient power spikes caused by high-powered GPUs such as the RTX 3000 and 4000 series.
I’m speculating that more power is being drawn through the PCIe slot on the motherboard too, potentially causing a voltage drop in other places where it matters, such as the RAM.

I’m not sure if this is a bad design coming from the GPU, the motherboard or even the RAM, but it does cause some sort of instability that then causes artifacting in some games, random blue screens (BSoD), sometimes with error codes unrelated to each other (see my screenshot below), or even other issues.

Gamers Nexus did produce a video to explain the transients in more detail but I feel that it might only be scratching the surface for the issues that some of us might encounter.

My hardware

At the time of writing I’m rocking my BEEF PC that is built around the following components:

  • AMD Ryzen 9 5900X;
  • NVidia RTX 3080Ti (Gigabyte);
  • 64GB Corsair Vengeance 3600MHz RAM;
  • Asus TUF GAMING X570-PRO motherboard;
  • Corsair 850W PSU.

This was tested and has been stable ever since I’ve increased my RAM voltage.
I have tested this on another computer with the following specs:

  • AMD Ryzen 7 5800X;
  • NVidia RTX 3090 (FE);
  • 32GB Corsair Vengeance 3200MHz RAM (voltage bumped to 1.37V);
  • MSI Tomahawk B450;
  • Corsair 1000W PSU.

Conclusion

Those new RTX 3000 and 4000 series are pretty powerful and pull all the power they can get and some more!

I understand that most of the reasoning I went through was speculation and it feels like I’m making up a reason. I’m honestly not even convinced by what I think is the problem, but increasing the voltage did fix it for me, so I’m not going to argue with a stable system.

 

Published by

SenpaiSilver
