Freeze when adding a second GPU

Hi there.

Today I added a second GPU to my system, identical to the first one (GTX 1070).
When I boot, after a short, somewhat random time (1 or 2 minutes at most), the whole system freezes completely.

Any idea what it could be?
If I stay in the BIOS it doesn’t freeze. I’ll try another OS from a USB stick now, to see if it freezes there.
If it does, I guess it might be a hardware issue.

What’s the proper way to set up a Manjaro system when adding a second GPU?
(I’m on i3 and kernel 5.9, on the testing branch.)

Thank you.

You could check journalctl to see what happened when the system froze.
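
If it helps, here is a rough Python sketch of what I mean: it reads the kernel messages from the previous boot with `journalctl -k -b -1` and picks out NVIDIA "Xid" error lines. The function names and the regular expression are my own, not part of any tool, and the pattern is only a guess based on what an NVRM Xid line usually looks like. It also assumes the journal is persistent (otherwise `-b -1` has nothing to show), and after a hard freeze the last messages may never have been written to disk.

```python
import re
import subprocess

# Assumed format of an NVIDIA driver Xid line, for example:
#   NVRM: Xid (PCI:0000:0a:00): 79, pid=1156, GPU has fallen off the bus.
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+),\s*(.*)")


def find_xid_events(lines):
    """Return a (pci_address, xid_code, message) tuple for each Xid line."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group(1), int(match.group(2)), match.group(3).strip())
            )
    return events


def previous_boot_kernel_log():
    """Return kernel messages from the previous boot as a list of lines.

    Returns an empty list if journalctl is not installed.
    """
    try:
        result = subprocess.run(
            ["journalctl", "-k", "-b", "-1", "--no-pager"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return []
    return result.stdout.splitlines()


if __name__ == "__main__":
    for pci, code, message in find_xid_events(previous_boot_kernel_log()):
        print(f"{pci}: Xid {code}: {message}")
```

If this prints nothing, that does not prove the GPU is fine; it may only mean the driver never got to log anything before the machine went down.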

I think this is one of those freezes:

Nov 06 18:33:01 styx kernel: audit: type=1131 audit(1604683981.343:245): pid=1 uid=0 auid=4294967295 ses=42949672>
Nov 06 18:33:05 styx kernel: NVRM: GPU at PCI:0000:0a:00: GPU-9e104a96-817c-07e2-09f3-f4e8acc9a142
Nov 06 18:33:05 styx kernel: NVRM: GPU Board Serial Number: 
Nov 06 18:33:05 styx kernel: NVRM: Xid (PCI:0000:0a:00): 79, pid=1156, GPU has fallen off the bus.
Nov 06 18:33:05 styx kernel: NVRM: GPU 0000:0a:00.0: GPU has fallen off the bus.
Nov 06 18:33:05 styx kernel: NVRM: GPU 0000:0a:00.0: GPU is on Board .
Nov 06 18:33:05 styx kernel: NVRM: A GPU crash dump has been created. If possible, please run
                             NVRM: nvidia-bug-report.sh as root to collect this data before
                             NVRM: the NVIDIA kernel module is unloaded.

Have you checked if it’s connected properly?

I suspected that the connection was the problem.
I had both cards on riser cables, and I thought one of the cables was faulty.

I took them off the riser cables and plugged them directly into the motherboard.
Now I can boot and the system sees both cards, but as soon as I open Steam (to test with a game) it shuts down.
It doesn’t freeze like before; instead the monitors and keyboard turn off (I can see the LEDs go out).

I’ll try to get logs now.

I cannot see anything wrong in the logs.
I installed nvidia prime to see if it helps, but the problem is the same. I’m not sure why it shuts down.

Unfortunately, this has turned into a distro setup issue.

I removed the second card, leaving only the original one, and I still get the random shutdowns.
It can happen as early as before the login screen or as late as 5 minutes after I log in.

I happen to have an old Antergos setup on another partition, so I booted into it to see if it still happens.
It doesn’t.
I am writing this post from there.

So it seems my Manjaro setup is broken and, for some reason, it shuts down completely at random.

I cannot see any errors or anything GPU/PCI related in the journal logs.
How can I fix this? Any ideas?
I don’t want to have to reinstall Manjaro. I have had that setup for years, with lots of things installed in there :frowning:

It now happens even when I boot from a live USB.
Also, I get a "/boot/vmlinuz 5.9 not found" message on my normal setup, so I cannot boot into it anymore.
I swapped the GPU and it is still the same.
This is insane…
Any help would be appreciated. I cannot understand how it went so wrong; all I wanted was to add a second GPU :frowning:

I tried the manjaro-chroot solution, but I still get the vmlinuz missing message.
Booting from the live USB sometimes shuts down too, so I don’t know how to debug this or what to do.

Surprisingly, booting into the old Antergos setup is the only stable option; it never shuts down there. So confusing.

Any ideas?