CPU cores overload and get stuck at 100% one by one when cloning a big git repo

Wow man thanks for the commitment to solving the issue!

I have reinstalled manjaro 22.1.3 and before doing anything I cloned the repo. And… it’s working fine. Currently the download has finished and it’s now processing the files. I believe that something I did between first boot and git clone was causing my issues. Here’s a list of what I did:

  1. Enabled AUR at /etc/pamac.conf.
  2. Installed editors:
     sudo pamac install vscodium-bin
     sudo pamac install vscodium-bin-marketplace
     sudo pamac install sublime-text-4
  3. Updated pacman:
     sudo pacman -Syyu
  4. Installed pacman package:
     sudo pacman -S diffuse
  5. Changed manjaro theme to Breeze Dark.
  6. Opened firefox and downloaded all extensions I use.
  7. Got user token from github.
  8. Cloned the repo.

May be a little too thorough but you never know what little things can cause huge issues.

So I’m gonna snapshot the partition and start making some tests. The curiosity is getting the best of me.

If you think you know which step bricks manjaro let me know, otherwise I’ll let you know when I find it.

My test pulled the above 7 ISO without any hiccups.


And you managed to clone the repo - fantastic - :fireworks:

I did run into some issues but those are entirely down to my lack of experience and knowledge around qemu/kvm and virt-manager.

In the end the tests did not reveal any issues with Manjaro as such - really not expected either - but equally good to confirm.

My host also runs Manjaro - not a surprise really - but worth mentioning

:slight_smile:

So I just tried again without changing anything, just booted and cloned the repo again, in the off chance that the previous successful cloning was just a one off. And… the problems returned. So it must not be any of the actions between first boot and repo cloning.

I have no idea what the problem is and I don’t even know where to start looking. I’m either going to restore the snapshot with the repo intact, use it like that and hope nothing like this happens in the future, or I’m just gonna install arch and get this over with.

I’m beat.

Thank you very much for the help @linux-aarhus and @Aragorn.


Judging anything by the result of cloning a 25GB repo is - in my opinion - edgy.

Whether you use Arch or Manjaro - the same thing can happen - as you have a working clone - you shouldn’t have the need to clone again.

I think what happens is a combination of different factors.

One factor is the size of the repo - it is huge - I have a hard time imagining what could create a repo of that size - perhaps the lack of swap is making your system choke on the size - remember /tmp is allocated from RAM.

Another factor is the hosting of such a repo - a repo of that size is likely self-hosted - and the database behind it may need maintenance. Also - from experience - gitea is great - I have set up an instance at the company server (Win2019) - to host the code I work on.

As you have no swap - you need to set up swap. Secondly I suggest you see if tweaking zswap (zswap - ArchWiki) brings any change.
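
Before setting up swap it can help to confirm what the system currently reports. The following is a minimal sketch, not part of the original advice: the function names `swap_total_kib` and `swap_status` are made up for illustration, and it only assumes the standard `SwapTotal:` line found in /proc/meminfo.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from any package).

# Print total swap in KiB from a meminfo-style file (default: /proc/meminfo).
swap_total_kib() {
    awk '/^SwapTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Print "enabled" if any swap is configured, otherwise "disabled".
swap_status() {
    local kib
    kib=$(swap_total_kib "$1")
    if [ "${kib:-0}" -gt 0 ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling `swap_status` with no argument reads /proc/meminfo on the running system; passing a file path checks a saved copy instead.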


New developments!

I’ve tried the same workflow with swap (16GB) and with xfce (separately) and both failed the same way as before: Install, boot, clone, reboot, clone again, error.

What I noticed was that all the times it worked was on the first boot. I tried booting after an install and then instantly rebooting and wasn’t even able to get it to clone once. I installed again and have cloned the repo 3 times in the first boot without any issues. So it seems that something happens when manjaro is shutting off that compromises subsequent boots. May be something in manjaro itself or some VM configuration.

Any guesses?

I’m hopeful again lol

I have been doing some experimenting also.

My goal is to be able to replicate your issue - not much success - I admit. My vm has been rebooted several times and is also updated to

I have been thinking - what repo could possibly be a challenge - then it struck me that the kernel sources could be a worthy test.

The linux-source repo clone went without any issues.

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Cloning into 'linux-stable'...
remote: Enumerating objects: 11487511, done.
remote: Counting objects: 100% (3953/3953), done.
remote: Compressing objects: 100% (2263/2263), done.
remote: Total 11487511 (delta 2912), reused 2100 (delta 1686), pack-reused 11483558
Receiving objects: 100% (11487511/11487511), 4.47 GiB | 13.47 MiB/s, done.
Resolving deltas: 100% (9170170/9170170), done.
Updating files: 100% (80340/80340), done.

So I considered another option - android sources is also a huge repo - let’s see how that goes.

While going over the steps to possibly test cloning android sources - I saw this note on network - I wonder if that could be part of your issue?

More rarely, Linux clients experience connectivity issues, getting stuck in the middle of downloads (typically during receiving objects). Adjusting the settings of the TCP/IP stack and using non-parallel commands can improve the situation. You must have root access to modify the TCP setting:

sudo sysctl -w net.ipv4.tcp_window_scaling=0

Downloading the Source | Android Open Source Project
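
If anyone tries the setting quoted above, it may be worth noting the current value first so it can be put back afterwards. This is a rough sketch under assumptions: the helper names `tcp_window_scaling` and `restore_command` are hypothetical, and the path is the usual procfs entry behind the sysctl key. A value set with `sysctl -w` does not persist across a reboot.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative only).

# Read the TCP window scaling flag from a procfs-style file.
# 1 means enabled (the kernel default), 0 means disabled.
tcp_window_scaling() {
    cat "${1:-/proc/sys/net/ipv4/tcp_window_scaling}"
}

# Print (not run) the command that would restore a saved value.
restore_command() {
    echo "sudo sysctl -w net.ipv4.tcp_window_scaling=$1"
}
```

For example, `restore_command "$(tcp_window_scaling)"` prints the command to run later to return to the value that was active before the change.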

I am in the process of creating a local mirror of android by following the instructions found using the above link. I am creating this inside the virtual machine I created yesterday (I did build and install the custom bochs package from AUR). I have not tweaked my net settings as it is rarely necessary.

inxi -F @ http://ix.io/4ySQ

Test is still running - I think I will run out of disk space before it is done - at the time of writing it has been running for more than 50 minutes and pulled more than 39GiB.


The problem happens before any network issues. Actually, the processor overload causes the network issues because it causes every application to grind to a halt.

I didn’t mention before but xfce’s problem is different. It doesn’t brick the system completely. The cores still get overloaded but I can open other apps. But when I reboot, it undoes the cloning fragments, just like on plasma.

I did run out of diskspace - I had 70G - as I have not been able to reproduce - I have no idea what is causing your issue - I am leaning towards something local for your system.

I noted that your inxi for your vm was very different from mine.

Because of this major difference it must be local but as to what - I am clueless …

Your inxi -Fazy
My inxi -F
System:
  Host: mjro-qemu Kernel: 6.1.35-1-MANJARO arch: x86_64 bits: 64
    Desktop: KDE Plasma v: 5.27.6 Distro: Manjaro Linux
Machine:
  Type: Kvm System: QEMU product: Standard PC (Q35 + ICH9, 2009) v: pc-q35-8.0
    serial: <superuser required>
  Mobo: N/A model: N/A serial: N/A UEFI: EDK II v: N/A date: 2/2/2022
CPU:
  Info: 8x 1-core model: AMD Ryzen Threadripper PRO 5945WX s bits: 64
    type: SMP cache: L2: 8x 512 KiB (4 MiB)
  Speed (MHz): avg: 4092 min/max: N/A cores: 1: 4092 2: 4092 3: 4092 4: 4092
    5: 4092 6: 4092 7: 4092 8: 4092
Graphics:
  Device-1: Red Hat Virtio 1.0 GPU driver: virtio-pci v: 1
  Display: x11 server: X.org v: 1.21.1.8 driver: X: loaded: modesetting
    dri: virtio_gpu gpu: virtio-pci resolution: 1920x1080~60Hz
  API: OpenGL Message: Unable to show GL data. Required tool glxinfo
    missing.
Audio:
  Device-1: Intel 82801I HD Audio driver: snd_hda_intel
  API: ALSA v: k6.1.35-1-MANJARO status: kernel-api
Network:
  Device-1: Red Hat Virtio 1.0 network driver: virtio-pci
  IF-ID-1: enp1s0 state: up speed: -1 duplex: unknown mac: 52:54:00:d1:56:2f
Drives:
  Local Storage: total: 70 GiB used: 50.63 GiB (72.3%)
  ID-1: /dev/vda model: N/A size: 70 GiB
Partition:
  ID-1: / size: 68.05 GiB used: 50.63 GiB (74.4%) fs: ext4 dev: /dev/vda2
  ID-2: /boot/efi size: 299.4 MiB used: 288 KiB (0.1%) fs: vfat
    dev: /dev/vda1
Swap:
  ID-1: swap-1 type: file size: 512 MiB used: 51.4 MiB (10.0%) file: /swapfile
Sensors:
  Src: lm-sensors+/sys Message: No sensor data found using /sys/class/hwmon
    or lm-sensors.
Info:
  Processes: 213 Uptime: 1h 12m Memory: available: 15.6 GiB
  used: 1.85 GiB (11.8%) Shell: Bash inxi: 3.3.27

Are you passing through an nvidia card? Could it be nvidia related?

I am not passing anything. It is all virtualized.

I do not own nvidia so I cannot say.

Quite possible