CPU cores overload and get stuck at 100% one by one when cloning a big git repo

First of all, I’m not a Linux expert. I’m trying to figure things out as I go, but sometimes it’s more than I can handle.

I’ve been trying to get a Manjaro 6.1 (Talos) installation to work for about a week now. Previously I had a 5.15 installation, which worked fine, but when I formatted and upgraded, errors started happening.

This manjaro is one of the guest OSs of an arch host.

The problems started when I first installed Manjaro 6.1. I selected the open source drivers, and once I booted the OS, all my cores slowly got stuck at 100%, one by one. The applications started to freeze one by one until the system was absolutely unusable and I had to force a shutdown of the guest OS from the host. This happened while I was cloning a huge repo, which is the whole reason that guest OS exists.

Then I formatted that partition and reinstalled Manjaro using the proprietary drivers, which worked perfectly and seemed to have fixed the whole issue. I was able to clone the repo and start working on it, until I started having problems with baloo-file-extractor, which hogged the system’s memory like it was the only relevant process. Many other people seem to have had enough of it and suggested a forced manual removal of all baloo-related files, which led me to run:

sudo find / -type f -name '*baloo*' -delete

After that, Manjaro was running like the wind. That’s when I deleted the repo and tried cloning it again (may have been a mistake, but I needed to be sure). And that brought all the problems back again. One by one, each core shoots up to 100% and stays stuck forever. Applications get stuck one by one too, even the network manager.

So I don’t know anymore what the problem is. nvidia? baloo? git? the manjaro/plasma mix?

Can anybody shed any kind of light on this? I’m starting to lose my hair over this.

Here is some helpful info (if you need anything else please let me know):

$ sudo inxi -Fazy
  System:
    Kernel: 6.1.30-1-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 12.2.1
      parameters: BOOT_IMAGE=/boot/vmlinuz-6.1-x86_64
      root=UUID=74c20d02-d0eb-47b1-bdff-46aeca800c85 rw quiet splash
      udev.log_priority=3
    Desktop: KDE Plasma v: 5.27.4 tk: Qt v: 5.15.9 wm: kwin_x11 dm: SDDM
      Distro: Manjaro Linux base: Arch Linux
  Machine:
    Type: Vm-other System: Dell product: XPS 8500 v: pc-q35-5.2 serial: N/A
      Chassis: QEMU type: 1 v: pc-q35-5.2 serial: N/A
    Mobo: N/A model: N/A serial: N/A UEFI: Dell v: Default System date: N/A
  CPU:
    Info: model: Intel Core i9-10900K bits: 64 type: MT MCP arch: Comet Lake
      gen: core 10 level: v3 note: check built: 2020 process: Intel 14nm family: 6
      model-id: 0xA5 (165) stepping: 5 microcode: 0xF6
    Topology: cpus: 1x cores: 10 tpc: 2 threads: 20 smt: enabled cache:
      L1: 1.2 MiB desc: d-20x32 KiB; i-20x32 KiB L2: 40 MiB desc: 10x4 MiB
      L3: 16 MiB desc: 1x16 MiB
    Speed (MHz): avg: 3696 min/max: N/A base/boost: 2000/2000 cores: 1: 3696
      2: 3696 3: 3696 4: 3696 5: 3696 6: 3696 7: 3696 8: 3696 9: 3696 10: 3696
      11: 3696 12: 3696 13: 3696 14: 3696 15: 3696 16: 3696 17: 3696 18: 3696
      19: 3696 20: 3696 bogomips: 147896
    Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
    Vulnerabilities:
    Type: itlb_multihit status: Not affected
    Type: l1tf status: Not affected
    Type: mds status: Not affected
    Type: meltdown status: Not affected
    Type: mmio_stale_data status: Vulnerable: Clear CPU buffers attempted, no
      microcode; SMT Host state unknown
    Type: retbleed mitigation: Enhanced IBRS
    Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
      prctl
    Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
      sanitization
    Type: spectre_v2 mitigation: Enhanced IBRS, IBPB: conditional, RSB
      filling, PBRSB-eIBRS: SW sequence
    Type: srbds status: Unknown: Dependent on hypervisor status
    Type: tsx_async_abort status: Not affected
  Graphics:
    Device-1: NVIDIA TU104 [GeForce RTX 2080 SUPER] vendor: ASUSTeK
      driver: nvidia v: 530.41.03 alternate: nouveau,nvidia_drm non-free: 530.xx+
      status: current (as of 2023-05) arch: Turing code: TUxxx
      process: TSMC 12nm FF built: 2018-22 pcie: gen: 3 speed: 8 GT/s lanes: 16
      bus-ID: 04:00.0 chip-ID: 10de:1e81 class-ID: 0300
    Display: x11 server: X.Org v: 21.1.8 with: Xwayland v: 23.1.1
      compositor: kwin_x11 driver: X: loaded: nvidia gpu: nvidia display-ID: :0
      screens: 1
    Screen-1: 0 s-res: 3840x1080 s-dpi: 81 s-size: 1204x343mm (47.40x13.50")
      s-diag: 1252mm (49.29") monitors: <missing: xrandr>
    API: OpenGL v: 4.6.0 NVIDIA 530.41.03 renderer: NVIDIA GeForce RTX 2080
      SUPER/PCIe/SSE2 direct-render: Yes
  Audio:
    Device-1: NVIDIA TU104 HD Audio vendor: ASUSTeK driver: snd_hda_intel
      v: kernel pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 05:00.0
      chip-ID: 10de:10f8 class-ID: 0403
    API: ALSA v: k6.1.30-1-MANJARO status: kernel-api with: aoss
      type: oss-emulator tools: alsactl,alsamixer,amixer
    Server-1: JACK v: 1.9.22 status: off tools: N/A
    Server-2: PipeWire v: 0.3.70 status: n/a (root, process) with: wireplumber
      status: active tools: pw-cli,wpctl
    Server-3: PulseAudio v: 16.1 status: active (root, process)
      with: pulseaudio-alsa type: plugin tools: pacat,pactl
  Network:
    Device-1: Intel 82574L Gigabit Network driver: e1000e v: kernel pcie: gen: 1
      speed: 2.5 GT/s lanes: 1 port: e000 bus-ID: 01:00.0 chip-ID: 8086:10d3
      class-ID: 0200
    IF: enp1s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Drives:
    Local Storage: total: 2.55 TiB used: 37.43 GiB (1.4%)
    ID-1: /dev/sda maj-min: 8:0 vendor: Seagate model: Expansion size: 1.82 TiB
      block-size: physical: 4096 B logical: 512 B type: USB rev: 2.1 spd: 480 Mb/s
      lanes: 1 mode: 2.0 tech: N/A serial: <filter> fw-rev: 0712 scheme: GPT
    SMART Message: A mandatory SMART command failed. Various possible causes.
    ID-2: /dev/vda maj-min: 254:0 model: N/A size: 750 GiB block-size:
      physical: 512 B logical: 512 B tech: N/A serial: N/A scheme: GPT
    SMART Message: Unknown smartctl error. Unable to generate data.
  Partition:
    ID-1: / raw-size: 749.7 GiB size: 736.87 GiB (98.29%) used: 37.43 GiB (5.1%)
      fs: ext4 block-size: 4096 B dev: /dev/vda2 maj-min: 254:2
    ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
      used: 288 KiB (0.1%) fs: vfat block-size: 512 B dev: /dev/vda1 maj-min: 254:1
  Swap:
    Alert: No swap data was found.
  Sensors:
    System Temperatures: cpu: N/A mobo: N/A gpu: nvidia temp: 36 C
    Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
  Info:
    Processes: 328 Uptime: 10m wakeups: 0 Memory: available: 15.25 GiB
    used: 1.54 GiB (10.1%) Init: systemd v: 252 default: graphical
    tool: systemctl Compilers: gcc: 12.2.1 clang: 15.0.7 Packages: pm: pacman
    pkgs: 1126 libs: 323 tools: pamac pm: flatpak pkgs: 0 Shell: Zsh (sudo)
    v: 5.9 default: Bash v: 5.1.16 running-in: konsole inxi: 3.3.27

Talos is no longer current. The current Manjaro release is 23.0.0 Ultima Thule.

Manjaro is a rolling release, so you need to update your system to the latest release.

There is no such thing. You are conflating the kernel version with the distribution version. Kernel 6.1 is an LTS kernel, and it’ll still be around — with continuing support in the form of patches and security fixes — for several more years, regardless of what the distribution will be called.

There’s no need to remove baloo. You can simply disable it.
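A minimal sketch of disabling the indexer from a terminal rather than deleting its files. It assumes the Plasma 5 tool name `balooctl`; the function name `stop_baloo` is made up for this example.

```shell
#!/bin/sh
# stop_baloo is a hypothetical helper name used only in this sketch.
# Assumption: Plasma 5, where the control tool is called "balooctl".
stop_baloo() {
    if ! command -v balooctl >/dev/null 2>&1; then
        echo "balooctl not found" >&2
        return 1
    fi
    balooctl disable   # turn the file indexer off
    balooctl status    # print the indexer state to confirm
}

# Usage (run as your normal user, not as root):
# stop_baloo
```

If the index database itself has grown large, `balooctl purge` removes it; that is optional and not needed just to stop the indexing.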

I had disabled it but it always came back with a vengeance. The only way I found to not have the baloo process when I booted was the manual deletion of the files.

That could come around to bite you, because several things depend on it, including plasma-desktop. :man_shrugging:

Right, I’ll try to leave baloo be. It will only take some time while it indexes and then it will stop hogging the memory right?

How can I make sure baloo is not a problem anymore? Should I reinstall Manjaro (not a huge problem, it’s a fresh installation)? Or is there any easier way?

Yes, that is correct. But you can also tailor it to index only what you want, and to exclude other things. In addition to that, you can have it index only the filenames, or the filenames and file content, and the latter is incredibly heavy on resources. :arrow_down:
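The same tailoring can be sketched on the command line, assuming the `balooctl config` subcommand shipped with Plasma 5 (check `balooctl --help` on your version). The function name `limit_baloo` and the directory argument are illustrative only.

```shell
#!/bin/sh
# limit_baloo is a hypothetical helper name used only in this sketch.
# $1: a directory to exclude from indexing (for example a large repo checkout).
limit_baloo() {
    dir="$1"
    if ! command -v balooctl >/dev/null 2>&1; then
        echo "balooctl not found" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    balooctl config set contentIndexing no      # index file names only, not content
    balooctl config add excludeFolders "$dir"   # skip this directory
    balooctl config list excludeFolders         # show the resulting exclude list
}

# Usage:
# limit_baloo "$HOME/path/to/repo"
```

Name-only indexing is the lighter of the two modes described above; content indexing is the resource-heavy one.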

Oh sweet. I’ll follow your advice and get back to you.

When you install in a virtual machine - the choice of driver is irrelevant - it will install drivers for the VM.

Plasma is well known to cause CPU hogging when first installed due to the indexing feature - this is the way - complain upstream to the KDE Plasma developers.

If all cores on your system are hogged and the system finally hangs - it is the configuration of your VM - you should never assign more than 80% of your physical cores to a VM - in your case, with 10 cores - you should assign a maximum of 8 cores to your VM - the same rule applies to RAM.
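A small sketch of that 80% rule of thumb, to be run on the host. The helper name `eighty_percent` is made up for this example. Note that `nproc` reports logical CPUs (threads), so on a CPU with SMT the number is twice the physical core count.

```shell
#!/bin/sh
# eighty_percent is a hypothetical helper name used only in this sketch.
# Prints 80% of the given integer, rounded down.
eighty_percent() {
    echo $(( $1 * 80 / 100 ))
}

host_cpus=$(nproc)   # logical CPUs available on the host
host_mem_mib=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)

echo "host CPUs: $host_cpus, suggested VM maximum: $(eighty_percent "$host_cpus")"
echo "host RAM: $host_mem_mib MiB, suggested VM maximum: $(eighty_percent "$host_mem_mib") MiB"
```

For the 10 cores mentioned above, `eighty_percent 10` gives 8.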

Another consideration is the type of disk assigned to your VM. If it is the dynamic type, there is a huge overhead if the disk space needs to be allocated on the fly.

A third consideration is the host’s filesystem where you store the virtual disks. Stay away from btrfs on partitions hosting virtual disks - this is a recipe for disaster.
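A quick way to see which filesystem the virtual disks live on, as a sketch to run on the host. It assumes the libvirt default image directory `/var/lib/libvirt/images`; set `IMAGES_DIR` if your disks are stored elsewhere. The helper name `fs_type` is made up for this example.

```shell
#!/bin/sh
# fs_type is a hypothetical helper name used only in this sketch.
# Prints the filesystem type of the mount that contains the given path.
fs_type() {
    df -PT "$1" | awk 'NR == 2 {print $2}'
}

# Assumption: libvirt's default image directory; override with IMAGES_DIR.
images_dir="${IMAGES_DIR:-/var/lib/libvirt/images}"

if [ -d "$images_dir" ]; then
    type=$(fs_type "$images_dir")
    echo "$images_dir is on: $type"
    if [ "$type" = "btrfs" ]; then
        echo "virtual disks are on btrfs; consider another filesystem for them"
    fi
else
    echo "$images_dir does not exist on this machine"
fi
```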

No bueno man, now with pristine manjaro 22.1.3 I only disabled baloo on that screen and I still get the same problem.

Read the post by @linux-aarhus. :slight_smile:

This is a fresh installation, but if I leave my VM open, nothing happens; the file indexer is idle at 100% files indexed. Then when I run git clone, at about 40%-60% of the download the cores start to hit 100%, one by one, until the network manager goes down and the cloning stops. At this point the system is pretty much bricked and I can only reboot it. This also happens if I add the project’s directory to the blacklist and even if I disable the file indexer.

I’ll change that and retry and let you know if there were any changes.

It’s a partition disk. No worries there.

It’s ext4.

Do you mean a raw disk partition?

Cloning a large git repo shouldn’t hog your system - I am still thinking it is a matter of configuration.

How is your swap configuration?

Which repo are you cloning?

Assuming a public repo - which url can be used to verify?

Which type of hypervisor are you using - your inxi just says qemu?

Yes.

No swap.

It’s a private repo. The download is about 24GB.

It’s KVM running on virtual machine manager.

So I’ve tested with only 80% of the cores and the problem still persists.

The host needs swap.

I assume you are familiar with the Arch Wiki on the subject?

If you start your virtual machine with a GUI tool and experience very bad performance, you should check for proper KVM support, as QEMU may be falling back to software emulation.
QEMU - ArchWiki
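A rough sketch of that check, to be run on the host. It only looks for the CPU virtualization flags and the `/dev/kvm` device node; the helper name `has_virt_flag` is made up for this example.

```shell
#!/bin/sh
# has_virt_flag is a hypothetical helper name used only in this sketch.
# Succeeds if the cpuinfo-style file lists vmx (Intel VT-x) or svm (AMD-V).
has_virt_flag() {
    grep -Eq '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

if has_virt_flag; then
    echo "CPU exposes hardware virtualization (vmx or svm)"
else
    echo "no vmx/svm flag found; it may be disabled in the firmware"
fi

if [ -c /dev/kvm ]; then
    echo "/dev/kvm is present"
else
    echo "/dev/kvm is missing; QEMU would fall back to software emulation"
fi
```

For a libvirt guest, the first line of `virsh dumpxml NAME` (with `NAME` standing in for your VM's name) should read `<domain type='kvm'>` when KVM is actually in use.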

Oh, on the host I got 4GB swap on a swap file.

Sure, I don’t know it by heart, but I used it a lot when installing Arch and to solve the many issues I had with it.

While peeking into virt-manager and qemu/kvm - I remember why I am using VirtualBox :slightly_smiling_face:

I did manage to set up a Manjaro system - but I am not impressed with my achievement and the resulting VM.

The first hurdle was to get the display to function - if you just boot the ISO - the display is not redrawn and windows are just boxes - I eventually got past it - but I have much to learn in that regard.

I’m trying to add a couple of pictures to show the problem but I’m getting:

An error occurred: Sorry, you can’t embed media items in a post.

I’m starting to lose hope. I think I may reinstall it again just for the sake of it and try to clone before doing absolutely anything else in it. And if it fails I’m gonna look for another distro (which will be sad because I’ve grown accustomed to Manjaro).

Yes, that is until you get some reputation :slight_smile:

I decided to revisit - one is never too old to learn - but I keep banging against

I/O error

The difference between now and earlier today is the disk, which this time is raw on a 75G partition - last time it was a qcow file on an ext4-formatted partition.

Oh sweet, let me know if you need anything. I’m not an expert on this but I did set up many VMs over the past year so I can be of a little help.

I managed through some experimentation to get a vm running.

  • A physical 75G partition prepared with ext4
  • Created a storage pool pointing to the partition
  • Mounting the pool at the default mountpoint /var/lib/libvirt/images/pool
  • Created a storage unit named manjaro using the raw option
    • it could have been qcow
    • did it to get as close as possible to the topic at hand with my test
    • I don’t think there is any gain in using the raw format vs the qcow format
  • Create a new VM
    • vm os → Manjaro
    • memory → 16G
    • cpu → 8
    • storage → custom → manage → pool → manjaro.img
    • name → manjaro
    • Customize configuration before install
  • Display
    • type → spice server
    • listen type → none
    • OpenGL → checked → pointing to my GPU
  • Video
    • Model → Virtio
    • 3D acceleration → checked
  • Start Installation
    • It is important to choose swap inside the Manjaro installation
    • Usually selecting to use a swapfile is sufficient
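After the installation, whether swap is actually active inside the guest can be checked with a short sketch like the one below. It reads `/proc/swaps`, which has one header line followed by one line per active swap area; the helper name `has_swap` is made up for this example.

```shell
#!/bin/sh
# has_swap is a hypothetical helper name used only in this sketch.
# Succeeds if the swaps-style file has at least one entry after the header line.
has_swap() {
    awk 'NR > 1 { found = 1 } END { exit !found }' "${1:-/proc/swaps}" 2>/dev/null
}

if has_swap; then
    echo "active swap areas:"
    cat /proc/swaps
else
    echo "no active swap found"
fi
```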

I cannot test a git clone of a 25G repo but I did test a full system sync - downloading 1.5G unpacking 3.5G - no issues.

One could of course simulate something like it by pulling 10 ISO files over the internet but I think it is overkill for this
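A lighter alternative can be sketched without any network traffic: write a number of large files of random data inside the guest. This only exercises the disk-write side of a big clone, not the download, and the helper name `write_test_data` and the file names are made up for this example.

```shell
#!/bin/sh
# write_test_data is a hypothetical helper name used only in this sketch.
# $1: target directory, $2: number of files, $3: size of each file in MiB.
write_test_data() {
    dir="$1"
    files="$2"
    mib="$3"
    i=1
    while [ "$i" -le "$files" ]; do
        # Random data, so the writes are not all-zero blocks.
        head -c "$(( mib * 1024 * 1024 ))" /dev/urandom > "$dir/chunk_$i.bin" || return 1
        i=$(( i + 1 ))
    done
    sync   # flush everything to the virtual disk before returning
}

# Usage, roughly matching a 24 GB download (needs that much free space):
# write_test_data /path/inside/the/guest 24 1024
```

Watching the CPU usage in the guest while this runs would show whether heavy disk writes alone are enough to trigger the hang.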

For the sake of stress testing the VM

image

I however learned the basics of creating a VM using virt-manager - so not wasted time, not at all.

I will likely go over the process one more time and convert my learning process to a guide.
