Python very slow on fresh Manjaro install

CPU: 2x 48-core AMD EPYC 7642 (-MCP SMP-) speed/min/max: 1576/1500/2300 MHz
Kernel: 6.6.16-2-MANJARO x86_64 Up: 8m Mem: 4.88/251.73 GiB (1.9%)
Storage: 1.85 TiB (1.2% used) Procs: 1171 Shell: Bash inxi: 3.3.33

Hi there, I have built a dual EPYC machine on a Supermicro motherboard to run Python scripts on very large numbers (thousands to millions of digits). On Windows 10 the scripts run at a reasonable speed. On this machine it is very, very slow. I notice the CPU speed/min/max is not correct.

It is a fresh install, all up to date. Is there something I have messed up? BIOS looks OK. These CPUs are base 2.3 GHz, boost up to 3.3 GHz. Thanks for your help.

inxi -Fazy

Trying to work out how to paste the contents or a link to a txt file. The forum is not allowing me to post a link (“can’t include links in your post”).

Welcome to the forum! :wave:

See [HowTo] Post command output and file content as formatted text

There are quite a few pastebin hosts allowed for new users, you should have no issue pasting a link unless it’s some obscure one.

System:
  Kernel: 6.6.16-2-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    clocksource: tsc avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64
    root=UUID=6c3c70d5-994f-4aec-b639-c548602c205d rw nouveau.modeset=0 quiet
    cryptdevice=UUID=5ad75e2b-6b15-442b-a01e-12101132d822:luks-5ad75e2b-6b15-442b-a01e-12101132d822
    root=/dev/mapper/luks-5ad75e2b-6b15-442b-a01e-12101132d822 splash
    apparmor=1 security=apparmor udev.log_priority=3
  Desktop: Xfce v: 4.18.1 tk: Gtk v: 3.24.36 wm: xfwm4 v: 4.18.0
    with: xfce4-panel tools: xfce4-screensaver vt: 7 dm: LightDM v: 1.32.0
    Distro: Manjaro base: Arch Linux
Machine:
  Type: Server System: Supermicro product: Super Server v: 0123456789
    serial: <superuser required> Chassis: type: 17 v: 0123456789
    serial: <superuser required>
  Mobo: Supermicro model: H11DSi-NT v: 2.00 serial: <superuser required>
    uuid: <superuser required> UEFI-[Legacy]: American Megatrends v: 2.3
    date: 08/02/2021
CPU:
  Info: model: AMD EPYC 7642 bits: 64 type: MCP SMP arch: Zen 2 gen: 3
    level: v3 note: check built: 2020-22 process: TSMC n7 (7nm) family: 0x17 (23)
    model-id: 0x31 (49) stepping: 0 microcode: 0x830107B
  Topology: cpus: 2x cores: 48 smt: <unsupported> cache: L1: 2x 3 MiB (6 MiB)
    desc: d-48x32 KiB; i-48x32 KiB L2: 2x 24 MiB (48 MiB) desc: 48x512 KiB
    L3: 2x 256 MiB (512 MiB) desc: 16x16 MiB
  Speed (MHz): avg: 1610 high: 3300 min/max: 1500/2300 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 2300 2: 1500
    3: 1500 4: 1500 5: 1500 6: 3300 7: 3300 8: 1500 9: 1500 10: 1500 11: 1500
    12: 1500 13: 1500 14: 1500 15: 3300 16: 2300 17: 1500 18: 1500 19: 1500
    20: 1500 21: 1500 22: 1500 23: 1500 24: 1500 25: 1500 26: 1500 27: 1500
    28: 1500 29: 1500 30: 1500 31: 1500 32: 1500 33: 1500 34: 1500 35: 1500
    36: 1500 37: 1500 38: 1500 39: 1500 40: 1500 41: 1500 42: 1500 43: 1500
    44: 1500 45: 1500 46: 1500 47: 1500 48: 1500 49: 1500 50: 1500 51: 1500
    52: 1500 53: 1500 54: 1500 55: 1500 56: 1500 57: 1500 58: 1467 59: 1500
    60: 3300 61: 3295 62: 1500 63: 1500 64: 1500 65: 1500 66: 1500 67: 1500
    68: 1500 69: 1500 70: 1500 71: 1500 72: 1500 73: 1500 74: 1500 75: 1500
    76: 1500 77: 1500 78: 1500 79: 1500 80: 1500 81: 1500 82: 1500 83: 1500
    84: 1500 85: 1500 86: 1500 87: 1500 88: 1500 89: 1500 90: 1500 91: 1500
    92: 1500 93: 1500 94: 1500 95: 1500 96: 1500 bogomips: 441875
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed mitigation: untrained return thunk; SMT disabled
  Type: spec_rstack_overflow mitigation: SMT disabled
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, STIBP:
    disabled, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA GV100 [TITAN V] driver: nvidia v: 545.29.06
    alternate: nouveau,nvidia_drm non-free: 545.xx+ status: current (as of
    2024-02; EOL~2026-12-xx) arch: Volta code: GV1xx process: TSMC 12nm
    built: 2017-2020 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 01:00.0
    chip-ID: 10de:1d81 class-ID: 0300
  Device-2: NVIDIA GV100 [TITAN V] driver: nvidia v: 545.29.06
    alternate: nouveau,nvidia_drm non-free: 545.xx+ status: current (as of
    2024-02; EOL~2026-12-xx) arch: Volta code: GV1xx process: TSMC 12nm
    built: 2017-2020 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 21:00.0
    chip-ID: 10de:1d81 class-ID: 0300
  Device-3: ASPEED Graphics Family vendor: Super Micro driver: ast v: kernel
    ports: active: VGA-1 empty: Virtual-1 bus-ID: 42:00.0 chip-ID: 1a03:2000
    class-ID: 0300
  Display: x11 server: X.org v: 1.21.1.11 compositor: xfwm4 v: 4.18.0 driver:
    X: loaded: modesetting,nvidia unloaded: nouveau alternate: fbdev,nv,vesa
    gpu: ast display-ID: :0.0 note: <missing: xdpyinfo/xrandr>
  Monitor-1: VGA-1 model: Dell U2412M serial: <filter> built: 2020
    res: 1920x1200 dpi: 94 gamma: 1.2 size: 518x324mm (20.39x12.76")
    diag: 611mm (24.1") ratio: 16:10 modes: max: 1920x1080 min: 640x480
  API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 1
    drv: nvidia device: 4 drv: swrast gbm: drv: kms_swrast surfaceless:
    drv: nvidia x11: drv: zink inactive: wayland,device-2,device-3
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: mesa v: 23.3.5-manjaro1.1
    glx-v: 1.4 direct-render: yes renderer: llvmpipe (LLVM 16.0.6 256 bits)
    device-ID: ffffffff:ffffffff memory: 245.83 GiB unified: yes
Audio:
  Device-1: NVIDIA driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s
    lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:10f2 class-ID: 0403
  Device-2: NVIDIA driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s
    lanes: 16 bus-ID: 21:00.1 chip-ID: 10de:10f2 class-ID: 0403
  API: ALSA v: k6.6.16-2-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 1.0.3 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Ethernet X550 vendor: Super Micro driver: ixgbe v: kernel
    pcie: gen: 2 speed: 5 GT/s lanes: 8 port: N/A bus-ID: 61:00.0
    chip-ID: 8086:1563 class-ID: 0200
  IF: eno1 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Intel Ethernet X550 vendor: Super Micro driver: ixgbe v: kernel
    pcie: gen: 2 speed: 5 GT/s lanes: 8 port: N/A bus-ID: 61:00.1
    chip-ID: 8086:1563 class-ID: 0200
  IF: eno2 state: down mac: <filter>
  Info: services: NetworkManager
Drives:
  Local Storage: total: 1.85 TiB used: 22.35 GiB (1.2%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 870 EVO 2TB
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 3B6Q scheme: MBR
  ID-2: /dev/sdb maj-min: 8:16 vendor: SanDisk model: Cruzer Blade
    size: 29.25 GiB block-size: physical: 512 B logical: 512 B type: USB rev: 2.0
    spd: 480 Mb/s lanes: 1 mode: 2.0 tech: N/A serial: <filter> fw-rev: 1.00
    scheme: MBR
Partition:
  ID-1: / raw-size: 1.82 TiB size: 1.79 TiB (98.37%) used: 22.35 GiB (1.2%)
    fs: ext4 dev: /dev/dm-0 maj-min: 254:0
    mapped: luks-5ad75e2b-6b15-442b-a01e-12101132d822
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 45.8 C mobo: N/A
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 256 GiB note: est. available: 251.73 GiB used: 5.89 GiB (2.3%)
  Processes: 1392 Power: uptime: 4m states: freeze,mem,disk suspend: s2idle
    wakeups: 0 hibernate: shutdown avail: reboot,suspend,test_resume
    image: 100.68 GiB services: upowerd,xfce4-power-manager Init: systemd
    v: 255 default: graphical tool: systemctl
  Packages: pm: pacman pkgs: 1100 libs: 324 tools: pamac pm: flatpak pkgs: 0
    Compilers: clang: 16.0.6 gcc: 13.2.1 Shell: Bash v: 5.2.26
    running-in: xfce4-terminal inxi: 3.3.33

Thank you :slight_smile: - note, I have SMT disabled in the BIOS (don’t want to use it)

Dayum.

I think you have enough ram.

So these speeds look incorrect to you?

The ‘high’ is correct (3300), but the average and min/max are throttled way down. The CPU speed would explain some of the slow Python performance, but this is bad enough that the same code takes 3-4 times longer to run on this machine than on Windows.

This motherboard has a bunch of northbridge settings all set to auto. I have a machine next to it, a dual Xeon v3, which is also slow in running Python on Manjaro (Hyper-Threading is disabled). Would really like Python running much faster, as fast or faster than the Windows machine.

:man_shrugging:

Difficult to say - some ideas

  • power profile
  • disable vulnerability mitigation
  • bottlenecks in script
  • run your script without X loaded (console)

Perhaps this will help.

https://wiki.archlinux.org/title/CPU_frequency_scaling
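To check the power profile idea before changing anything, here is a minimal sketch that reads the active governor and current frequency for each core from the standard Linux cpufreq sysfs files. The helper name `read_cpufreq` is made up for illustration, and the sysfs files are assumed to be present (they may be missing in a VM or container).

```python
from collections import Counter
from pathlib import Path

# Standard Linux cpufreq sysfs location; assumed to exist on this machine.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(root: Path = CPU_ROOT) -> list[dict]:
    """Return governor and current frequency (MHz) for each CPU that has a cpufreq dir."""
    rows = []
    for cpufreq in sorted(root.glob("cpu[0-9]*/cpufreq")):
        try:
            governor = (cpufreq / "scaling_governor").read_text().strip()
            khz = int((cpufreq / "scaling_cur_freq").read_text())  # value is in kHz
        except (OSError, ValueError):
            continue  # skip CPUs whose files are missing or unreadable
        rows.append({"cpu": cpufreq.parent.name, "governor": governor, "mhz": khz / 1000})
    return rows


if __name__ == "__main__":
    rows = read_cpufreq()
    if not rows:
        print("No cpufreq data found under", CPU_ROOT)
    else:
        governors = Counter(row["governor"] for row in rows)
        average = sum(row["mhz"] for row in rows) / len(rows)
        print("governors:", dict(governors))
        print(f"average current frequency: {average:.0f} MHz over {len(rows)} CPUs")
```

Running it once while idle and once while the script is working should show whether the loaded cores actually clock up.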

Are you sure that the CPU min/max is not correct? On AMD’s specification page it lists max all-core clocks of 2.3GHz and boost of 3.3GHz, which matches what you’ve posted (EPYC CPUs don’t tend to clock that high; server CPUs tend to value performance per watt over raw performance). I would also assume that a lower than expected clock speed would cause everything to run slow, not just Python.

Have you considered running pybench to get an idea of general Python performance? That might help narrow down if there’s a specific aspect of your script that is causing the problem.
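Since the scripts in question work on very large integers, a tiny single-core timing test may be closer to the real workload than a general benchmark. This is only a sketch: the function name `time_bigint_square` and the chosen sizes are assumptions, not anything taken from the original scripts.

```python
import time


def time_bigint_square(bits: int, repeats: int = 3) -> float:
    """Best-of-N wall time in seconds to square an integer of roughly `bits` bits."""
    n = (1 << bits) - 1  # all-ones integer; avoids any decimal string conversion
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        _ = n * n
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for bits in (100_000, 1_000_000):
        print(f"{bits:>9} bits: {time_bigint_square(bits):.4f} s")
```

Running the same file on both the Windows box and the Manjaro box gives a like-for-like number for one core, independent of how the real script is threaded.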

I’m not sure you want to be using the acpi-cpufreq driver either.
https://wiki.archlinux.org/title/CPU_frequency_scaling#Scaling_drivers
For my (Zen 3) Ryzen I use amd_pstate=active on the kernel command line, which translates to the amd-pstate-epp driver being in use.

Thanks for replies. Will try them out and let you know how I go.

Out of curiosity I tested my system - using the pybench script mentioned above

My system is nowhere near comparable to yours
System CPU RAM info

 $ inxi -SCm
System:
  Host: tiger Kernel: 6.6.18-1-MANJARO arch: x86_64 bits: 64
  Desktop: KDE Plasma v: 5.93.0 Distro: Manjaro Linux
Memory:
  System RAM: total: 64 GiB available: 62.62 GiB
    used: 3.37 GiB (5.4%)
  Message: For most reliable report, use superuser + dmidecode.
  Array-1: capacity: 1024 GiB note: check slots: 8 modules: 4
    EC: Multi-bit ECC
  Device-1: DIMM5 type: no module installed
  Device-2: DIMM6 type: no module installed
  Device-3: DIMM7 type: DDR4 size: 16 GiB speed: 3200 MT/s
  Device-4: DIMM8 type: DDR4 size: 16 GiB speed: 3200 MT/s
  Device-5: DIMM4 type: no module installed
  Device-6: DIMM3 type: no module installed
  Device-7: DIMM2 type: DDR4 size: 16 GiB speed: 3200 MT/s
  Device-8: DIMM1 type: DDR4 size: 16 GiB speed: 3200 MT/s
CPU:
  Info: 12-core model: AMD Ryzen Threadripper PRO 5945WX s bits: 64
    type: MT MCP cache: L2: 6 MiB
  Speed (MHz): avg: 514
    min/max: 400/4978:4565:4705:4841:5254:5118:5943:5666:5807:5530:5394
    cores: 1: 400 2: 400 3: 400 4: 400 5: 400 6: 400 7: 1769 8: 400
    9: 400 10: 400 11: 400 12: 400 13: 400 14: 400 15: 400 16: 400
    17: 400 18: 400 19: 400 20: 400 21: 400 22: 400 23: 400 24: 1785

Kernel cmdline

 $ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64 root=UUID=07c78795-e8a4-4134-be2e-be5908c5b9f8 rw quiet splash nowatchdog udev.log_priority=3 mitigations=off amd_pstate=active

Result

 $ python bench.py 
calculating pi:
100%|████████████████| 33554431/33554431 [00:05<00:00, 6151904.59it/s]

calculating fib recursive:
 97%|████████████████████████████████▏| 37/38 [00:09<00:00,  3.93it/s]

calculating fib iterative:
100%|███████████████████| 1048574/1048574 [00:05<00:00, 178670.19it/s]

benchmark time: 0:00:20.864621

If you run something like btop at the same time you will likely see the system is using only a few cores, possibly only one.

This is your Python script not utilizing the potential of the system.

I was wondering myself why I couldn’t utilize all cores calculating pi. I searched and found an article you may also find interesting - How to Use 100% of All CPU Cores in Python - Super Fast Python
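As a rough illustration of the idea (not the code from that article), here is a minimal sketch that splits a CPU-bound loop across worker processes with the standard library's `concurrent.futures.ProcessPoolExecutor`. The workload (a sum of squares) and the helper names `chunk_ranges` and `partial_sum` are made up for the example.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def partial_sum(bounds):
    """CPU-bound work for one chunk: sum of squares over [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def chunk_ranges(n, parts):
    """Split range(n) into at most `parts` contiguous (lo, hi) chunks."""
    if n <= 0 or parts <= 0:
        return []
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def parallel_sum_of_squares(n, workers=None):
    """Run one chunk per worker process and combine the partial results."""
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunk_ranges(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(10_000_000))
```

With something like btop open, this should load every core for a moment, unlike a plain single-process loop. Whether it is actually faster depends on the chunks being large enough to outweigh the cost of starting processes.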

I have a mini pc with win11 and manjaro stable

System:
  Host: bmax4 Kernel: 6.6.16-2-MANJARO arch: x86_64 bits: 64
  Desktop: KDE Plasma v: 5.27.10 Distro: Manjaro Linux
Memory:
  System RAM: total: 16 GiB available: 15.31 GiB used: 2 GiB (13.1%)
  Array-1: capacity: 64 GiB slots: 2 modules: 1 EC: None
  Device-1: Controller0-ChannelA-DIMM0 type: DDR4 size: 16 GiB
    speed: 2667 MT/s
  Device-2: Controller1-ChannelA-DIMM0 type: no module installed
CPU:
  Info: quad core model: Intel N100 bits: 64 type: MCP cache: L2: 2 MiB
  Speed (MHz): avg: 703 min/max: 700/3400 cores: 1: 700 2: 708 3: 704 4: 703

Mini PC, Win11 - Python 3.12

calculating pi:
100%|███████████████████████████████████████████████████████| 33554431/33554431 [00:10<00:00, 3253375.04it/s]
calculating fib recursive:
 97%|█████████████████████████████████████████████████████████████████████▊  | 37/38 [00:12<00:00,  2.96it/s]
calculating fib iterative:
100%|███████████████████████████████████████████████████████████| 1048574/1048574 [00:11<00:00, 95308.51it/s]
benchmark time: 0:00:33.981077

Same PC - Manjaro - Python 3.11

calculating pi:
100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 33554431/33554431 [00:07<00:00, 4761923.08it/s]
calculating fib recursive:
 97%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋   | 37/38 [00:08<00:00,  4.19it/s]
calculating fib iterative:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 1048574/1048574 [00:09<00:00, 105661.70it/s]
benchmark time: 0:00:25.827358

result ? :star_struck:


EDIT

And is Python 3.12 faster than 3.11?
Manjaro, pyenv Python 3.12.1

calculating pi:
100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 33554431/33554431 [00:11<00:00, 3026243.36it/s]

calculating fib recursive:
 97%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████▋   | 37/38 [00:13<00:00,  2.64it/s]

calculating fib iterative:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 1048574/1048574 [00:09<00:00, 110279.49it/s]

benchmark time: 0:00:34.598512

:upside_down_face: :scream:

What’s not to like with Manjaro :slight_smile:

Even python is faster …

I think the fact that Python doesn’t default to using all cores makes it easy to compare systems. It also makes it clear that Python is incredibly effective when utilised to its full potential.

System:
  Kernel: 6.6.16-2-MANJARO arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    clocksource: tsc avail: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-6.6-x86_64
    root=UUID=6c3c70d5-994f-4aec-b639-c548602c205d rw nouveau.modeset=0 quiet
    cryptdevice=UUID=5ad75e2b-6b15-442b-a01e-12101132d822:luks-5ad75e2b-6b15-442b-a01e-12101132d822
    root=/dev/mapper/luks-5ad75e2b-6b15-442b-a01e-12101132d822 splash
    apparmor=1 security=apparmor udev.log_priority=3
  Desktop: Xfce v: 4.18.1 tk: Gtk v: 3.24.36 wm: xfwm4 v: 4.18.0
    with: xfce4-panel tools: xfce4-screensaver vt: 7 dm: LightDM v: 1.32.0
    Distro: Manjaro base: Arch Linux
Machine:
  Type: Server System: Supermicro product: Super Server v: 0123456789
    serial: <superuser required> Chassis: type: 17 v: 0123456789
    serial: <superuser required>
  Mobo: Supermicro model: H11DSi-NT v: 2.00 serial: <superuser required>
    uuid: <superuser required> UEFI-[Legacy]: American Megatrends v: 2.3
    date: 08/02/2021
CPU:
  Info: model: AMD EPYC 7642 bits: 64 type: MCP SMP arch: Zen 2 gen: 3
    level: v3 note: check built: 2020-22 process: TSMC n7 (7nm) family: 0x17 (23)
    model-id: 0x31 (49) stepping: 0 microcode: 0x830107B
  Topology: cpus: 2x cores: 48 smt: <unsupported> cache: L1: 2x 3 MiB (6 MiB)
    desc: d-48x32 KiB; i-48x32 KiB L2: 2x 24 MiB (48 MiB) desc: 48x512 KiB
    L3: 2x 256 MiB (512 MiB) desc: 16x16 MiB
  Speed (MHz): avg: 3223 high: 3276 min/max: 1500/2300 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 3275 2: 3265
    3: 3274 4: 3275 5: 3265 6: 3160 7: 3273 8: 3274 9: 3270 10: 3275 11: 3270
    12: 3274 13: 3276 14: 1454 15: 3274 16: 3274 17: 3276 18: 3275 19: 3275
    20: 3274 21: 3275 22: 3273 23: 3274 24: 3273 25: 3274 26: 3275 27: 3272
    28: 3270 29: 3275 30: 3272 31: 3275 32: 3265 33: 3264 34: 3275 35: 3275
    36: 3273 37: 3273 38: 3275 39: 3275 40: 3275 41: 2195 42: 3254 43: 3266
    44: 3275 45: 3268 46: 3268 47: 3267 48: 3273 49: 3273 50: 3273 51: 3274
    52: 3201 53: 3269 54: 3274 55: 3270 56: 3272 57: 3259 58: 3269 59: 3263
    60: 3261 61: 3260 62: 3241 63: 3255 64: 3275 65: 3264 66: 3250 67: 3274
    68: 3271 69: 3274 70: 3272 71: 3274 72: 3274 73: 3274 74: 3274 75: 3274
    76: 3274 77: 3274 78: 3271 79: 3274 80: 3271 81: 3265 82: 3274 83: 3274
    84: 3267 85: 3270 86: 3267 87: 3275 88: 3274 89: 3270 90: 3273 91: 1833
    92: 3268 93: 3273 94: 3274 95: 3275 96: 3272 bogomips: 441757
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed mitigation: untrained return thunk; SMT disabled
  Type: spec_rstack_overflow mitigation: SMT disabled
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, STIBP:
    disabled, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: NVIDIA GV100 [TITAN V] driver: nvidia v: 545.29.06
    alternate: nouveau,nvidia_drm non-free: 545.xx+ status: current (as of
    2024-02; EOL~2026-12-xx) arch: Volta code: GV1xx process: TSMC 12nm
    built: 2017-2020 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 01:00.0
    chip-ID: 10de:1d81 class-ID: 0300
  Device-2: NVIDIA GV100 [TITAN V] driver: nvidia v: 545.29.06
    alternate: nouveau,nvidia_drm non-free: 545.xx+ status: current (as of
    2024-02; EOL~2026-12-xx) arch: Volta code: GV1xx process: TSMC 12nm
    built: 2017-2020 pcie: gen: 3 speed: 8 GT/s lanes: 16 bus-ID: 21:00.0
    chip-ID: 10de:1d81 class-ID: 0300
  Device-3: ASPEED Graphics Family vendor: Super Micro driver: ast v: kernel
    ports: active: VGA-1 empty: Virtual-1 bus-ID: 42:00.0 chip-ID: 1a03:2000
    class-ID: 0300
  Display: x11 server: X.org v: 1.21.1.11 compositor: xfwm4 v: 4.18.0 driver:
    X: loaded: modesetting,nvidia unloaded: nouveau alternate: fbdev,nv,vesa
    gpu: ast display-ID: :0.0 note: <missing: xdpyinfo/xrandr>
  Monitor-1: VGA-1 model: Dell U2412M serial: <filter> built: 2020
    res: 1920x1200 dpi: 94 gamma: 1.2 size: 518x324mm (20.39x12.76")
    diag: 611mm (24.1") ratio: 16:10 modes: max: 1920x1200 min: 640x480
  API: EGL v: 1.5 hw: drv: nvidia platforms: device: 0 drv: nvidia device: 1
    drv: nvidia device: 4 drv: swrast gbm: drv: kms_swrast surfaceless:
    drv: nvidia x11: drv: zink inactive: wayland,device-2,device-3
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: mesa v: 23.3.5-manjaro1.1
    glx-v: 1.4 direct-render: yes renderer: llvmpipe (LLVM 16.0.6 256 bits)
    device-ID: ffffffff:ffffffff memory: 245.83 GiB unified: yes
Audio:
  Device-1: NVIDIA driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s
    lanes: 16 bus-ID: 01:00.1 chip-ID: 10de:10f2 class-ID: 0403
  Device-2: NVIDIA driver: snd_hda_intel v: kernel pcie: gen: 3 speed: 8 GT/s
    lanes: 16 bus-ID: 21:00.1 chip-ID: 10de:10f2 class-ID: 0403
  API: ALSA v: k6.6.16-2-MANJARO status: kernel-api with: aoss
    type: oss-emulator tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 1.0.3 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Intel Ethernet X550 vendor: Super Micro driver: ixgbe v: kernel
    pcie: gen: 2 speed: 5 GT/s lanes: 8 port: N/A bus-ID: 61:00.0
    chip-ID: 8086:1563 class-ID: 0200
  IF: eno1 state: down mac: <filter>
  Device-2: Intel Ethernet X550 vendor: Super Micro driver: ixgbe v: kernel
    pcie: gen: 2 speed: 5 GT/s lanes: 8 port: N/A bus-ID: 61:00.1
    chip-ID: 8086:1563 class-ID: 0200
  IF: eno2 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Info: services: NetworkManager
Drives:
  Local Storage: total: 1.82 TiB used: 25.32 GiB (1.4%)
  SMART Message: Required tool smartctl not installed. Check --recommends
  ID-1: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 870 EVO 2TB
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 3B6Q scheme: MBR
Partition:
  ID-1: / raw-size: 1.82 TiB size: 1.79 TiB (98.37%) used: 25.32 GiB (1.4%)
    fs: ext4 dev: /dev/dm-0 maj-min: 254:0
    mapped: luks-5ad75e2b-6b15-442b-a01e-12101132d822
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 51.8 C mobo: N/A
  Fan Speeds (rpm): N/A
Info:
  Memory: total: 256 GiB note: est. available: 251.73 GiB used: 7.36 GiB (2.9%)
  Processes: 2035 Power: uptime: 1d 8h 1m states: freeze,mem,disk
    suspend: s2idle wakeups: 0 hibernate: shutdown
    avail: reboot,suspend,test_resume image: 100.68 GiB
    services: upowerd,xfce4-power-manager Init: systemd v: 255
    default: graphical tool: systemctl
  Packages: pm: pacman pkgs: 1100 libs: 324 tools: pamac pm: flatpak pkgs: 0
    Compilers: clang: 16.0.6 gcc: 13.2.1 Shell: Bash v: 5.2.26
    running-in: xfce4-terminal inxi: 3.3.33

I ran ‘inxi -Fazy’ while running some 9 threaded number tests using a Linux program. For me the CPU core speed is likely to be the culprit, so I will try to figure it out and do some testing with the suggestions above to set a higher default clock speed. Thanks for the feedback :slight_smile:

If I’m reading that correctly, your system is clocking to the correct speed according to AMD’s specs. As I said, Epyc CPUs aren’t meant to be that fast; they’re meant to be power efficient. You can overclock, sure, but that’s an overclock.

That being said, have you looked at alternative implementations of Python? If you’re only running pure Python code, there’s a good chance that PyPy could speed it up, as that’s a fair chunk faster. On average, PyPy is about 5x faster, but that average hides some really monstrous speedups for some code.

Editing to add: It looks like the Manjaro packaged versions of PyPy are currently a little broken, so you may have to download PyPy from https://www.pypy.org/. They’re not completely broken - just some weirdness (I think someone might have compiled in some default code? Not entirely sure)

calculating pi:
100%|██████████████████████████████████████████████| 33554431/33554431 [00:09<00:00, 3458013.93it/s]

calculating fib recursive:
 97%|█████████████████████████████████████████████████████████████▎ | 37/38 [00:18<00:00,  1.98it/s]

calculating fib iterative:
100%|█████████████████████████████████████████████████| 1048574/1048574 [00:10<00:00, 102368.05it/s]

benchmark time: 0:00:39.346528

OK, so not great, but probably acceptable for the CPU’s rated speed. I think I understand what’s going on now - thanks all for the responses. This thing still thinks it’s a server. :slight_smile: The cores are grouped into CCXs of 3 with a lot of L3 cache, and I have noticed quite a big improvement going back to single-core threads. Time to rethink how I manage my workload. I will need to understand the BIOS settings a lot better. Cheers.

{edit} As an example, a large-number (non-Python) routine runs in 10 mins single-threaded on this machine, versus 30 mins multi-threaded, versus 50 mins on a Windows box with a Ryzen 9. So my thinking is too old school; I need to understand thread handling better. More is not always better.

In summary, this is not a Manjaro/Python issue at all. I will investigate the PyPy option to see if it can speed up execution times, but I’m glad I asked the question. Thanks for the help!!

This thing is quick!! :slight_smile:

Well, it is a server :slight_smile: And in fact, it’s a server with a rather complex architecture. Not only do you have multiple CCXs, but you’ve got two CPUs as well.

Definitely look into minimising your inter-thread communication, and if threads have to communicate a lot, locate them on at least the same CPU, if not the same CCX. Also I can’t help but notice you used the term threads for Python code - Python has a global interpreter lock, so only one thread can execute Python code at once (some C extensions release the lock and can run in parallel, but otherwise you get no parallelisation). You may want to look into the multiprocessing module to fix that.
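A minimal sketch of both points together, using separate processes (so the GIL is not shared) and pinning each one to a small group of cores with `os.sched_setaffinity`, which is Linux-only. It assumes that consecutive logical CPU numbers belong to the same 3-core CCX; that is an assumption and should be checked against the real topology (for example with `lscpu`) before relying on it. The names `cpu_groups` and `pinned_work` and the toy workload are made up for illustration.

```python
import os
from multiprocessing import Pool

CORES_PER_GROUP = 3  # assumption: one CCX = 3 consecutively numbered cores


def cpu_groups(cpus, size):
    """Split the available CPU ids into consecutive groups of `size`."""
    cpus = sorted(cpus)
    return [cpus[i:i + size] for i in range(0, len(cpus), size)]


def pinned_work(args):
    """Pin the current worker process to one CPU group, then do CPU-bound work."""
    group, n = args
    if hasattr(os, "sched_setaffinity"):  # not available on Windows or macOS
        os.sched_setaffinity(0, group)
    return group, sum(i * i for i in range(n))


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        available = os.sched_getaffinity(0)
    else:
        available = range(os.cpu_count() or 1)
    groups = cpu_groups(available, CORES_PER_GROUP)[:4]  # a few groups for the demo
    with Pool(processes=len(groups)) as pool:
        for group, total in pool.map(pinned_work, [(g, 2_000_000) for g in groups]):
            print(f"CPUs {group}: {total}")
```

Each task here is independent, so nothing crosses a CCX or socket boundary while it runs; work that does need to share data would ideally be kept within one group.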