I run a Python 3 program on my Manjaro system to process big text files.
The CPU and memory usage is too low, and the runtime is too long.
How can I make full use of the CPU and memory?
CPU 1: 100%, CPU 2-6: less than 10%
Memory: 3.2 GB of 23.3 GB
Hi @Alston, and welcome!
I strongly suspect that what you’re looking for is multi-threading, and that is the responsibility, if you can call it that, of the script. A.k.a. it has to be put in by the programmer.
In CPython (the standard interpreter), a Python program runs its Python code on only one CPU core at a time, even when more than one core is present and whether or not you use multithreading, because of the GIL (GlobalInterpreterLock - Python Wiki).
Interesting. Don’t use Python, so didn’t know that. But, that would explain it then.
That is to say, no way to improve CPU usage?
And how about memory usage?
Then which language would be faster?
According to this answer on Stack Exchange:
The answer is “Yes, But…”
But CPython cannot when you are using regular threads for concurrency.
You can either use something like multiprocessing, celery or mpi4py to split the parallel work into another process; or you can use something like Jython or IronPython to use an alternative interpreter that doesn’t have a GIL.
A softer solution is to use libraries that don’t run afoul of the GIL for heavy CPU tasks, for instance numpy can do the heavy lifting while not retaining the GIL, so other Python threads can proceed. You can also use the ctypes library in this way.
If you are not doing CPU bound work, you can ignore the GIL issue entirely (kind of) since Python won’t acquire the GIL while it’s waiting for IO.
So, it can be done. I think. I don’t know which language would be best, though.
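For what it’s worth, the multiprocessing route from the quoted answer could look roughly like the sketch below. It is only an illustration under assumptions: the names read_chunks, count_words and count_words_parallel are made up here, and word counting stands in for whatever the real script does with each line. The file is read in chunks of lines, and each chunk is handed to a separate worker process, so several cores can be busy at once.

```python
import multiprocessing as mp
import sys
from collections import Counter
from itertools import islice


def read_chunks(path, lines_per_chunk=100_000):
    """Yield lists of lines, one chunk at a time (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        while True:
            chunk = list(islice(f, lines_per_chunk))
            if not chunk:
                return
            yield chunk


def count_words(lines):
    """CPU-bound work for one chunk; a placeholder for the real processing."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def count_words_parallel(path, workers=None):
    """Send chunks to a pool of worker processes and merge the partial results."""
    total = Counter()
    # processes=None lets the pool pick a worker count based on the CPU count
    with mp.Pool(processes=workers) as pool:
        for partial in pool.imap_unordered(count_words, read_chunks(path)):
            total.update(partial)
    return total


if __name__ == "__main__":
    # Usage: python script.py /path/to/big.txt
    if len(sys.argv) > 1:
        print(count_words_parallel(sys.argv[1]).most_common(10))
```

Whether this actually helps depends on the work per chunk: if each chunk is cheap to process, the cost of sending lines between processes can eat the gain, so larger chunks are usually the first thing to try.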
Edit:
There is also this to look at, but, once again, I’m not a Python programmer, that’s just what I found on Google.
Memory usage can usually be reduced by smart programming, using Python features such as generators and iterators and by optimizing the code, but it all depends on what your program does and how it works.
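As a small illustration of the generator point, here is a sketch that streams a file line by line instead of loading it whole. The function names and the substring filter are assumptions made up for the example, not anything from the original script.

```python
def iter_matching_lines(path, needle):
    """Lazily yield lines containing `needle`; only one line is held at a time."""
    with open(path, encoding="utf-8") as f:
        for line in f:  # a file object is itself an iterator over its lines
            if needle in line:
                yield line.rstrip("\n")


def count_matches(path, needle):
    """Consume the generator without ever building a list of the matches."""
    return sum(1 for _ in iter_matching_lines(path, needle))
```

Compared with f.read() or f.readlines(), memory use here stays roughly constant however large the file is.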
If you need to write a serious solution in Python, I would split the program into two main parts: a main program that handles the input, and workers that run as separate processes controlled by the main program. The main program creates separate tasks and sends them to the workers, waits until they finish, and sends the next task. There are a number of libraries for IPC (inter-process communication) that could be used.
If you’re not bound to Python, I would pick Go as the language of choice for such a task, since it’s much faster thanks to compilation to a native binary and it makes better use of all CPU cores.
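A rough sketch of that master/worker layout, using only the standard library’s multiprocessing queues for the IPC part, might look like this. The names process_task, worker and run_master are hypothetical, and counting characters is just a stand-in for the real per-batch work.

```python
import multiprocessing as mp
import sys
from itertools import islice


def process_task(lines):
    """Placeholder for the real per-batch work: count characters."""
    return sum(len(line) for line in lines)


def worker(tasks, results):
    """Worker loop: take tasks until the master sends the None sentinel."""
    for task_id, lines in iter(tasks.get, None):
        results.put((task_id, process_task(lines)))


def run_master(batches, n_workers=4):
    """Master: start workers, hand out batches, return results in task order."""
    tasks = mp.Queue(maxsize=2 * n_workers)  # bounded, so input is not read far ahead
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results)) for _ in range(n_workers)]
    for p in procs:
        p.start()

    n_tasks = 0
    for task_id, lines in enumerate(batches):
        tasks.put((task_id, lines))  # blocks while the task queue is full
        n_tasks += 1
    for _ in procs:
        tasks.put(None)  # one sentinel per worker

    collected = dict(results.get() for _ in range(n_tasks))
    for p in procs:
        p.join()
    return [collected[i] for i in range(n_tasks)]


if __name__ == "__main__":
    # Usage: python script.py /path/to/big.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as f:
            # Batches of 50,000 lines; iteration stops at the first empty batch
            batches = iter(lambda: list(islice(f, 50_000)), [])
            print(sum(run_master(batches)))
```

For many cases multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor does the same job with less code; the explicit queues are shown only because they map directly onto the master/worker description.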