hyperthreading and processor affinity in linux with taskset
Author | Message |
---|---|
Joined: 29 Oct 14 Posts: 2 |
tl;dr: To force CPU affinity so that the two logical cores of one physical core do not get work at the same time, set the computing preference to 50% of CPUs, then run taskset against the BOINC process:

    taskset -p bit_mask boinc-PID
    taskset -p 0x0000003f 1234   # use logical procs 0-5
    taskset -p 0x00000fc0 1234   # or: use logical procs 6-11

You will need to pause/restart jobs for the children to adhere to the mask.

edit: Children only inherit this CPU mask when they are created. It seems I just got lucky the first time. When I tried to toggle from cores 0-5 to 6-11, some of the processes stuck with the old affinity and ended up doubling up on some cores. Using taskset on the individual running tasks sets the affinity immediately. Below is the quick and dirty way I checked and set the child tasks' CPU affinity. Adjust accordingly.

    # Check affinity of seti tasks
    for i in $(ps -ae | grep seti | awk '{print $1}'); do taskset -p "$i"; done
    # Set seti@home task affinity to logical procs 6-11
    for i in $(ps -ae | grep seti | awk '{print $1}'); do taskset -p fc0 "$i"; done

/edit

I've got my first hyperthreading CPU (5820K) and I wondered about hyperthreading on vs off, so I read quite a few posts. I agreed with the consensus to leave it on and run one thread per logical core to keep things always busy.

I've run BOINC tasks 24/7 on all my PCs for ~12 years. I have never been impacted by it running. Until now. I have noticed my PC getting sluggish and hanging for a few seconds here and there on any user interaction. Unacceptable on a shiny new well spec'd PC (32 GB DDR4, SSD, GTX 970; Slackware 14.1 with an up-to-date kernel).

So I returned to my thought that maybe I should limit processing to one thread of work per physical core, leaving the 2nd virtual core available for the system to possibly leverage a bit. Setting CPU usage to 50% alone still seems to throw workload on any core, and often enough it still ends up doubling up on one physical core while leaving another physical core idle. I am also doing GPU work units.
I have read several posts suggesting you get better GPU throughput with fewer jobs than cores, as well as with hyperthreading disabled altogether.

So I set the processor affinity with taskset. The above example sets the affinity not only of the boinc process but of its -new- children as well. The 0x0000003f mask restricts the work to logical cores 0-5 (and avoids 6-11, which are the respective 2nd logical cores of the same physical cores). Re-reading the config file or running CPU benchmarks will suspend work and then restart it, and all new jobs will run on the CPUs set by the mask. The taskset manpage explains it all. I used "turbostat" in addition to plain ole "top" to verify the workload distribution across CPU cores.

I just stumbled onto taskset about an hour ago. So far it's working OK. I even changed the mask on the fly to use the alternate cores (6-11). As jobs finish and new ones start, they shift over to the new CPU selection, or you can do something to stop and restart computation (without stopping boinc itself).

I am also experimenting with telling boinc to use 60% CPU and run 7 jobs on the 6 cores it has allocated. That seems to keep the CPUs at 100% more steadily.

I have not done any overall workload throughput tests. Based on the results of others who have, the semi-wild-ass-guess is that I would get 10% more WUs done with all 12 hyperthread cores getting work. A 10% bump with a sluggish system is not a viable trade-off.

I did search, and I did not see any posts like this. I hope it helps someone. Or alternately, if someone reads this and knows what else might be causing boinc work to bog down the system, please share.

Other thoughts... The thought crossed my mind to alternate between logical cores 0-5 and 6-11, on maybe a daily or weekly basis, to keep the duplicated bits equally worn out. lol. I'm sure that is a very silly thought to have.

ATM: running 7 jobs + GPU tasks on 6 physical cores only is keeping the CPUs at 100%. I will step it down to 6 jobs and see if it loses steam.
This might also vary depending on the project. I am running equal shares of seti, milkyway, and climateprediction.net. I also have a few spare Cosmology@home tasks that are finishing up (I am leaving that project and not accepting more work).

Cheers |
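For anyone adapting the masks in the post above: each bit of a taskset mask stands for one logical CPU (bit 0 is CPU 0), so the mask for a contiguous range can be computed instead of worked out by hand. A minimal sketch, assuming a POSIX shell; `cpu_range_mask` is a made-up helper name for illustration, not part of taskset:

```shell
#!/bin/sh
# Hypothetical helper: print the hex affinity mask that covers
# logical CPUs first..last (inclusive). Bit N of the mask is CPU N.
cpu_range_mask() {
    first=$1
    last=$2
    mask=0
    cpu=$first
    while [ "$cpu" -le "$last" ]; do
        mask=$(( mask | (1 << cpu) ))   # switch on the bit for this CPU
        cpu=$(( cpu + 1 ))
    done
    printf '0x%x\n' "$mask"
}

cpu_range_mask 0 5     # prints 0x3f
cpu_range_mask 6 11    # prints 0xfc0
```

The second result matches the `fc0` used in the loop that sets the seti task affinity. taskset can also take a CPU list instead of a mask with `-c`, for example `taskset -cp 6-11 1234`, which sidesteps the hex arithmetic entirely.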
Joined: 29 Oct 14 Posts: 2 |
I did a little more looking into what was causing my system to bog down. It was just my environment. I was storing my BOINC dir on my server. NFS over a 1 Gb link was never a problem for any work units until I started doing climateprediction tasks.

I just looked at the size of the project folder: 9 GB. I figured it must be doing crazy high IO, and that this was grinding things to a halt on my network and slowing other things down. I saw lots of iowait on the server. So I moved it to local storage (SSD), then looked a little closer at it while it ran. It was regularly hitting 100-200 MB/sec of disk IO (over half-second averages, as viewed via iostat/iotop).

So this project was IO bound, and it kept my network and server overtasked. I initially had my perf monitoring logger polling every 2 minutes, and the longer averaging window was masking the high IO.

So the above nonsense about core affinity might not really be needed. Still running with it for now. |
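The way a long polling interval hides IO bursts can be shown with a few made-up numbers. A minimal sketch, assuming a POSIX shell with awk; `io_mean_and_peak` is a hypothetical helper, and the samples are illustrative values, not measurements from this thread:

```shell
#!/bin/sh
# Hypothetical helper: read one throughput sample (MB/s) per line on stdin
# and print the mean over the whole window next to the highest sample.
io_mean_and_peak() {
    awk 'BEGIN { peak = 0 }
         { sum += $1; if ($1 > peak) peak = $1 }
         END { if (NR > 0) printf "mean=%.1f peak=%.1f\n", sum / NR, peak }'
}

# Illustrative window of ten samples: two at 200 MB/s, then idle.
printf '%s\n' 200 200 0 0 0 0 0 0 0 0 | io_mean_and_peak
# prints: mean=40.0 peak=200.0
```

The longer the window, the further the mean drops below the burst rate, which is why short sampling intervals in iostat or iotop exposed what the 2-minute logger smoothed over.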
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.