Message boards : BOINC client : Scheduler weakness
Author | Message |
---|---|
Joined: 30 Dec 05 Posts: 475 |
I run several projects, with Seti having the largest resource share. The main project I rely on for work when Seti is down is Einstein. Because of the problems Seti had a couple of months ago, Einstein got a much bigger share than normal and therefore went severely into LTD. This Einstein debt, rather than shrinking, has grown larger, because every time Seti has the slightest glitch the scheduler decides it MUST get work from somewhere. As the other projects quite often have no work available, it downloads more from Einstein. This wouldn't be so bad if it downloaded one unit per CPU, but no, 'IT' decides that because I run a one-day cache 'IT' will download enough work for at least 24 hours/CPU. This usually means 30+ hours of work (six S5R3 units). Because the resource share is set so that one Einstein unit/CPU can be done within the deadline, and it received three times more work than I want, BOINC then goes into EDF mode and the LTD for Einstein keeps growing. Therefore we need a mechanism that downloads units from a project in heavy LTD only if the work on our computers is going to run out in the next few hours, and then downloads only the minimum of work, i.e. one unit/CPU and no more. Further requests can be made if the difficulty continues. I am running BOINC 5.10.xx, connected 24/7. |
Joined: 30 Dec 05 Posts: 475 |
"The issue is that on the one hand you want to have a 24-hour buffer (connect once per 24 hours), yet on the other hand BOINC never knows whether any project has work. It will simply follow your instruction and fill up that buffer. Now if Seti is Main and other projects are just fillers for idle time, set their resource share / weight very low. With BOINC 5.10.x you can approach the situation differently, but I don't know how it will work out in your situation." That is what I do: connection/cache is exactly as you suggest, and the resource share for Einstein is 7% on the Pent M and 4% on the C2D, which just allows enough time to do one unit/CPU within the deadline without going into EDF. I have to assume it is a BOINC problem because the same pattern is happening on both computers. I could micro-manage mine, the Pent M, but don't want to. The C2D is my son's and is not always here. Andy |
Joined: 9 Sep 05 Posts: 128 |
"I have to assume it is a BOINC problem because the same pattern is happening on both computers." This is BOINC by design as I see it, and IMHO it's flawed. If I understand it right, the BOINC client accumulates LTD when a certain project is out of work, inaccessible or down for some time (among other reasons). IMHO this is wrong. I agree that the BOINC client should accumulate (and spend) LTD if the client itself runs into some kind of problem (e.g. no internet connectivity, EDF, ...). However, I think that LTD shouldn't accumulate if the problems are on the project's side. If a project doesn't provide any work, then IMHO this project is actually voluntarily giving up its resource share. Other project troubles (such as server crashes) are not voluntary as such, but the time needed to bring the project up again is in the project staff's hands (well, more or less). Hence the same reasoning: if a project can't provide users with work, LTD should not build up. The distinction between client-side and project-side troubles is not always trivial. There is a mechanism to check whether the BOINC client is unable to contact a project's server because of local or remote connection problems (everybody has noticed that the BOINC client sometimes connects to google), so this kind of distinction could be made. My humble opinion is that LTD should build up only if the client is in EDF or if the BOINC client cannot fetch new work from any of the attached projects. Metod ... |
Joined: 19 Jan 07 Posts: 1179 |
"I have to assume it is a BOINC problem because the same pattern is happening on both computers." See this post. |
Joined: 9 Sep 05 Posts: 128 |
"I have to assume it is a BOINC problem because the same pattern is happening on both computers." I volunteer my resources to BOINC projects and, unlike some others, I accept BOINC as is. This doesn't stop me from being annoyed about the scheduler's behaviour though. :) Metod ... |
Joined: 29 Aug 05 Posts: 304 |
@Metod I may be reading your post wrong, but I think the LTD works closer to what you are requesting than you seem to think. When a project is in a deferral for any reason (including NNW and suspensions) the LTD basically does not change. It changes when another project or projects is hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized but it usually takes much longer to make a substantial difference. BOINC WIKI BOINCing since 2002/12/8 |
Joined: 30 Dec 05 Posts: 475 |
It certainly takes much more time than one would originally think to sort itself out. This problem I have with Einstein, a large negative LTD while it still downloads too many units, has been going on for at least three months, since Aug. And because it downloads too many units, the LTD has grown from -200k to over -400k as it stands now. We have aborted all Einstein units on the C2D and hope the AP units from Seti Beta run OK and decrease the LTD that way. Andy |
Joined: 9 Sep 05 Posts: 128 |
"When a project is in a deferral for any reason (including NNW and suspensions) the LTD basically does not change. It changes when another project or projects is hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized but it usually takes much longer to make a substantial difference." I agree with what you write. But the problematic behaviour is when the client cannot connect to the project scheduler (e.g. when the project scheduler itself is down) ... in this case LTD builds up. As far as I can see, this case (connection to the project scheduler unsuccessful) is handled in just the same way as the case when the client doesn't request any work for other reasons (e.g. EDF, too low LTD, ...). Metod ... |
Joined: 29 Aug 05 Posts: 304 |
"When a project is in a deferral for any reason (including NNW and suspensions) the LTD basically does not change. It changes when another project or projects is hogging the CPU or the queue on the host. There is some drift when all LTDs are normalized but it usually takes much longer to make a substantial difference." When the client cannot get work, it causes a deferral and the LTD does not change. If the client has a full queue when the deferral expires, then the LTD will start moving again. BOINC WIKI BOINCing since 2002/12/8 |
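
The long-term debt (LTD) bookkeeping described in the replies above can be sketched roughly as follows. This is a simplified illustration, not the BOINC client's actual code: the `Project` class, the `update_ltd` function and the `deferred` flag are hypothetical names, and the sketch assumes that over each accounting period a project's debt changes by the CPU time its resource share entitled it to minus the CPU time it actually received, that projects in a deferral are skipped, and that all debts are then shifted so they average zero.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """Hypothetical stand-in for a project attached to the client."""
    name: str
    resource_share: float
    ltd: float = 0.0        # long-term debt, in CPU seconds
    deferred: bool = False  # True while the project is in a deferral


def update_ltd(projects, cpu_seconds_used, period_seconds):
    """Update debts for one accounting period (assumed simplified rule).

    cpu_seconds_used maps a project name to the CPU seconds it received
    during the period; missing names count as zero.
    """
    eligible = [p for p in projects if not p.deferred]
    if not eligible:
        return
    total_share = sum(p.resource_share for p in eligible)
    for p in eligible:
        # CPU time the project was entitled to, minus what it really got.
        entitled = period_seconds * p.resource_share / total_share
        p.ltd += entitled - cpu_seconds_used.get(p.name, 0.0)
    # Normalize across all projects so the debts average zero. A deferred
    # project is only touched here, which is where slow drift can come from.
    mean = sum(p.ltd for p in projects) / len(projects)
    for p in projects:
        p.ltd -= mean


if __name__ == "__main__":
    seti = Project("Seti", resource_share=93.0)
    einstein = Project("Einstein", resource_share=7.0)
    # One hour in which Einstein (running in EDF) takes the whole CPU.
    update_ltd([seti, einstein], {"Einstein": 3600.0}, 3600.0)
    print(round(seti.ltd), round(einstein.ltd))
```

Under these assumptions, an hour in which Einstein holds the CPU while Seti is eligible but gets nothing moves Einstein's debt down by 3348 seconds and Seti's up by the same amount. If Seti is marked as deferred instead, the same hour leaves both debts unchanged, which matches the behaviour described for deferrals.
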
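
The work-fetch rule requested in the opening post (ask a heavily indebted project for work only when the queue is about to run dry, and then only for one unit per CPU) could look something like the sketch below. Everything in it is hypothetical: the function name, its parameters, the 3-hour stand-in for "the next few hours" and the -100,000 second threshold for "heavy LTD" are illustrative assumptions, not BOINC settings.

```python
LOW_WATER_SECONDS = 3 * 3600        # assumed meaning of "the next few hours"
HEAVY_DEBT_THRESHOLD = -100_000.0   # assumed cut-off for "heavy LTD"


def seconds_to_request(ltd, queued_seconds_per_cpu, ncpus,
                       est_unit_seconds, cache_seconds):
    """Return how many seconds of work to request from one project.

    ltd                     project's long-term debt (negative = overworked)
    queued_seconds_per_cpu  work already on the host, per CPU
    est_unit_seconds        estimated run time of one work unit
    cache_seconds           the user's cache size (e.g. one day)
    """
    if ltd > HEAVY_DEBT_THRESHOLD:
        # Not heavily in debt: top the cache up as usual.
        shortfall = max(0.0, cache_seconds - queued_seconds_per_cpu)
        return shortfall * ncpus
    if queued_seconds_per_cpu > LOW_WATER_SECONDS:
        # Heavily in debt and the host is not about to run dry: ask nothing.
        return 0.0
    # Heavily in debt but nearly out of work: one unit per CPU, no more.
    return est_unit_seconds * ncpus


if __name__ == "__main__":
    # Two CPUs, roughly 5-hour units, one-day cache, LTD of -400k.
    print(seconds_to_request(-400_000.0, 10 * 3600, 2, 5 * 3600, 24 * 3600))
    print(seconds_to_request(-400_000.0, 1 * 3600, 2, 5 * 3600, 24 * 3600))
```

With ten hours of work still queued per CPU the sketch requests nothing from the indebted project. With one hour left it requests two units' worth (36,000 seconds), whereas the ordinary cache-filling branch would request 165,600 seconds in the same situation.
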
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.