Message boards : BOINC client : Benchmarking bug - indefinite suspension of computing
Author | Message |
---|---|
Joined: 5 Oct 06 Posts: 5144 |
Just come across this as a result of a problem-solving session at SETI - seems to be reproducible in current (v5.10.45) version for Windows. Scenario: BOINC running as a service on Windows XP. Do that typical end-user thing of using the system clock as a holiday planner (checking a date next month). Inadvertently click 'OK' instead of 'cancel' - sets the clock a month ahead. Some time later, you (or Internet Time) notice that the clock is wrong, and move it back to the correct month. BOINC computation stops with an endless benchmark shortly after the second time change. {Edit - opened trac ticket [trac]#588[/trac]}. You'll get a message log something like this: 2008-03-30 12:28:49 [Einstein@Home] Resuming task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436 2008-04-30 14:09:40 [---] Running CPU benchmarks 2008-04-30 14:09:40 [---] Suspending computation - running CPU benchmarks 2008-04-30 14:09:42 [---] [benchmark_debug] Starting floating-point benchmark 2008-04-30 14:09:52 [---] [benchmark_debug] Ended floating-point benchmark 2008-04-30 14:09:57 [---] [benchmark_debug] Starting integer benchmark 2008-04-30 14:10:07 [---] [benchmark_debug] Ended integer benchmark 2008-04-30 14:10:10 [---] [benchmark_debug] Ended benchmark 2008-04-30 14:10:11 [---] [benchmark_debug] CPU 0 has finished 2008-04-30 14:10:11 [---] [benchmark_debug] 1 out of 1 CPUs done 2008-04-30 14:10:11 [---] [benchmark_debug] CPU 0: fp 1038127090.301003 int 1675825412.162456 intloops 27696000.000000 inttime 9.406250 2008-04-30 14:10:11 [---] Benchmark results: 2008-04-30 14:10:11 [---] Number of CPUs: 1 2008-04-30 14:10:11 [---] 1038 floating point MIPS (Whetstone) per CPU 2008-04-30 14:10:11 [---] 1676 integer MIPS (Dhrystone) per CPU 2008-04-30 14:10:12 [---] Resuming computation 2008-03-30 14:13:41 [---] Running CPU benchmarks 2008-03-30 14:13:41 [---] Suspending computation - running CPU benchmarks 2008-03-30 14:17:24 [---] Exit requested by user To pause/resume tasks hit CTRL-C, to 
exit hit CTRL-BREAK StartServiceCtrlDispatcher being called. This may take several seconds. Please wait. 2008-03-30 14:17:26 [---] Starting BOINC client version 5.10.13 for windows_intelx86 2008-03-30 14:17:26 [---] log flags: task, file_xfer, sched_ops, benchmark_debug 2008-03-30 14:17:26 [---] Libraries: libcurl/7.16.1 OpenSSL/0.9.8e zlib/1.2.3 2008-03-30 14:17:26 [---] Executing as a daemon 2008-03-30 14:17:26 [---] Data directory: C:Program FilesBOINC 2008-03-30 14:17:26 [---] BOINC is running as a service and as a non-system user. 2008-03-30 14:17:26 [---] No application graphics will be available. 2008-03-30 14:17:27 [Einstein@Home] Found app_info.xml; using anonymous platform 2008-03-30 14:17:27 [SETI@home] Found app_info.xml; using anonymous platform 2008-03-30 14:17:27 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4] 2008-03-30 14:17:27 [---] Processor features: fpu tsc sse sse2 mmx 2008-03-30 14:17:27 [---] Memory: 511.30 MB physical, 1.22 GB virtual 2008-03-30 14:17:27 [---] Disk: 37.24 GB total, 4.92 GB free 2008-03-30 14:17:27 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default 2008-03-30 14:17:27 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work 2008-03-30 14:17:27 [---] General prefs: from Einstein@Home (last modified 2007-12-07 10:01:47) 2008-03-30 14:17:27 [---] Host location: home 2008-03-30 14:17:27 [---] General prefs: using separate prefs for home 2008-03-30 14:17:27 [---] Preferences limit memory usage when active to 511.30MB 2008-03-30 14:17:27 [---] Preferences limit memory usage when idle to 511.30MB 2008-03-30 14:17:27 [---] Preferences limit disk usage to 4.92GB 2008-03-30 14:17:27 [---] Running CPU benchmarks 2008-03-30 14:17:30 [---] [benchmark_debug] Starting floating-point benchmark 2008-03-30 14:17:40 [---] [benchmark_debug] Ended floating-point benchmark 2008-03-30 
14:17:46 [---] [benchmark_debug] Starting integer benchmark 2008-03-30 14:17:55 [---] [benchmark_debug] Ended integer benchmark 2008-03-30 14:17:59 [---] [benchmark_debug] Ended benchmark 2008-03-30 14:18:01 [---] [benchmark_debug] CPU 0 has finished 2008-03-30 14:18:01 [---] [benchmark_debug] 1 out of 1 CPUs done 2008-03-30 14:18:01 [---] [benchmark_debug] CPU 0: fp 1044264943.457189 int 1983678791.328194 intloops 29952000.000000 inttime 8.593750 2008-03-30 14:18:01 [---] Benchmark results: 2008-03-30 14:18:01 [---] Number of CPUs: 1 2008-03-30 14:18:01 [---] 1044 floating point MIPS (Whetstone) per CPU 2008-03-30 14:18:01 [---] 1984 integer MIPS (Dhrystone) per CPU 2008-03-30 14:18:09 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436 2008-03-30 14:21:45 [---] Exit requested by user StartServiceCtrlDispatcher being called. This may take several seconds. Please wait. 30-Mar-2008 14:22:49 [---] Starting BOINC client version 5.10.45 for windows_intelx86 30-Mar-2008 14:22:49 [---] log flags: task, file_xfer, sched_ops, benchmark_debug 30-Mar-2008 14:22:49 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3 30-Mar-2008 14:22:49 [---] Executing as a daemon 30-Mar-2008 14:22:49 [---] Data directory: C:Program FilesBOINC 30-Mar-2008 14:22:49 [---] BOINC is running as a service and as a non-system user. 30-Mar-2008 14:22:49 [---] No application graphics will be available. 
30-Mar-2008 14:22:49 [Einstein@Home] Found app_info.xml; using anonymous platform 30-Mar-2008 14:22:49 [SETI@home] Found app_info.xml; using anonymous platform 30-Mar-2008 14:22:49 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4] 30-Mar-2008 14:22:49 [---] Processor features: fpu tsc sse sse2 mmx 30-Mar-2008 14:22:49 [---] OS: Microsoft Windows XP: Home Edition, Service Pack 2, (05.01.2600.00) 30-Mar-2008 14:22:49 [---] Memory: 511.30 MB physical, 1.22 GB virtual 30-Mar-2008 14:22:49 [---] Disk: 37.24 GB total, 4.83 GB free 30-Mar-2008 14:22:49 [---] Local time is UTC +1 hours 30-Mar-2008 14:22:49 [---] Version change (5.10.13 -> 5.10.45) 30-Mar-2008 14:22:49 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default 30-Mar-2008 14:22:49 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work 30-Mar-2008 14:22:49 [---] General prefs: from Einstein@Home (last modified 07-Dec-2007 10:01:47) 30-Mar-2008 14:22:50 [---] Host location: home 30-Mar-2008 14:22:50 [---] General prefs: using separate prefs for home 30-Mar-2008 14:22:50 [---] Preferences limit memory usage when active to 511.30MB 30-Mar-2008 14:22:50 [---] Preferences limit memory usage when idle to 511.30MB 30-Mar-2008 14:22:50 [---] Preferences limit disk usage to 4.83GB 30-Mar-2008 14:22:50 [---] Running CPU benchmarks 30-Mar-2008 14:22:53 [---] [benchmark_debug] Starting floating-point benchmark 30-Mar-2008 14:23:02 [---] [benchmark_debug] Ended floating-point benchmark 30-Mar-2008 14:23:07 [---] [benchmark_debug] Starting integer benchmark 30-Mar-2008 14:23:17 [---] [benchmark_debug] Ended integer benchmark 30-Mar-2008 14:23:21 [---] [benchmark_debug] Ended benchmark 30-Mar-2008 14:23:23 [---] [benchmark_debug] CPU 0 has finished 30-Mar-2008 14:23:23 [---] [benchmark_debug] 1 out of 1 CPUs done 30-Mar-2008 14:23:23 [---] [benchmark_debug] CPU 0: fp 
1030769230.769231 int 1932793606.913454 intloops 31200000.000000 inttime 9.187500 30-Mar-2008 14:23:23 [---] Benchmark results: 30-Mar-2008 14:23:23 [---] Number of CPUs: 1 30-Mar-2008 14:23:23 [---] 1031 floating point MIPS (Whetstone) per CPU 30-Mar-2008 14:23:23 [---] 1933 integer MIPS (Dhrystone) per CPU 30-Mar-2008 14:23:32 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436 30-Apr-2008 14:24:32 [---] Running CPU benchmarks 30-Apr-2008 14:24:32 [---] Suspending computation - running CPU benchmarks 30-Apr-2008 14:24:35 [---] [benchmark_debug] Starting floating-point benchmark 30-Apr-2008 14:25:08 [---] [benchmark_debug] Ended floating-point benchmark 30-Apr-2008 14:25:10 [---] [benchmark_debug] Starting integer benchmark 30-Apr-2008 14:25:12 [---] [benchmark_debug] Ended integer benchmark 30-Apr-2008 14:25:14 [---] [benchmark_debug] Ended benchmark 30-Apr-2008 14:25:16 [---] [benchmark_debug] CPU 0 has finished 30-Apr-2008 14:25:16 [---] [benchmark_debug] 1 out of 1 CPUs done 30-Apr-2008 14:25:16 [---] [benchmark_debug] CPU 0: fp 1032357177.148344 int 1912681740.570187 intloops 5776000.000000 inttime 1.718750 30-Apr-2008 14:25:16 [---] Benchmark results: 30-Apr-2008 14:25:16 [---] Number of CPUs: 1 30-Apr-2008 14:25:16 [---] 1032 floating point MIPS (Whetstone) per CPU 30-Apr-2008 14:25:16 [---] 1913 integer MIPS (Dhrystone) per CPU 30-Apr-2008 14:25:17 [---] Resuming computation 30-Mar-2008 14:27:00 [---] Running CPU benchmarks 30-Mar-2008 14:27:00 [---] Suspending computation - running CPU benchmarks 30-Mar-2008 14:30:04 [---] Exit requested by user StartServiceCtrlDispatcher being called. This may take several seconds. Please wait. 
30-Mar-2008 14:30:06 [---] Starting BOINC client version 5.10.45 for windows_intelx86 30-Mar-2008 14:30:06 [---] log flags: task, file_xfer, sched_ops, benchmark_debug 30-Mar-2008 14:30:06 [---] Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3 30-Mar-2008 14:30:06 [---] Executing as a daemon 30-Mar-2008 14:30:06 [---] Data directory: C:Program FilesBOINC 30-Mar-2008 14:30:06 [---] BOINC is running as a service and as a non-system user. 30-Mar-2008 14:30:06 [---] No application graphics will be available. 30-Mar-2008 14:30:06 [Einstein@Home] Found app_info.xml; using anonymous platform 30-Mar-2008 14:30:06 [SETI@home] Found app_info.xml; using anonymous platform 30-Mar-2008 14:30:07 [---] Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 2.00GHz [x86 Family 15 Model 2 Stepping 4] 30-Mar-2008 14:30:07 [---] Processor features: fpu tsc sse sse2 mmx 30-Mar-2008 14:30:07 [---] OS: Microsoft Windows XP: Home Edition, Service Pack 2, (05.01.2600.00) 30-Mar-2008 14:30:07 [---] Memory: 511.30 MB physical, 1.22 GB virtual 30-Mar-2008 14:30:07 [---] Disk: 37.24 GB total, 4.84 GB free 30-Mar-2008 14:30:07 [---] Local time is UTC +1 hours 30-Mar-2008 14:30:07 [Einstein@Home] URL: http://einstein.phys.uwm.edu/; Computer ID: 1036916; location: home; project prefs: default 30-Mar-2008 14:30:07 [SETI@home] URL: http://setiathome.berkeley.edu/; Computer ID: 1791152; location: work; project prefs: work 30-Mar-2008 14:30:07 [---] General prefs: from Einstein@Home (last modified 07-Dec-2007 10:01:47) 30-Mar-2008 14:30:07 [---] Host location: home 30-Mar-2008 14:30:07 [---] General prefs: using separate prefs for home 30-Mar-2008 14:30:07 [---] Preferences limit memory usage when active to 511.30MB 30-Mar-2008 14:30:07 [---] Preferences limit memory usage when idle to 511.30MB 30-Mar-2008 14:30:07 [---] Preferences limit disk usage to 4.84GB 30-Mar-2008 14:30:07 [---] Running CPU benchmarks 30-Mar-2008 14:30:10 [---] [benchmark_debug] Starting floating-point benchmark 30-Mar-2008 
14:30:20 [---] [benchmark_debug] Ended floating-point benchmark 30-Mar-2008 14:30:26 [---] [benchmark_debug] Starting integer benchmark 30-Mar-2008 14:30:36 [---] [benchmark_debug] Ended integer benchmark 30-Mar-2008 14:30:38 [---] [benchmark_debug] Ended benchmark 30-Mar-2008 14:30:40 [---] [benchmark_debug] CPU 0 has finished 30-Mar-2008 14:30:40 [---] [benchmark_debug] 1 out of 1 CPUs done 30-Mar-2008 14:30:40 [---] [benchmark_debug] CPU 0: fp 1029145728.643216 int 1641931187.505236 intloops 26640000.000000 inttime 9.234375 30-Mar-2008 14:30:40 [---] Benchmark results: 30-Mar-2008 14:30:40 [---] Number of CPUs: 1 30-Mar-2008 14:30:40 [---] 1029 floating point MIPS (Whetstone) per CPU 30-Mar-2008 14:30:40 [---] 1642 integer MIPS (Dhrystone) per CPU 30-Mar-2008 14:30:43 [Einstein@Home] Restarting task h1_0907.30_S5R3__166_S5R3b_1 using einstein_S5R3 version 436 |
Joined: 5 Oct 06 Posts: 5144 |
[quote]Use a utility like LClock (LonghornClock) and you have a clock in place of the regular one, with a calendar function and sans risk of changing the real clock inadvertently. It's an additional menu option to get thru or 1 click, Calendar, double click, the time setting applet. Been using it now on XP for a few years.[/quote] Yes, yes, yes, but....... Can you force every Windows XP user to install a read-only clock/calendar? (At least this problem will go away when Vista is universal on the desktop). And can you train every Windows XP user to cancel out of the existing clock, every time they check a date? (When I hit this problem in the real world, with a telesales database, the problem went away when we upgraded from Windows 98 to domain-controlled Windows 2000 for the sales floor, and gave the users restricted rights. But the boss's orders were still all over the place, because his logon had to have administrative rights and could change the clock. But I digress). The BUG is that benchmarking starts, but doesn't complete. There should be no way that that can happen, period. It's called fault-tolerance, and it should apply even when the 'fault' is a naïve user. |
Joined: 5 Oct 06 Posts: 5144 |
[quote]Just offered a temporary solution. You can do what you please with it.[/quote] That's fine. We solved the problem brought to us by the original SETI user. The logs I posted from my own machine were self-inflicted in the interests of research. As an amateur programmer myself, I know all the mantras which programmers utter when a bug report comes in, and I've used most of them myself. "Is it reproducible?" "What did the message actually say?" "Are you using the latest version?" "What were you doing at the time?" "Is your antivirus up-to-date?" "Have you applied the latest service pack?" "Oh no, our program could never do that." (often false) "Is your video/printer driver up-to-date?" "Why on earth do you want to do that?" etc. etc. I just think that it's grown-up and responsible to try to work through as many of them as possible before I open a trac ticket. |
Joined: 29 Aug 05 Posts: 147 |
Hmm. This is not the first time a clock change has caused problems. I know that under Windows there is a function "GetTickCount" that returns the number of milliseconds since the machine started rather than the absolute time. BOINC WIKI |
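To illustrate the idea behind GetTickCount, here is a minimal sketch in Python rather than in the client's own code. It uses `time.monotonic()`, a standard-library clock that is not affected by changes to the system date, to measure elapsed time. The class name `ElapsedTimer` and the injectable `clock` parameter are assumptions made for this example only; nothing here is taken from BOINC.

```python
import time


class ElapsedTimer:
    """Measures elapsed seconds with a clock that ignores system-clock changes.

    Hypothetical helper for illustration; not part of BOINC.
    """

    def __init__(self, clock=time.monotonic):
        # `clock` is injectable so the behaviour can be checked with a fake clock.
        self._clock = clock
        self._start = clock()

    def elapsed(self):
        # Difference of two monotonic readings: never negative, and unchanged
        # if someone sets the calendar date forward or back in between.
        return self._clock() - self._start


timer = ElapsedTimer()
print(f"elapsed so far: {timer.elapsed():.6f} s")
```

An interval measured this way would not grow by a month when the date is set forward, nor shrink when it is set back. Whether that alone would be enough for the client is a separate question, since absolute deadlines still need a calendar time.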
Joined: 8 Jan 06 Posts: 36 |
It's not just the benchmark that has problems if you move the clock forward past its next scheduled run time and then back. Everything stops for whatever time interval the jump covered when you go back. The only exception found was that OS-initiated DST changes are handled properly. IOW, patching it so the benchmark completes once it starts, regardless of anything else that happens with the clock, won't fix the problem entirely. In addition, my observation when I was working the problem with a different user over in SAH is that the 'damage' to the time metrics comes from the leap forward, which creates a big, seemingly idle period BOINC cannot account for. Fortunately, when the jump back occurs it doesn't interpret this as having somehow miraculously amplified its computational abilities and set the metrics accordingly! That probably explains why it resorts to just suspending everything until the 'mystery' period goes away. Alinator |
Joined: 5 Oct 06 Posts: 5144 |
Also noted. Since performing self-sacrifice in the name of science, I'm getting <active_frac>0.045279</active_frac>, even though this machine runs 24/7/365. |
Joined: 30 Dec 05 Posts: 475 |
Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with project servers? |
Joined: 19 Jan 07 Posts: 1179 |
[quote]Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with project servers?[/quote] That article is about synchronizing the system clock. BOINC doesn't and shouldn't have permissions to change the system-wide clock. |
Joined: 30 Dec 05 Posts: 475 |
[quote]Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with project servers?[/quote] I can see why in some situations it should not be tolerated, but why not have an option to sync the clock to the project clock? Win XP has an option: bring up the clock's date/time properties and go to the Internet Time tab. But this option only syncs once a week or manually, and will not adjust if the clock is more than 15 hrs wrong. |
Joined: 16 Apr 06 Posts: 386 |
Basically your suggestion is: * Store a time offset for each project in client_state.xml (i.e., the offset needed from system time to match project time) * Periodically, and also whenever the system clock changes significantly, update the offset from a project NTP server. |
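The two steps above could be sketched roughly as follows. This is only an illustration of the suggestion, not how BOINC stores or uses time: the class name `ProjectClock`, its methods, and the idea of passing in a server-reported time as a plain number of seconds are all assumptions made for the example.

```python
import time


class ProjectClock:
    """Keeps a per-project offset instead of touching the system clock.

    Hypothetical sketch of the proposal; not BOINC's implementation.
    """

    def __init__(self, system_clock=time.time):
        self._system_clock = system_clock
        # Seconds to add to system time to obtain the project's time.
        self.offset = 0.0

    def update_offset(self, project_time):
        """Recompute the offset from a time reported by the project's server."""
        self.offset = project_time - self._system_clock()

    def now(self):
        """Project time, estimated as system time plus the stored offset."""
        return self._system_clock() + self.offset
```

Note that the stored offset goes stale the moment the system clock is changed, which is why the second step (refreshing it whenever the clock changes significantly) matters: until `update_offset` is called again, `now()` would be wrong by the size of the jump.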
Joined: 30 Dec 05 Posts: 475 |
Yes, to be blunt. There is no real need for times in BOINC Manager to be anything other than UTC, except for the saving of files in local time. All the web pages run on UTC only. And me being me, having had 25 years in worldwide military comms, running everything in Zulu time is second nature, so if BOINC went UTC everywhere, I would turn off the switch to BST and run UTC all year. |
Joined: 5 Oct 06 Posts: 5144 |
Guys, guys, .... There are going to be problems with time recording for as long as we have clocks and different timezones. We can argue till the cows come home (and what time is that? Do cows know about DST?) whether it is the duty of BOINC and other software to ignore, correct, notify or work round such errors as it finds. But while - or preferably before - we discuss all these arcana, there's a bug to be fixed. It seems that, under certain circumstances as documented in my trac ticket, BOINC just STOPS. Period. That isn't on my list of acceptable responses to a time glitch. It appears that there's a code sequence along the lines of: IF <weird time detected> SUSPEND computing for benchmarking WAIT [color=red]<for something that isn't going to happen>[/color] START <floating point benchmark>. I'm wondering if this is something that was introduced round about v5.8.16: - core client: if benchmark time is in the future (due to user tweak) always run benchmarks - if it came in with a fix like that, it might explain why it was missed in pre-release testing. Anyone got a better idea? |
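One way to read the pseudo-code above is as two separate decisions: whether benchmarks are due, and how long to wait for them once computation is suspended. The sketch below is a guess at what a defensive version of each might look like; it is not taken from the BOINC source. The function names, the five-day `BENCHMARK_PERIOD` and the 60-second timeout are assumptions chosen for illustration.

```python
# Assumed interval between benchmark runs, in seconds (illustrative value).
BENCHMARK_PERIOD = 5 * 86400


def benchmarks_due(last_run, now, period=BENCHMARK_PERIOD):
    """Decide whether benchmarks should run, tolerating a clock set backwards."""
    if last_run > now:
        # The last run appears to lie in the future, so the wall clock has
        # been moved back; rerun rather than wait for time to catch up.
        return True
    return now - last_run >= period


def wait_expired(started_mono, now_mono, timeout=60.0):
    """Give up on a pending benchmark after `timeout` seconds.

    Both arguments are readings from a monotonic clock, so setting the
    calendar date forward or back cannot stretch the wait.
    """
    return now_mono - started_mono >= timeout
```

With a guard like `wait_expired`, a benchmark that never starts would at worst cost the timeout before computation resumes, instead of leaving everything suspended indefinitely.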
Joined: 16 Apr 06 Posts: 386 |
... But that's exactly what we were talking about. If the time checking in this benchmark processing used the offset plus the system clock, rather than just the system clock on its own, then the bug would be solved.
It'd be a significant design change, and everywhere that refers to time would need to refer to the adjusted time instead. If the benchmark bug were fixed in isolation, you'd still have the remaining problems, which are just as bad - the DCF goes haywire, and processing gets suspended for a month or whatever, which we've seen several times before with older versions. The offset would solve these other bugs simultaneously. |
Joined: 19 Jan 07 Posts: 1179 |
[quote]Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with project servers?[/quote] Because it's none of BOINC's business to keep the system clock correct, and because NTP servers are way more accurate than project servers. From version 6, BOINC will be installed under its own account with very low privileges, definitely not enough to change the clock (which is an admin privilege). |
Joined: 29 Aug 05 Posts: 147 |
[quote]Does this http://support.microsoft.com/kb/307897 help? And could it be incorporated into BOINC so that the client is in time sync with project servers?[/quote] Actually, under Windows, BOINC probably DOES have the permissions to set the system clock. Not that it has any business doing so. BOINC WIKI |
Joined: 5 Oct 06 Posts: 5144 |
[quote].... and because NTP servers are way more accurate than project servers.[/quote] Which raises another question: why would anyone run a BOINC server which isn't set up for automatic synchronisation with an NTP server? Matt Lebofsky at SETI occasionally forgets when he's setting up a new web server made up of cannibalised/donated parts, and it soon becomes obvious - the latest post on a message board is some time in the future (SETI has separate servers for the database and the web front end). But he always corrects it as soon as he notices or I point it out ;-) |
Joined: 30 Dec 05 Posts: 475 |
Considering the results of the BOINC survey: most computers are at home, 90%+ run Windows, and, if enabled, the clock is only checked against an NTP server once a week. A lot are on 24/7. Therefore by default, even if BOINC projects were not the original objective, BOINC is the computer's primary function. Therefore why not let them use a BOINC project to check, and if desired reset, the clock? My son's old P4 computer had a regular habit of corrupting its BIOS; luckily it was one with a backup copy in ROM. But when it did this it reverted to the default settings, date 01/01/1980. And if set to reboot on resumption of power after a power break, it could easily go unnoticed for days. The clock is not that important to a teenager playing games, but it is to the BOINC client. |
Joined: 5 Oct 06 Posts: 5144 |
Another selling point for BOINC! Keeps your computer running smoothly and safely (because, IIRC, Windows Update fails if the local computer clock is sufficiently different from the Microsoft servers' estimation of time). |
Joined: 30 Dec 05 Posts: 475 |
[quote]Therefore by default, even if BOINC projects were not the original objective, BOINC is the computer's primary function. Therefore why not let them use a BOINC project to check, and if desired reset, the clock?[/quote] I agree that fixing this bug may be the best option, but having the computer clock wrong also affects the scheduler. Having the ability to detect that the clock is wrong, and either post a warning or offer an option to correct the clock automatically, would therefore be beneficial. |
Joined: 3 Apr 06 Posts: 547 |
Although it would take longer to implement, the better (more reliable) and politically safer fix is the other one suggested in this thread, wherein BOINC, if I understand correctly, would keep its own "clock" and leave the system clock alone. I suppose that's what is being referred to as a monotonic clock. I suspect that only the OS is capable of delivering such clock values correctly (be it e.g. the GetTickCount() mentioned by JM7, translated to a date/time value, keeping in mind e.g. hibernation times etc.) Peter |
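A private clock of the kind described here could, as a rough sketch, be anchored to the wall clock once and then advanced only by a monotonic counter. The Python below illustrates that idea and how it would expose a jump in the system clock. `ClientClock` and its method names are invented for this example, and whether the monotonic source keeps counting during hibernation depends on the platform, so that case is not handled.

```python
import time


class ClientClock:
    """A private clock: wall time at start-up plus monotonic elapsed time.

    Hypothetical sketch; not how the BOINC client keeps time.
    """

    def __init__(self, wall=time.time, mono=time.monotonic):
        self._wall = wall
        self._mono = mono
        # Read both clocks once; afterwards only the monotonic one drives now().
        self._anchor_wall = wall()
        self._anchor_mono = mono()

    def now(self):
        """Date/time value derived from the anchor and monotonic elapsed time."""
        return self._anchor_wall + (self._mono() - self._anchor_mono)

    def wall_clock_jump(self):
        """Seconds by which the system clock has diverged from this clock."""
        return self._wall() - self.now()
```

A non-zero `wall_clock_jump()` of about a month would correspond to the scenario in this thread: the client could log a warning and carry on using `now()`, leaving the system clock itself untouched.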
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.