Task exited with zero status but no 'finished' file
Author | Message |
---|---|
Joined: 13 Dec 08 Posts: 4 |
I've been getting this problem with BOINC v6.2.19: the "no heartbeat" error appears in stderr.txt. I've traced it to my PC's RTC losing time. The current BOINC task gets reset to 0% either when I manually sync (using a third-party time sync tool) and the RTC has lost more than 30 sec., or every x hours and 58 minutes, whether I am connected to the internet or not, where x is an integer starting at 0. Note that I have the Windows automatic time sync turned off. I've tried v6.4.5 and it has the same problem. I remember seeing on a forum thread that there was a plan to base the heartbeat measurement on interrupts rather than the PC's clock, which would get around this problem. Does anyone know if that will be the case, and when it will be introduced? |
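To illustrate the idea raised in the post above, here is a minimal Python sketch of a heartbeat timeout measured against an injectable clock. It is not BOINC's code: `HeartbeatMonitor` and `FakeClock` are hypothetical names, and the 30-second threshold is taken from the figure mentioned in the thread. The point is only that a timeout computed from a clock that can be stepped (such as the wall clock) expires the moment the clock is moved forward by more than the threshold, whereas a monotonic clock is unaffected by such adjustments.

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds; the threshold mentioned in the thread


class HeartbeatMonitor:
    """Tracks the time since the last heartbeat using an injectable clock."""

    def __init__(self, clock=time.monotonic, timeout=HEARTBEAT_TIMEOUT):
        self._clock = clock
        self._timeout = timeout
        self._last_beat = clock()

    def beat(self):
        """Record that a heartbeat was just received."""
        self._last_beat = self._clock()

    def expired(self):
        """True if more than `timeout` seconds appear to have passed."""
        return self._clock() - self._last_beat > self._timeout


class FakeClock:
    """Hypothetical stand-in for a wall clock that can be stepped by hand."""

    def __init__(self, start=0.0):
        self.now = start

    def __call__(self):
        return self.now

    def jump(self, seconds):
        self.now += seconds


# A wall clock stepped forward by 31 s makes the heartbeat look stale at once.
wall = FakeClock(start=1000.0)
monitor = HeartbeatMonitor(clock=wall)
wall.jump(31.0)
print(monitor.expired())  # True

# time.monotonic() is not affected by clock adjustments, so a monitor built
# on it does not expire just because the system time was changed.
steady = HeartbeatMonitor()
print(steady.expired())  # False
```

This sketch only covers a forward step. How the real client and science application react to a backward step is not something the thread establishes.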
Joined: 29 Aug 05 Posts: 15585 |
You may want to look at this FAQ for clues on where the problem comes from. In the meantime, do not use a 6.4 version of BOINC unless you want to test the CUDA option. It's far from stable. |
Joined: 13 Dec 08 Posts: 4 |
Thanks for the reply. I uninstalled v6.4.5 and reinstalled v6.2.19. I'm sure that in my case it's due to the PC clock losing time, then jumping more than 30 sec. after a manual sync, which triggers the fault. After I reinstalled v6.2.19, I allowed BOINC to perform benchmarks and restart computation. Then I synced, and that caused the usual fault plus an interesting extra benchmarking fault (note how the time jumps back just after the first "restarting" message when I synced):
13/12/2008 2:02:42 PM||Starting BOINC client version 6.2.19 for windows_intelx86
13/12/2008 2:02:42 PM||log flags: task, file_xfer, sched_ops
13/12/2008 2:02:42 PM||Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
13/12/2008 2:02:42 PM||Running as a daemon
13/12/2008 2:02:42 PM||Data directory: C:\BOINC\Data
13/12/2008 2:02:42 PM||Running under account boinc_master
13/12/2008 2:02:42 PM||Processor: 1 AuthenticAMD AMD Athlon(TM) XP 2600+ [x86 Family 6 Model 10 Stepping 0]
13/12/2008 2:02:42 PM||Processor features: fpu tsc sse 3dnow mmx
13/12/2008 2:02:42 PM||OS: Microsoft Windows XP: Home x86 Editon, Service Pack 3, (05.01.2600.00)
13/12/2008 2:02:42 PM||Memory: 767.35 MB physical, 1.83 GB virtual
13/12/2008 2:02:42 PM||Disk: 74.49 GB total, 17.37 GB free
13/12/2008 2:02:42 PM||Local time is UTC +0 hours
13/12/2008 2:02:42 PM||Version change (6.4.5 -> 6.2.19)
13/12/2008 2:02:42 PM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 4173368; location: home; project prefs: home
13/12/2008 2:02:42 PM||General prefs: from SETI@home (last modified 08-Dec-2008 20:26:44)
13/12/2008 2:02:42 PM||Computer location: home
13/12/2008 2:02:42 PM||General prefs: using separate prefs for home
13/12/2008 2:02:42 PM||Preferences limit memory usage when active to 383.67MB
13/12/2008 2:02:42 PM||Preferences limit memory usage when idle to 690.61MB
13/12/2008 2:02:42 PM||Preferences limit disk usage to 0.93GB
13/12/2008 2:02:42 PM||Running CPU benchmarks
13/12/2008 2:02:42 PM||Suspending network activity - time of day
13/12/2008 2:03:13 PM||Benchmark results:
13/12/2008 2:03:13 PM|| Number of CPUs: 1
13/12/2008 2:03:13 PM|| 1531 floating point MIPS (Whetstone) per CPU
13/12/2008 2:03:13 PM|| 2541 integer MIPS (Dhrystone) per CPU
13/12/2008 2:03:15 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603
13/12/2008 2:02:09 PM||Running CPU benchmarks
13/12/2008 2:02:09 PM||Suspending computation - running CPU benchmarks
13/12/2008 2:03:20 PM||[error] FP benchmark ran only 0.953125 sec; ignoring
13/12/2008 2:03:20 PM||[error] CPU benchmarks error
13/12/2008 2:03:21 PM||Resuming computation
13/12/2008 2:03:26 PM|SETI@home|Task 21oc08ae.30412.5798.8.8.188_0 exited with zero status but no 'finished' file
13/12/2008 2:03:26 PM|SETI@home|If this happens repeatedly you may need to reset the project.
13/12/2008 2:03:26 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603
Plus in stderr.txt:
Work Unit Info:
...............
WU true angle range is : 0.440438
Optimal function choices:
----------------------------------------------------- name -----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrum 0.00217 0.00000
sse1_ChirpData_ak 0.02845 0.00000
No heartbeat from core client for 30 sec - exiting
setiathome_enhanced 6.02 DevC++/MinGW
libboinc: 6.3.6
I look forward to a non-PC RTC heartbeat solution being coded into BOINC :-) |
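The backward step in the log above can be located mechanically. The following Python sketch assumes each log line starts with a `DD/MM/YYYY H:MM:SS AM/PM` timestamp followed by `|`, as in the excerpt, and that the default (English) locale is used for the AM/PM marker. `parse_timestamp` and `backward_jumps` are hypothetical helper names, not part of BOINC.

```python
from datetime import datetime


def parse_timestamp(line):
    """Parse the leading 'DD/MM/YYYY H:MM:SS AM/PM' stamp of a log line."""
    stamp = line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, "%d/%m/%Y %I:%M:%S %p")


def backward_jumps(lines):
    """Return (previous, current, seconds_back) for every backward step."""
    jumps = []
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        if previous is not None and current < previous:
            seconds_back = (previous - current).total_seconds()
            jumps.append((previous, current, seconds_back))
        previous = current
    return jumps


# Three consecutive lines taken from the log excerpt in the post.
sample = [
    "13/12/2008 2:03:15 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603",
    "13/12/2008 2:02:09 PM||Running CPU benchmarks",
    "13/12/2008 2:03:20 PM||[error] FP benchmark ran only 0.953125 sec; ignoring",
]

jumps = backward_jumps(sample)
for prev, cur, back in jumps:
    print(f"clock stepped back {back:.0f} s: {prev.time()} -> {cur.time()}")
```

On these three lines it reports a single step of 66 seconds, from 14:03:15 back to 14:02:09, which is larger than the 30-second heartbeat window.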
Joined: 13 Dec 08 Posts: 4 |
"Maybe you just need to replace the battery that powers the clock." Yep, I tried that when this first started happening. It didn't make any difference. I think it's software related, since after some reboots I don't have the PC-clock-losing-time problem. I've virus checked and spyware checked, and SETI (in my case) is the only thing consuming significant processor time. I've tried setting CPU usage to 90% in the online preferences, but that didn't help. Actually, I was surprised to see that changing to 90% CPU means that the process consumes 100% for 90% of the time and 0% for 10% of the time, i.e. it's not a constant 90%. But you're right that I really need to sort out my PC-clock-losing-time problem. This isn't the place to address that, I guess. It's only that the "Task exited with zero status but no 'finished' file" message has been around for a while, and on the BOINC Wiki it says: "One of the causes of this message seems to be the setting of the computer's clock. When the time is adjusted the BOINC Daemon and the Science Application seem to get out of step. This should be fixed in the 4.7x/5.0.x release of the BOINC Client Software." (Link here) And we're now at 6.2.x. OK, that is the unofficial BOINC Wiki, and most people's PC clocks will be accurate enough (i.e. within +/-30 sec., which is a large error anyway) for this not to be a problem, so it's not going to be a priority. |
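The throttling behaviour described in the post above (full speed for part of the time, idle for the rest) is a duty cycle. As a rough Python sketch, with `duty_cycle` a hypothetical helper and the 10-second period an arbitrary assumption (the thread does not say what interval the client actually uses):

```python
def duty_cycle(cpu_percent, period_s=10.0):
    """Split one throttling period into (run_seconds, sleep_seconds).

    The task runs flat out for run_seconds and is suspended for
    sleep_seconds, so only the average over the period equals cpu_percent.
    """
    if not 0 <= cpu_percent <= 100:
        raise ValueError("cpu_percent must be between 0 and 100")
    run_s = period_s * cpu_percent / 100.0
    return run_s, period_s - run_s


run_s, sleep_s = duty_cycle(90)
print(run_s, sleep_s)  # 9.0 1.0
```

A monitor sampling at a finer interval than the period would therefore show 100% and 0% alternately rather than a steady 90%.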
Joined: 19 Jan 07 Posts: 1179 |
In many operating systems, while the computer is running, the clock is kept up-to-date separately from the "hardware clock" (the battery-backed one). |
Joined: 13 Dec 08 Posts: 4 |
OK, I need to admit something I've only just noticed. I'm using a very handy freeware time synchronisation tool called Karen's Time Sync. When I've been syncing my PC clock, it's been giving me a negative time difference, which I assumed meant the amount of time my PC clock was behind the online clock. Now that I have actually read it more closely, the time adjustment is "Difference/Adjustment to our clock". In other words, time is being subtracted from my PC clock - my PC clock is gaining time. I rebooted my PC shortly after my last post and found my PC clock was maintaining accuracy (e.g. after about 18 hours, the error was 0.2 sec). Dagorath said: You need to make a list of all the possible reasons then eliminate them 1 by 1. Well, I started running suspect programs that I thought might cause this problem, but the clock maintained perfect accuracy. I fiddled with the time, and tried rebooting, and just pressing the reset button, and now, after another reboot, it's gaining time like the clappers again - approx. 8 seconds every 4 minutes. Nicolas said: In many operating systems, while the computer is running, the clock is kept up-to-date separately from the "hardware clock" (the battery-backed one). Now that was something I didn't know. I found an open source program called ClockMon which shows the difference between the hardware clock and the operating system clock. Both ClockMon and Karen's Time Sync allow you to measure the difference between clocks without actually synchronising them with anything. And guess what? The difference between the hardware clock and the operating system clock is the same as the difference between the operating system clock and the online clock. In other words, the hardware clock is in sync with the online clock and maintaining time correctly - it's the operating system clock that's wrong and gaining time. 
Elsewhere I have read that at boot time the operating system clock syncs itself with the hardware clock, then the operating system clock is maintained by interrupts generated by the motherboard bus "clock", while the hardware clock runs from a separate crystal on its own oscillator. I can understand how software could cause the operating system clock to lose time by making it miss interrupts from the motherboard bus, but how could it ever gain time? I have two workarounds: 1. reboot repeatedly until the clocks stay in sync, or 2. use ClockMon to keep my clocks within 30 seconds (I disconnect my internet connection, so can't use an internet-based sync tool). Based on my approximate measurement above, I need to sync the operating system clock with the hardware clock every 60 minutes to not exceed a 30 second difference. Which is interesting, because from my first post I found that worst case BOINC lost the "heartbeat" and reset work units every 58 minutes. Thanks for your help guys :-) |
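The sync interval in the post above follows from simple rate arithmetic, which can be checked with a short Python sketch. `minutes_to_reach` and `drift_per_interval` are hypothetical helper names, and the inputs are the figures quoted in the thread.

```python
def minutes_to_reach(limit_s, drift_s, per_minutes):
    """Minutes until the accumulated drift reaches limit_s seconds."""
    rate_s_per_min = drift_s / per_minutes
    return limit_s / rate_s_per_min


def drift_per_interval(limit_s, minutes_to_limit, interval_minutes):
    """Drift per interval implied by reaching limit_s in minutes_to_limit."""
    return limit_s / minutes_to_limit * interval_minutes


# Quoted drift: about 8 seconds gained every 4 minutes.
print(minutes_to_reach(30, 8, 4))  # 15.0

# Quoted reset period: 30 seconds of drift reached after about 58 minutes.
print(round(drift_per_interval(30, 58, 4), 2))  # 2.07
```

Taken at face value, 8 seconds per 4 minutes would reach the 30-second limit in 15 minutes, whereas a 58 to 60 minute interval corresponds to roughly 2 seconds per 4 minutes. The two quoted figures may therefore have been measured on different occasions or under different conditions. The thread does not say which.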
Joined: 19 Jan 07 Posts: 1179 |
Elsewhere I have read that at boot time the operating system clock syncs itself with the hardware clock, then the operating system clock is maintained by interrupts generated by the motherboard bus "clock", while the hardware clock runs from a separate crystal on its own oscillator. And on Linux, it gets saved back to the hardware clock on shutdown. I'm not sure how it works on Windows; maybe it saves it whenever you change it. I once had problems when trying to work around DST issues. I changed my system clock, hours later I had a power outage, and when it booted again, I got the old time back. The hardware clock hadn't been updated, because it updates on a *clean* shutdown, obviously not on a power outage! I can understand how software could cause the operating system clock to lose time by making it miss interrupts from the motherboard bus, but how could it ever gain time? If you were losing time because it misses interrupts, you could be having lots of other problems anyway... |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.