Thread 'Task exited with zero status but no 'finished' file'

Message boards : BOINC client : Task exited with zero status but no 'finished' file

Mike

Joined: 13 Dec 08
Posts: 4
United Kingdom
Message 21765 - Posted: 13 Dec 2008, 9:36:26 UTC

I've been getting this problem with BOINC v6.2.19, along with the "No heartbeat" error in stderr.txt. It comes down to my PC's RTC losing time. The current BOINC task gets reset to 0% either when I manually sync (using a third-party time sync tool) and the RTC has lost more than 30 seconds, or every x hours and 58 minutes, whether I am connected to the internet or not, where x is an integer starting at 0. Note that I have the Windows automatic time sync turned off.

I've tried v6.4.5 and it has the same problem.

I remember seeing on a forum thread that there was a plan to base the heartbeat measurement on interrupts rather than the PC's clock, which would get around this problem. Does anyone know if that will be the case, and when it will be introduced?
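
To illustrate the idea in general terms, here is a minimal Python sketch of a heartbeat watchdog that can run on either a wall clock or a monotonic clock. It is not BOINC's code: the class and function names are hypothetical, and the only figure taken from this thread is the 30-second timeout. The simulated clocks show a forward step of the wall clock tripping the wall-clock watchdog while the monotonic one is unaffected.

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds; the "30 sec" quoted in the stderr message


class HeartbeatWatchdog:
    """Remembers when the last heartbeat arrived, using an injectable clock."""

    def __init__(self, clock=time.monotonic, timeout=HEARTBEAT_TIMEOUT):
        self._clock = clock
        self._timeout = timeout
        self._last_beat = clock()

    def beat(self):
        # Call this whenever a heartbeat is received.
        self._last_beat = self._clock()

    def timed_out(self):
        return self._clock() - self._last_beat > self._timeout


class FakeClocks:
    """Simulated clocks (hypothetical helper, for demonstration only)."""

    def __init__(self):
        self.elapsed = 0.0      # real time that has passed
        self.wall_offset = 0.0  # adjustments applied by a time sync tool

    def monotonic(self):
        return self.elapsed

    def wall(self):
        return 1_000_000.0 + self.elapsed + self.wall_offset


def simulate_sync_jump(jump_seconds):
    """Return (wall_clock_timed_out, monotonic_timed_out) after one clock step."""
    clocks = FakeClocks()
    wall_dog = HeartbeatWatchdog(clock=clocks.wall)
    mono_dog = HeartbeatWatchdog(clock=clocks.monotonic)
    clocks.elapsed += 1.0               # one second really passes
    clocks.wall_offset += jump_seconds  # the sync tool steps the wall clock
    return wall_dog.timed_out(), mono_dog.timed_out()


if __name__ == "__main__":
    print(simulate_sync_jump(45.0))  # wall-clock watchdog fires, monotonic does not
    print(simulate_sync_jump(5.0))   # neither fires
```

In real code, time.monotonic() is the standard-library clock that is not affected by system clock updates, which is why it is the default here.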
ID: 21765
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15585
Netherlands
Message 21766 - Posted: 13 Dec 2008, 9:56:45 UTC - in response to Message 21765.  

You may want to look at this FAQ for clues on where the problem comes from.

In the meantime, do not use a 6.4 version of BOINC unless you want to test the CUDA option. It's far from stable.
ID: 21766
Mike

Joined: 13 Dec 08
Posts: 4
United Kingdom
Message 21769 - Posted: 13 Dec 2008, 14:58:42 UTC - in response to Message 21766.  

Thanks for the reply. I uninstalled v6.4.5 and reinstalled v6.2.19. I'm sure that in my case the fault is triggered by the PC clock losing time and then jumping by more than 30 seconds after a manual sync. After I reinstalled v6.2.19, I allowed BOINC to perform benchmarks and restart computation. Then I synced, and that caused the usual fault plus an interesting extra benchmarking fault (note how the time jumps back just after the first "Restarting" message, when I synced):

13/12/2008 2:02:42 PM||Starting BOINC client version 6.2.19 for windows_intelx86
13/12/2008 2:02:42 PM||log flags: task, file_xfer, sched_ops
13/12/2008 2:02:42 PM||Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
13/12/2008 2:02:42 PM||Running as a daemon
13/12/2008 2:02:42 PM||Data directory: C:\BOINC\Data
13/12/2008 2:02:42 PM||Running under account boinc_master
13/12/2008 2:02:42 PM||Processor: 1 AuthenticAMD AMD Athlon(TM) XP 2600+ [x86 Family 6 Model 10 Stepping 0]
13/12/2008 2:02:42 PM||Processor features: fpu tsc sse 3dnow mmx
13/12/2008 2:02:42 PM||OS: Microsoft Windows XP: Home x86 Editon, Service Pack 3, (05.01.2600.00)
13/12/2008 2:02:42 PM||Memory: 767.35 MB physical, 1.83 GB virtual
13/12/2008 2:02:42 PM||Disk: 74.49 GB total, 17.37 GB free
13/12/2008 2:02:42 PM||Local time is UTC +0 hours
13/12/2008 2:02:42 PM||Version change (6.4.5 -> 6.2.19)
13/12/2008 2:02:42 PM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 4173368; location: home; project prefs: home
13/12/2008 2:02:42 PM||General prefs: from SETI@home (last modified 08-Dec-2008 20:26:44)
13/12/2008 2:02:42 PM||Computer location: home
13/12/2008 2:02:42 PM||General prefs: using separate prefs for home
13/12/2008 2:02:42 PM||Preferences limit memory usage when active to 383.67MB
13/12/2008 2:02:42 PM||Preferences limit memory usage when idle to 690.61MB
13/12/2008 2:02:42 PM||Preferences limit disk usage to 0.93GB
13/12/2008 2:02:42 PM||Running CPU benchmarks
13/12/2008 2:02:42 PM||Suspending network activity - time of day
13/12/2008 2:03:13 PM||Benchmark results:
13/12/2008 2:03:13 PM|| Number of CPUs: 1
13/12/2008 2:03:13 PM|| 1531 floating point MIPS (Whetstone) per CPU
13/12/2008 2:03:13 PM|| 2541 integer MIPS (Dhrystone) per CPU
13/12/2008 2:03:15 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603
13/12/2008 2:02:09 PM||Running CPU benchmarks
13/12/2008 2:02:09 PM||Suspending computation - running CPU benchmarks
13/12/2008 2:03:20 PM||[error] FP benchmark ran only 0.953125 sec; ignoring
13/12/2008 2:03:20 PM||[error] CPU benchmarks error
13/12/2008 2:03:21 PM||Resuming computation
13/12/2008 2:03:26 PM|SETI@home|Task 21oc08ae.30412.5798.8.8.188_0 exited with zero status but no 'finished' file
13/12/2008 2:03:26 PM|SETI@home|If this happens repeatedly you may need to reset the project.
13/12/2008 2:03:26 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603

Plus in stderr.txt:

Work Unit Info:
...............
WU true angle range is : 0.440438
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrum 0.00217 0.00000
sse1_ChirpData_ak 0.02845 0.00000
No heartbeat from core client for 30 sec - exiting
setiathome_enhanced 6.02 DevC++/MinGW
libboinc: 6.3.6

I look forward to a non-PC RTC heartbeat solution being coded into BOINC :-)
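
For anyone wanting to spot this kind of clock step in their own client log, here is a rough Python sketch that scans log lines for timestamps that go backwards. The function name is hypothetical and the timestamp format is assumed from the log excerpt above (day/month/year, 12-hour clock, English AM/PM); other locales or BOINC versions may format it differently.

```python
from datetime import datetime

# Assumed from the excerpt above, e.g. "13/12/2008 2:03:15 PM"
LOG_TIME_FORMAT = "%d/%m/%Y %I:%M:%S %p"


def find_backward_jumps(lines):
    """Return (previous, current, seconds_back) for each timestamp that goes backwards."""
    jumps = []
    previous = None
    for line in lines:
        stamp_text, separator, _rest = line.partition("|")
        if not separator:
            continue  # not a log line in the expected "time|project|message" layout
        try:
            stamp = datetime.strptime(stamp_text.strip(), LOG_TIME_FORMAT)
        except ValueError:
            continue  # timestamp not in the assumed format
        if previous is not None and stamp < previous:
            jumps.append((previous, stamp, (previous - stamp).total_seconds()))
        previous = stamp
    return jumps


# Three lines copied from the log in this post.
SAMPLE = [
    "13/12/2008 2:03:15 PM|SETI@home|Restarting task 21oc08ae.30412.5798.8.8.188_0 using setiathome_enhanced version 603",
    "13/12/2008 2:02:09 PM||Running CPU benchmarks",
    "13/12/2008 2:03:20 PM||[error] FP benchmark ran only 0.953125 sec; ignoring",
]

if __name__ == "__main__":
    for before, after, seconds in find_backward_jumps(SAMPLE):
        print(f"clock stepped back {seconds:.0f} s: {before} -> {after}")
```

On the three sample lines this reports a single backward step of 66 seconds, between the first "Restarting task" message and the second "Running CPU benchmarks" message.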
ID: 21769
Mike

Joined: 13 Dec 08
Posts: 4
United Kingdom
Message 21771 - Posted: 13 Dec 2008, 17:02:06 UTC - in response to Message 21770.  

Maybe you just need to replace the battery that powers the clock.

Yep, I tried that when this first started happening. It didn't make any difference. I think it's software related, since after some reboots I don't have the PC-clock-losing-time problem. I've virus checked and spyware checked, and SETI (in my case) is the only thing consuming significant processor time.

I've tried setting CPU usage to 90% in the online preferences, but that didn't help. Actually, I was surprised to see that changing to 90% CPU means that the process consumes 100% for 90% of the time and 0% for 10% of the time, i.e. it's not a constant 90%.
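
That on/off pattern is what is usually called duty-cycle throttling. A minimal Python sketch of the arithmetic follows. The function name and the 10-second period are assumptions for illustration, not values taken from BOINC.

```python
def throttle_schedule(cpu_limit_percent, period_seconds=10.0):
    """Split one period into a full-speed run slice and an idle slice.

    period_seconds is a hypothetical value chosen for the example.
    """
    if not 0 <= cpu_limit_percent <= 100:
        raise ValueError("cpu_limit_percent must be between 0 and 100")
    run_seconds = period_seconds * cpu_limit_percent / 100.0
    idle_seconds = period_seconds - run_seconds
    return run_seconds, idle_seconds


if __name__ == "__main__":
    run, idle = throttle_schedule(90)
    # The task runs flat out for the run slice, then is suspended for the idle slice.
    print(f"run {run} s at 100%, then idle {idle} s")
```

Averaged over the period this gives the requested 90%, but at any instant the task is either at full speed or stopped, which matches what Task Manager shows.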

But you're right that I really need to sort out my PC-clock-losing-time problem. This isn't the place to address that, I guess. It's only that the "Task exited with zero status but no 'finished' file" message has been around for a while, and on the BOINC Wiki it says:

"One of the causes of this message seems to be the setting of the computer's clock. When the time is adjusted the BOINC Daemon and the Science Application seem to get out of step. This should be fixed in the 4.7x/5.0.x release of the BOINC Client Software." (Link here)

And we're now at 6.2.x. OK, that is the unofficial BOINC Wiki, and most people's PC clocks will be accurate enough (i.e. within +/-30 sec., which is a large error anyway) for this not to be a problem, so it's not going to be a priority.
ID: 21771
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 21782 - Posted: 13 Dec 2008, 23:46:14 UTC - in response to Message 21770.  

In many operating systems, while the computer is running, the clock is kept up-to-date separately from the "hardware clock" (the battery-backed one).
ID: 21782
Mike

Joined: 13 Dec 08
Posts: 4
United Kingdom
Message 21799 - Posted: 14 Dec 2008, 15:17:53 UTC - in response to Message 21782.  

OK, I need to admit something I've only just noticed. I'm using a very handy freeware time synchronisation tool called Karen's Time Sync. When I've been syncing my PC clock, it's been giving me a negative time difference, which I assumed meant the amount of time my PC clock was behind the online clock. Now that I have actually read it more closely, the time adjustment is "Difference/Adjustment to our clock". In other words, time is being subtracted from my PC clock - my PC clock is gaining time.

I rebooted my PC shortly after my last post and found my PC clock was maintaining accuracy (e.g. after about 18 hours, the error was 0.2 sec).

Dagorath said:
You need to make a list of all the possible reasons then eliminate them 1 by 1.

Well, I started running suspect programs that I thought might cause this problem, but the clock maintained perfect accuracy. I fiddled with the time, and tried rebooting, and just pressing the reset button, and now, after another reboot it's gaining time like the clappers again - approx. 8 seconds every 4 minutes.

Nicolas said:
In many operating systems, while the computer is running, the clock is kept up-to-date separately from the "hardware clock" (the battery-backed one).

Now that was something I didn't know. I found an open source program called ClockMon which shows the difference between the hardware clock and the operating system clock. Both ClockMon and Karen's Time Sync allow you to measure the difference between clocks without actually synchronising them with anything. And guess what? The difference between the hardware clock and the operating system clock is the same as the difference between the operating system clock and the online clock. In other words, the hardware clock is in sync with the online clock and maintaining time correctly - it's the operating system clock that's wrong and gaining time.

Elsewhere I have read that at boot time the operating system clock syncs itself with the hardware clock, then the operating system clock is maintained by interrupts generated by the motherboard bus "clock", while the hardware clock runs from a separate crystal on its own oscillator. I can understand how software could cause the operating system clock to lose time by making it miss interrupts from the motherboard bus, but how could it ever gain time?

I have two workarounds: 1. reboot repeatedly until the clocks stay in sync, or 2. use ClockMon to keep my clocks within 30 seconds (I disconnect my internet connection, so I can't use an internet-based sync tool). Based on my approximate measurement above, I need to sync the operating system clock with the hardware clock every 60 minutes to not exceed a 30 second difference. Which is interesting, because from my first post I found that worst case BOINC lost the "heartbeat" and reset work units every 58 minutes.
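
As a rough check of the sync-interval arithmetic, here is a small Python sketch that turns a measured drift into the time it takes to accumulate a given error. The function name is hypothetical; the 30-second threshold and the drift figures come from this post.

```python
def minutes_until_threshold(drift_seconds, over_minutes, threshold_seconds=30.0):
    """Minutes for a steadily drifting clock to accumulate threshold_seconds of error."""
    rate = drift_seconds / over_minutes  # seconds of drift per minute
    if rate <= 0:
        raise ValueError("drift rate must be positive")
    return threshold_seconds / rate


if __name__ == "__main__":
    # The "approx. 8 seconds every 4 minutes" measurement.
    print(minutes_until_threshold(8, 4))
    # The drift that would match a 60-minute sync interval.
    print(minutes_until_threshold(2, 4))
```

At 8 seconds per 4 minutes the 30-second limit would be reached after about 15 minutes, while a 60-minute interval corresponds to roughly 2 seconds per 4 minutes. The measurements were only approximate and the drift varied between reboots, so both figures should be treated as rough.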

Thanks for your help guys :-)
ID: 21799
Nicolas

Joined: 19 Jan 07
Posts: 1179
Argentina
Message 21806 - Posted: 14 Dec 2008, 23:36:32 UTC - in response to Message 21799.  

Elsewhere I have read that at boot time the operating system clock syncs itself with the hardware clock, then the operating system clock is maintained by interrupts generated by the motherboard bus "clock", while the hardware clock runs from a separate crystal on its own oscillator.


And on Linux, it gets saved back to the hardware clock on shutdown. I'm not sure how it works on Windows; maybe it saves it whenever you change it.

I once had problems when trying to work around DST issues. I changed my system clock, hours later I had a power outage, and when it booted again, I got the old time back. The hardware clock hadn't been updated, because it updates on a *clean* shutdown, obviously not on a power outage!

I can understand how software could cause the operating system clock to lose time by making it miss interrupts from the motherboard bus, but how could it ever gain time?

If you were losing time because it misses interrupts, you could be having lots of other problems anyway...

ID: 21806


Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.