BOINC 'suspended' but progress bar keeps moving
Author | Message |
---|---|
Joined: 7 Nov 12 Posts: 7 |
BOINC Ver: 7.0.28 (x64) (Windows). I've noticed recently that when I go AFK from my computer for a bit and then come back, BOINC is still 'crunching' away at tasks, even though it says suspended. It says "Suspended - Computer in use" but the progress bar keeps moving, and it takes between 1 and 10 minutes to stop, kind of like it's trying to "finish" a task, so to speak. I mainly see it with my video cards. Both cards are running at 100% on the GPU, but the CPU is at maybe ~5%. Could there be an issue with GPU/CUDA tasks and BOINC suspend commands? Anyone else have this issue? Yes, preferences are set properly. :P Intel 2600k @ 4.8GHz, GTX 560Ti/GTS450 both OC'd. Many MODs! :) |
Joined: 7 Nov 12 Posts: 7 |
eh? no responses? :P |
Joined: 23 Apr 07 Posts: 1112 |
eh? no responses? :P At which projects? With which of their apps? Best to ask at those projects. Claggy |
Joined: 7 Nov 12 Posts: 7 |
eh? no responses? :P Seti@home :P |
Joined: 23 Apr 07 Posts: 1112 |
eh? no responses? :P With which one of their apps? The Cuda_fermi 6.10 MB app or the OpenCL 6.04 AP app, or are you running anonymous platform with one of a dozen or so other Cuda apps? Can we have a link to your host at Seti, as there are no users called Siroaf there? Claggy |
Joined: 7 Nov 12 Posts: 7 |
eh? no responses? :P My mistake, sorry, lol.. User: Astroman305. Sir Oaf is just for the forums here. I'm not running anything else besides BOINC with Seti. I also haven't sat down and studied every aspect of this program, nor whatever else goes into it. I'm kinda 'set and forget' to a CERTAIN point; it depends on WHAT piques my interest :P Note: It hasn't done what I wrote in my opening post since that day. I was thinking that maybe it was a problem between my screen saver and falling asleep to Netflix on fullscreen, paused. Who knows.. :P |
Joined: 23 Apr 07 Posts: 1112 |
eh? no responses? :P Simple, you're running buggy Nvidia graphics drivers: http://setiathome.berkeley.edu/result.php?resultid=2705433362 setiathome_CUDA: No CUDA devices found. The 295.xx and 296.xx drivers have a Sleeping Monitor Bug: once the monitor goes to sleep, the Cuda device disappears. When the next Wu starts, the Cuda device isn't available, so the Cuda app goes into CPU fallback mode and takes forever to complete the Wu, quite often ending with Maximum Time Exceeded. Either run 290.xx or earlier drivers, or 301.xx and later drivers. I did have a sticky post on the Seti Number Crunching Forum, but it's been replaced by one by Richard: NVidia driver problems which cause computation errors. Claggy |
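The driver ranges named in the post above can be written down as a small check. This is only a sketch: the function name `classify_nvidia_driver` is made up for illustration, and versions between 290.xx and 295.xx or between 296.xx and 301.xx are reported as "unknown" because the post says nothing about them.

```python
def classify_nvidia_driver(version: str) -> str:
    """Classify a driver version string such as '296.10'.

    Hypothetical helper: the ranges come from the forum post, not from
    any NVIDIA or BOINC API.
    """
    major = int(version.split(".")[0])
    if major in (295, 296):
        # Drivers named in the post as having the Sleeping Monitor Bug.
        return "affected"
    if major <= 290 or major >= 301:
        # Drivers the post recommends running instead.
        return "recommended"
    # The post does not mention these versions either way.
    return "unknown"


if __name__ == "__main__":
    for v in ("290.53", "295.73", "297.03", "306.97"):
        print(v, classify_nvidia_driver(v))
```

The driver reported later in the thread (306.97) falls in the "recommended" range under this reading.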
Joined: 7 Nov 12 Posts: 7 |
Thanks for the info! That's weird.. because it's using both of my video cards.. When the program starts after my idle time, both video cards are rackin' away at 100%.. so weird.. :P |
Joined: 23 Apr 07 Posts: 1112 |
Thanks for the info! Both your cards? Only one is being reported: Computer 6821378 Coprocessors NVIDIA GeForce GTX 560 Ti (1023MB) driver: 306.97. Boinc will only use the most capable card by default; the other one will be listed as 'Not Used' in the startup messages. To utilise both, you'll need to make a cc_config.xml with the following in it, drop it in your Boinc Data directory (the location is in your Boinc startup messages and is likely hidden), and restart Boinc: <cc_config> <options> <use_all_gpus>1</use_all_gpus> </options> </cc_config> Client configuration Claggy |
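For anyone who would rather generate the file than type it, here is a minimal sketch that writes the same cc_config.xml content. The function names and the `data_dir` argument are illustrative assumptions; the real data directory is whatever your Boinc startup messages report. The sketch refuses to overwrite an existing cc_config.xml, since you may already have other options in it.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_cc_config() -> ET.Element:
    """Build the <cc_config> tree from the post: use_all_gpus set to 1."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "use_all_gpus").text = "1"
    return root


def write_cc_config(data_dir: Path) -> Path:
    """Write cc_config.xml into data_dir (assumed to be the Boinc Data directory)."""
    target = data_dir / "cc_config.xml"
    if target.exists():
        # Do not clobber a config that may hold other options.
        raise FileExistsError(f"{target} already exists; edit it by hand instead")
    xml_text = ET.tostring(build_cc_config(), encoding="unicode")
    target.write_text(xml_text + "\n", encoding="utf-8")
    return target
```

Call `write_cc_config(Path(...))` with your own data directory, then restart Boinc as described above.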
Joined: 7 Nov 12 Posts: 7 |
Both your cards? Only One is being reported: Wow.. that is NOT the screen I saw last night! lol What it said last night was something along the lines of: GTS 450 GPU [2] something.. I'll copy and paste it when I can, as the computer specs seem to be changing.. That's funny... According to my monitoring software (EVGA Precision), when Seti starts running, both cards go up to 99% and heat up, so I assume they are doing something with Seti.. :P But I'll look into that config file. |
Joined: 11 Dec 12 Posts: 3 |
Hello I've been having a similar problem with my ATI GPU. Except that with mine it just keeps running until I shut down BOINC. It does this with workunits from any of the projects that use my GPU (PrimeGrid, SETI@Home Beta, Einstein@home). I don't remember exactly when it started, but it was within the last few months (I thought it might have just been a bad work unit the first few times I noticed it). I have my preferences set so that BOINC starts running after I'm away for a couple of minutes. When I come back the CPU tasks stop properly, but the GPU one often (not always) keeps running. When it does this, it doesn't respond to me trying to manually suspend it either. It says that it's suspended, but the time keeps progressing, and the task manager says that the process is still running. I have to shut down the BOINC Manager in order to stop the process. Any help or advice would be appreciated. Thanks |
Joined: 29 Aug 05 Posts: 15637 |
And that's with which BOINC version and on which operating system? If it's Linux with any version before 7.0.29, there's a bug in the idle detection for (wireless) (USB) keyboards/mice. Otherwise, make sure that you use the correct preferences, online or local. |
Joined: 5 Oct 06 Posts: 5149 |
And that's with which BOINC version and on which operating system? Provided both activity settings are set correctly: * Run based on preferences * Use GPU based on preferences there's no way that idle detection could work for the CPU but not work for the GPU. I've submitted two documented examples of the Einstein Windows GPU application (BRP4 v1.32) failing to respond to a 'suspend' instruction from BOINC (BRP4 1.31/1.32 GPU app release: feedback thread) In one case, two tasks were running on the same card at the same time: one noticed that it was supposed to suspend, the other carried on regardless. I'm beginning to think this is primarily an application problem, rather than a BOINC problem - it would be helpful if future posters to this thread could indicate which project/application is active when they observe the problem. |
Joined: 22 Dec 10 Posts: 14 |
Me2 on a Win 7/64 Pro w/ATI Firepro 4800 board. BOINC 7.0.28 (x64) This condition of the GPU continuing to work though suspension is indicated does not seem to occur immediately after rebooting but rather after some indeterminate length of time. It's caused me to shut down the connected client for the remainder of my uptime session. Have seen this only in the past 2 weeks, but had been away for the 2 weeks prior. Win 10-64 Pro on: Dual Xeon Quad E5472s 3.0GHz w/128GB DDR2 Main Memory + ATI FirePro W5000 GPU; Quad+HT i7-860 2.8GHz 8GB + ATI FirePro V4800 GPU; Quad Q6600-775 2.4GHz 4GB + ATI FirePro V4800 GPU; + 3 laptops |
Joined: 29 Aug 05 Posts: 15637 |
Me2 on a Win 7/64 Pro w/ATI Firepro 4800 board. BOINC 7.0.28 (x64) And have you told the affected project about this? As Richard indicates, it's quite possible that the project's science application is not 'listening' to what BOINC is telling it to do and is ignoring the suspend request. Then it's up to the project to fix that; what else is BOINC to do about it when the science application ignores its commands? |
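To illustrate the point about an application not 'listening', here is a toy simulation of a cooperative suspend. It is not BOINC code and does not use the BOINC API; `ControlChannel` and `run_work_unit` are hypothetical names, and the whole thing assumes only what is described above: the client raises a request, and the application stops only if it actually checks for it.

```python
from dataclasses import dataclass


@dataclass
class ControlChannel:
    """Hypothetical stand-in for however the client signals the app."""
    suspend_requested: bool = False


def run_work_unit(total_steps: int, channel: ControlChannel,
                  polls_channel: bool, suspend_at_step: int) -> int:
    """Run up to total_steps of work and return how many were completed.

    The simulated client raises the suspend request once suspend_at_step
    steps have been done. Only an app that polls the channel notices it.
    """
    done = 0
    while done < total_steps:
        if done == suspend_at_step:
            channel.suspend_requested = True  # client asks the app to pause
        if polls_channel and channel.suspend_requested:
            break  # well-behaved app stops between steps
        done += 1  # one unit of work
    return done


if __name__ == "__main__":
    print(run_work_unit(10, ControlChannel(), polls_channel=True, suspend_at_step=3))
    print(run_work_unit(10, ControlChannel(), polls_channel=False, suspend_at_step=3))
```

In both runs the request is raised, so the client would report the task as suspended, but the app that never polls carries on to the end of its work. That matches the symptom reported in this thread.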
Joined: 22 Dec 10 Posts: 14 |
Thank you for the suggestion. As a longtime designer and coder, I would not have thought that the BOINC executive application was not engineered to be in a "command and control" position in relation to the running project code. I would have thought there to be a BOINC library module built into every project, that would allocate and grant continued access to resources, and that would report project task requests back to the executive application. If the exec app does not receive X such requests from a task per unit time, then the exec app would report the problem to a log file, to the GUI and perhaps even back to BOINC web central. I recall that method as being a simple way to complete the feedback loop on the goodness of task execution. It's possible that my machine's running - and not suspending - an IBM World Community Grid 'Help Conquer Cancer' task should have been information included in my posting, in that it is other than the Einstein or SETI project tasks that were reported earlier in the thread. Once a symptom appears across a variety of tasks, it is usually but not always the case that the problem is systemic. And the variety of GPU models would lead one to reduce in rank GPU driver or architecture as a possible causative source. Would you still recommend that this be reported to the project owners? Win 10-64 Pro on: Dual Xeon Quad E5472s 3.0GHz w/128GB DDR2 Main Memory + ATI FirePro W5000 GPU; Quad+HT i7-860 2.8GHz 8GB + ATI FirePro V4800 GPU; Quad Q6600-775 2.4GHz 4GB + ATI FirePro V4800 GPU; + 3 laptops |
Joined: 6 Jul 10 Posts: 585 |
Can't remember having seen zombie processes being reported at the WCG forums in recent weeks/months [yes, report it at WCG with as much detail as you can collect]. Internally, science apps have a kill-self type of switch. If the run time is e.g. 10x greater than the original estimate, the app is supposed to abort. It is given these parameters at start, just as it is told at start how frequently, at most, it's allowed to write a progress backup to disk, per user setting. Also, if the core client does not hear from the science app for longer than 30 seconds, it's supposed to restart the process. I think it does that on the basis of the PID. If it does that 100x for a science task, the job is killed. The symptom logged is a "zero status... restart client... if this happens often..." series of messages. Coelum Non Animum Mutant, Qui Trans Mare Currunt |
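The rules described in the post above can be sketched as a small decision function. This reflects the poster's description only (10x the estimate, 30 seconds of silence, 100 restarts), not the actual BOINC source; `TaskState` and `watchdog_decision` are hypothetical names. In the description, the run-time limit is enforced by the science app itself and the restart logic by the core client; the sketch folds both into one function for brevity.

```python
from dataclasses import dataclass

# Thresholds as stated in the post; treat them as the poster's recollection.
RUNTIME_FACTOR = 10        # abort when run time exceeds 10x the estimate
HEARTBEAT_TIMEOUT_S = 30   # restart when the app is silent for over 30 s
MAX_RESTARTS = 100         # kill the job after this many restarts


@dataclass
class TaskState:
    estimated_runtime_s: float
    elapsed_s: float
    seconds_since_last_message: float
    restarts: int


def watchdog_decision(task: TaskState) -> str:
    """Return 'abort', 'kill', 'restart' or 'continue' for one task."""
    if task.elapsed_s > RUNTIME_FACTOR * task.estimated_runtime_s:
        return "abort"      # the app's own kill-self switch
    if task.seconds_since_last_message > HEARTBEAT_TIMEOUT_S:
        if task.restarts >= MAX_RESTARTS:
            return "kill"   # restarted too often, give up on the job
        return "restart"    # the client has not heard from the app
    return "continue"
```

Note that none of these rules covers the case in this thread, where a task keeps reporting progress while it is supposed to be suspended.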
Joined: 11 Dec 12 Posts: 3 |
Sorry for taking so long to respond. I thought I had set it up to email me when there was activity on this thread... I'm on Windows 7, using BOINC version 7.0.28 x64. I'm using a hyperthreaded quad core, if that's helpful (yay 8 vCPU's!). I double checked the preferences and they are not set to use the GPU while the computer is in use (it does sometimes stop the GPU properly). The last task to do this (just a few minutes ago), was SETI@home Beta, the application is SETI@Home v7 6.99 (opencl_ati_sah). I've seen it happen with PrimeGrid and Einstein@home as well. I could record and post the specific application names next time it happens if that would be helpful. As others have suggested, it seems like (from the outside) this is a BOINC client issue since it's occurring with multiple projects and applications. Either that or there are just a number of projects that are missing something when they're writing their code in regards to properly suspending the GPU tasks. |
Joined: 29 Aug 05 Posts: 15637 |
Do you run more than one task at the same time on the GPU? Some people run 2 to 6 of them, and then it's possible that the 2nd to 6th application does not get the 'suspend' message from BOINC. In that case it's an application bug, where the app ignores what the BOINC client tells it to do, although no application builder builds his apps specifically to be run in multiples on a GPU. |
Joined: 11 Dec 12 Posts: 3 |
I have one of those laptops where there's a lower powered Intel GPU as well as a better (but more power hungry) ATI GPU. I'm not sure how I could tell it to run more than one application on the GPU, or how to check if it's allowed to (I don't see any options in the preferences about that). I've never seen more than one application running with the GPU at once. And the client log only says that it recognizes the ATI GPU. |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.