Thread 'Anything and Everything to do with (WCG) World Community Grid'

Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116745 - Posted: 29 Aug 2025, 12:03:02 UTC

WCG Operational Status update: https://www.cs.toronto.edu/~juris/jlab/wcg.html (click the "Operational Status" heading) - August 27

August 27, 2025
MAM1 7.07 updates:
The addition of spdlog as a dependency to replace the previous debug level printouts with more useful output for those who like to look at stderr.txt before it gets cleaned up by the BOINC client.
Some kludgey math, flags, and options are now set in the application's main function to try to stop the dependency chain (Ensmallen, which depends on Armadillo, which depends on OpenMP/OpenBLAS) from creating nested threads. The nested threads were causing concurrently running tasks to be suspended under the BOINC client and using more CPU than the plan class and --nthreads parameter dictated. Essentially, bad behaviour. Thanks for posting feedback for the first few batches released. I will endeavour to address any further feedback if MDMG/MAM1 continues to over-schedule w.r.t. the plan class or otherwise behaves badly.
Added two built-in, configurable options for adjusting the learning rate when using the LibTorch backend, which were observed to improve the model's average loss progression during cross-validation in fewer epochs. They correspond to:
https://docs.pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html
https://docs.pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
Other fixes, and additional work on features that will be released in later versions after the migration.
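
For readers unfamiliar with the two schedulers linked in the update, here is a minimal pure-Python sketch of the ideas behind them. It is not MAM1 or LibTorch code: the names cosine_annealing_lr and PlateauReducer and all default values are assumptions made up for illustration. The first function is the closed-form cosine decay from a starting rate down to a minimum; the class cuts the rate by a factor once the loss has failed to improve for more than a set number of epochs.

```python
import math


def cosine_annealing_lr(base_lr, step, t_max, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at step 0, eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / t_max))


class PlateauReducer:
    """Hypothetical helper: shrink the learning rate when the loss stops improving."""

    def __init__(self, lr, factor=0.5, patience=2, min_lr=0.0):
        self.lr = lr
        self.factor = factor      # multiplier applied on each reduction
        self.patience = patience  # non-improving epochs tolerated before reducing
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Record one epoch's loss and return the (possibly reduced) learning rate."""
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            self.lr = max(self.lr * self.factor, self.min_lr)
            self.bad_epochs = 0
        return self.lr
```

In this sketch the cosine schedule needs only the step count, while the plateau approach reacts to the observed validation loss, which is presumably why both are offered as options.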
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116748 - Posted: 29 Aug 2025, 22:34:10 UTC

New update: https://www.cs.toronto.edu/~juris/jlab/wcg.html (click operational status heading) - August 29.
Also pushed to the BOINC client.

August 29, 2025
Full migration of WCG from the Graham to Nibi cloud facilities will be completed between 3:00 and 5:00 p.m. on August 31st, 2025.
Sharcnet will then power down all hardware at Graham.
We have put in a ticket with UHN Digital to move our DNS records to the new IP addresses we have been allocated in Nibi cloud, and all storage, networking, and compute resources are already provisioned at Nibi.
We continue testing QA and Prod on the new infrastructure.
We will experience some downtime as *.worldcommunitygrid.org URLs switch over. We will be bringing down workunit creation scripting, BOINC server components, and upload/download servers in sequence, halting the database, performing a final rsync and then bringing down the website, forums, and internal services over the next 48h.
In the best case, our DNS records will be switched over on the 31st and everything behind the load balancer will be up and running. However, we want to prepare users for the possibility of additional downtime as we stand up prod on Nibi.
unixchick
Joined: 28 Mar 18
Posts: 141
United States
Message 116751 - Posted: 30 Aug 2025, 17:45:17 UTC - in response to Message 116748.  

Thanks for posting this link. I had forgotten about this thread. I'm here. As an obsessive poster this will be hard on me :-)
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116752 - Posted: 30 Aug 2025, 18:28:22 UTC - in response to Message 116751.  

In reply to unixchick's message of 30 Aug 2025:
Thanks for posting this link. I had forgotten about this thread. I'm here. As an obsessive poster this will be hard on me :-)
I'm sure we will get through this migration too. Uploading of tasks is still working, but reporting and requesting new work are not. The WCG forum and the rest of the site are also still online.
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1362
United Kingdom
Message 116753 - Posted: 30 Aug 2025, 22:03:41 UTC - in response to Message 116752.  

...and now all down, website is giving a 503.
Let's hope this move is a smooth and rapid one.
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116754 - Posted: 31 Aug 2025, 1:24:20 UTC

Yes indeed. Everything WCG is down and out. With 152 tasks left to crunch in the cache, I believe my slow old laptop (Core(TM) i7-3630QM CPU @ 2.40GHz) will still have work left when WCG is back, even if it takes a day or so longer than planned for WCG to come back online.
unixchick
Joined: 28 Mar 18
Posts: 141
United States
Message 116756 - Posted: 31 Aug 2025, 19:03:34 UTC

I think it is 3pm Toronto time... hopefully it will come back soon in 2 hours, but some of that depends on DNS tables
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116757 - Posted: 31 Aug 2025, 19:16:18 UTC - in response to Message 116756.  

In reply to unixchick's message of 31 Aug 2025:
I think it is 3pm Toronto time... hopefully it will come back soon in 2 hours, but some of that depends on DNS tables
Yeah, let's hope that everything works as they have planned it. I still have 100 tasks left to crunch, 34 tasks ready to report (uploaded before the outage), and 60 tasks waiting to upload and report.
unixchick
Joined: 28 Mar 18
Posts: 141
United States
Message 116758 - Posted: 31 Aug 2025, 20:21:05 UTC

We now get this message on the WCG web page

"503: WCG Migration to Nibi Cloud in Progress
See the "Operational Status" tab of the Jurisica Lab WCG pages for details: https://www.cs.toronto.edu/~juris/jlab/wcg.html "
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116759 - Posted: 31 Aug 2025, 21:26:04 UTC - in response to Message 116758.  
Last modified: 31 Aug 2025, 21:26:48 UTC

In reply to unixchick's message of 31 Aug 2025:
We now get this message on the WCG web page

"503: WCG Migration to Nibi Cloud in Progress
See the "Operational Status" tab of the Jurisica Lab WCG pages for details: https://www.cs.toronto.edu/~juris/jlab/wcg.html "
Thank you, unixchick. I just checked, and indeed, that's the message now on the WCG pages. It is still served from the old Graham address, though (gra-cloud118.graham.sharcnet.ca, 199.241.167.118).

So, Graham is still active, and not powered down.
unixchick
Joined: 28 Mar 18
Posts: 141
United States
Message 116760 - Posted: 1 Sep 2025, 5:37:04 UTC

Can't reach the web page now. I get a timeout error.
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1362
United Kingdom
Message 116761 - Posted: 1 Sep 2025, 5:39:45 UTC - in response to Message 116759.  

Another change.....
The forum & home pages are now blank, with no error messages (402/503 types), so are we in the middle of the DNS change over?
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116762 - Posted: 1 Sep 2025, 9:33:25 UTC - in response to Message 116761.  
Last modified: 1 Sep 2025, 9:34:36 UTC

In reply to robsmith's message of 1 Sep 2025:
Another change.....
The forum & home pages are now blank, with no error messages (402/503 types), so are we in the middle of the DNS change over?
Yes, the new IP for WCG seems to be 199.241.161.110, instead of the old 199.241.167.118. However, the Nibi cloud, the server, and/or the DNS change probably isn't fully ready yet. I get "ERR_CONNECTION_TIMED_OUT" now.
Dr Who Fan
Joined: 10 May 07
Posts: 1603
United States
Message 116763 - Posted: 1 Sep 2025, 17:27:21 UTC - in response to Message 116762.  

It's midday in Toronto and still no communication on the Krembil/WCG status site.

I get the same DNS pointers to the new IP address but no website.

Will just have to wait for a sign of intelligence from Krembil.
robsmith
Volunteer tester
Help desk expert
Joined: 25 May 09
Posts: 1362
United Kingdom
Message 116764 - Posted: 1 Sep 2025, 17:39:34 UTC - in response to Message 116763.  

The sign of intelligence we need is not from Krembil, but from the new cloud host (and this sort of thing is all too common when moving from one cloud environment to another....).
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116765 - Posted: 1 Sep 2025, 17:41:23 UTC
Last modified: 1 Sep 2025, 17:46:54 UTC

No signs of life when it comes to BOINC's connection to WCG either. Yeah well, the waiting game continues....

Btw: the "new" cloud host is the same as the old one. Still Sharcnet (UHN Digital). It's just a new cloud environment within the same host.
Grumpy Swede
Joined: 30 Mar 20
Posts: 516
Sweden
Message 116767 - Posted: 1 Sep 2025, 23:03:06 UTC

Well, I guess this migration didn't go as planned. Still no contact with the WCG website, or through BOINC.
It seems as if I really needed the upped cache.
unixchick
Joined: 28 Mar 18
Posts: 141
United States
Message 116769 - Posted: 2 Sep 2025, 13:48:48 UTC

Nothing yet. Can we tell if it is a DNS thing or a machine thing?

If it is a DNS thing then it is just time. If it is a machine thing, then I wish we had info.
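
One rough way to tell the two cases apart, as a sketch only: resolve the name first, then try a TCP connection to the address it resolves to. The helper name diagnose, its return labels, and the injectable resolve/connect parameters are assumptions made up for this illustration, and a "machine" result can also mean a firewall or network path problem rather than the server itself.

```python
import socket


def diagnose(host, port=443, timeout=5.0,
             resolve=socket.gethostbyname,
             connect=socket.create_connection):
    """Return 'dns', 'machine', or 'ok' for a host (hypothetical helper)."""
    try:
        ip = resolve(host)  # step 1: does the name resolve at all?
    except socket.gaierror:
        return "dns"
    try:
        conn = connect((ip, port), timeout=timeout)  # step 2: does anything answer?
    except OSError:
        return "machine"
    conn.close()
    return "ok"
```

Calling diagnose("www.worldcommunitygrid.org") would run the check against the live site. A name that resolves but then times out on connect points away from DNS. The sketch does not look at DNS caching or propagation, so different resolvers may still disagree for a while.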
Hans Sveen
Joined: 3 Nov 20
Posts: 5
Norway
Message 116770 - Posted: 2 Sep 2025, 14:52:07 UTC
Last modified: 2 Sep 2025, 15:46:07 UTC

Hi!
Seems like a DNS problem!??

https://dnschecker.org/all-dns-records-of-domain.php?query=www.worldcommunitygrid.org%2F&rtype=ALL&dns=google

Hans S.

Ps.
The last bullet point of the latest update (Aug. 29) says:

In the best case, our DNS records will be switched over on the 31st and everything behind the load balancer will be up and running. However, we want to prepare users for the possibility of additional downtime as we stand up prod on Nibi.
[CSF] Aleksey Belkov
Joined: 3 Mar 23
Posts: 17
Russia
Message 116772 - Posted: 2 Sep 2025, 15:40:19 UTC - in response to Message 116770.  
Last modified: 2 Sep 2025, 15:43:51 UTC

In reply to Hans Sveen's message of 2 Sep 2025:
Hi!
Seems like a DNS problem!??
Hans S.

No, there's no DNS problem with the domain zone (worldcommunitygrid.org) or with the particular website (www.worldcommunitygrid.org).

Obviously, the problem is at the application level.

P.S.
If you want to check some resource for "DNS" problems, then you should use more "tech-savvy" tools, like Hardenize:
https://www.hardenize.com/report/worldcommunitygrid.org/
or Zonemaster:
https://zonemaster.net/en/result/55ccbde7e145a1fb

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.