GPU queues just dumped. We're back to folding

relic

Looks like someone braved the hangover and went in to reboot the servers. ;)
 
Yeah, Vijay Pande himself checked them and kicked them back into gear ;) We should applaud him for coming in to check on a Sunday :rolleyes:

 
Yep.. mine are starting to upload as well. Still some hanging but better.

I also read they're going to start benchmarking longer WUs tomorrow (8-24 hrs.), which should help resolve some of the ongoing load issues on the CS.

 
BTW, to avoid similar issues in the future, they will be beta testing larger WUs on Monday. By larger, they mean a WU which can be crunched in 8-24 hours (basically multiplying the current WU crop by 3-4x).

The worst thing is that I've been telling them since the beginning to make larger WUs. The 98-pointers were just dumb, and multiplying by 5 helped, but that's not enough...

 
Yeah, Vijay Pande himself checked them and kicked them back into gear ;) We should applaud him for coming in to check on a Sunday :rolleyes:


Applause…..humm applause……isn’t applause something like clapping?

Oh, I’m in, I vote we all get VJ the Clap;):rolleyes:
 
That would be great if they doubled the size of the WUs. 960 points at a time.

Though it might be a problem for the slower GPUs? Like an 8400, etc.

But if they doubled it, my PC would only be connecting about 5 times a day instead of 10x currently (10 x 480 = 4800 PPD). That simple change could cut the server traffic in half!!
 
That would be great if they doubled the size of the WUs. 960 points at a time.

Though it might be a problem for the slower GPUs? Like an 8400, etc.

But if they doubled it, my PC would only be connecting about 5 times a day instead of 10x currently (10 x 480 = 4800 PPD). That simple change could cut the server traffic in half!!

And gain 100-200 ppd in the process since we waste less time uploading/downloading ;)
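
For anyone who wants to sanity-check the numbers quoted above, here is a rough sketch in Python. It assumes daily points stay constant at the 4800 PPD figure from the post and that each WU costs one fetch/return cycle with the servers. The function name is made up for illustration, and the sketch ignores the extra 100-200 ppd from reduced transfer time.

```python
def work_units_per_day(points_per_day: float, points_per_wu: float) -> float:
    """Work units (and so fetch/return cycles) one client completes per day.

    Assumption: daily output in points is fixed, so bigger WUs just mean
    fewer of them. This is an illustration, not how the client schedules work.
    """
    if points_per_wu <= 0:
        raise ValueError("points_per_wu must be positive")
    return points_per_day / points_per_wu


PPD = 4800  # figure quoted in the post (10 x 480)

for wu_points in (480, 960):
    cycles = work_units_per_day(PPD, wu_points)
    print(f"{wu_points}-point WUs: {cycles:g} server round trips per day")
```

Under those assumptions, going from 480-point to 960-point WUs halves the number of times a single client has to talk to the servers each day.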

 
Applause…..humm applause……isn’t applause something like clapping?

Oh, I’m in, I vote we all get VJ the Clap;):rolleyes:

he he he... :D Mr relic, you are one "bad" dude :eek: (oh, I use the word "bad" as closely resembling "good" and as a compliment ;))

What are you tryin' to do give VJ some kinda' complex or something ? (only kiddin') :p

FOLD ON!
 
I'm still experiencing problems with some of my clients. Has everyone got all their queues uploaded?

 
Spoke too soon. My queues emptied while the current WU was crunching, but now:

[18:45:57] + Attempting to get work packet
[18:45:57] - Connecting to assignment server
[18:45:57] - Successful: assigned to (171.64.122.74).
[18:45:57] + News From Folding@Home: GPU folding beta
[18:45:58] Loaded queue successfully.
[18:45:58] - Couldn't send HTTP request to server
[18:45:58] (Got status 503)
[18:45:58] + Could not connect to Work Server
[18:45:58] - Attempt #11 to get work failed, and no other work to do.

Back to family time I guess. ;)
 
Yeah, apparently there's a 4-page thread at the FCF about the problem. Seems the servers overloaded so they added another one and it still wasn't up to the task... :eek:
 
I'm having issues. Back to the CPU till things are fixed...

Silly GPU client with its big numbers. My A64 4000+ is doing 145ppd on a SimT Gromacs with a point value of 15 points. Better than letting my GPU sit there doing nothing!

 
I picked one heck of a week to go on vacation. :rolleyes:
There's not much anyone can do ATM. We just have to wait and see how Stanford resolves these issues. There's a large backup of queues and many clients attempting to access the servers at once.
 
I've managed to keep one of 2 GPUs folding. When one gets done, the other gets a WU.
 
I always manage to keep at least 1 GPU2 client fed out of 3 (it seems the GPUs take turns being unable to download a work unit).

 
It appears to be at least somewhat functional right now. I just picked up a work unit and we're going again for now.
 
Mine finally submitted all finalized work, but spent quite a bit of today without work. By the looks of it, my GPU client spent 3 or 4 hours today doing nada.

 
Looks like everyone felt it in the stats too.

I got spared the horror since only 10% of my PPD comes from GPUs at the moment lol
 
My queues were filling up again back at home. I think I had 4-5 in queue this morning.

What's the problem here?
 
Vijay found a few issues with the collection server. One of them is that the error 503 timeout is too generous (the timeout delay is probably too long), so all the slots fill up too fast. They upped the number of slots from 300 to 1500 in the hope of improving the situation. It's not 100% fixed, though, and today they will do more tweaks along with testing longer work units.

 
I have some results.dat files in a couple of my GPU clients and FahSpy is reporting those clients in the 'ready to send' status, but whenever the clients upload the results they report as having no more unsent WUs. Should I delete those files or are there actual results not sent to the servers yet? Is this issue finally resolved?

 
I have some results.dat files in a couple of my GPU clients and FahSpy is reporting those clients in the 'ready to send' status, but whenever the clients upload the results they report as having sent all the unsent WUs. Should I delete those files or are there actual results not sent to the servers yet? Is this issue finally resolved?


Not yet. Just leave the files there in case. If they really aren't needed, they will be deleted automatically when the client gets back to that queue slot.

 
Anyone still having issues? I took a look at my GPU log this morning and it spent an awful amount of time trying to upload AND retrieve work. It spent probably half the night last night idle due to a lack of work.

 
Not me, all is working fine since yesterday afternoon.

 
I still get occasional slowdowns due to inability to upload. It generally fixes itself by attempt #3... Why do I get the feeling they never anticipated such a large increase in Nvidia clients?

 
I wonder if they believed the adoption rate was going to be similar to the ATI client's? It may be me, but it seems as if the nVidia client is getting much more exposure than the ATI client. Increase exposure and you increase adoption. Given the number of nVidia GPUs out there, that can be a significant boost to the cause. Of course, if you don't increase the number of servers feeding all of these clients, then you're bound to have issues. As others have said, the 98-point WUs from the early days nearly killed them. I don't believe the ~480 points of the current iteration of WUs is large enough. Even 8800GTs are chewing through 10-12 per day (very rough estimate assuming 2 or so hours per WU). Multiply that by the number of 8800GTs that even our team is using, and that's a significant hit to the servers. Obviously newer and faster GPUs are pushing even MORE WUs.

Getting more servers is not easy. It is even harder when you're a not-for-profit organization on a shoestring budget. If only we could get corporations to donate server cycles to help feed our community.

Time will tell how successful this new client is going to be. My biggest concern is the number of people who will be turned away due to lack of work or, worse, inability to turn in work that is already complete. I know that many have been frustrated over the weekend. This was obviously not the first weekend that we have had similar issues, and unfortunately I suspect it will not be the last.
 
I get drops & spikes. Normally I put out at least 1 WU per update. The past few days, I'll go like 2-3 updates with 0 points, then all of a sudden 2-3 WUs hit at once & I shoot up 1440 points.
 
They need to get some bigger WUs out the door; that would alleviate their current problems. It's still pretty clear to me that they have larger issues to deal with, but putting out bigger WUs would be a good band-aid for now.
 
I had issues for a few minutes getting work this morning, but that's the only hiccup I've had.
 
None of my GPU-2 boxen were working this morning. They were unable to upload their work and unable to download any new work.

Anyone else have this problem? :confused:
 
I was ok this morning. I have several that were holding WUs to be turned in... but none had stopped as of 7am CT.

I'm hoping I don't come home to all 11 gpus sitting idle... :(

 
No problems for any of my gpu2 clients the past 24 hours, at least as far as I can tell.


 