4P won't upload work units.

Kardonxt

This problem started when I made an image of the drive to keep as a backup. Not sure if it's related or just a coincidence.

The client pretty much just keeps doing this. Any ideas? Internet and downloading work units both work fine. I already deleted the work folder once to try to resolve the issue.

[01:34:19] Completed 115000 out of 250000 steps (46%)
[01:43:48] Project: 6901 (Run 23, Clone 3, Gen 153)


[01:43:48] + Attempting to send results [March 5 01:43:48 UTC]
[02:06:02] - Couldn't send HTTP request to server
[02:06:02] + Could not connect to Work Server (results)
[02:06:02] (130.237.232.237:8080)
[02:06:02] + Retrying using alternative port
[02:06:06] Completed 117500 out of 250000 steps (47%)
[02:31:02] - Couldn't send HTTP request to server
[02:31:02] + Could not connect to Work Server (results)
[02:31:02] (130.237.232.237:80)
[02:31:02] - Error: Could not transmit unit 01 (completed March 4) to work server.
[02:31:02] Keeping unit 01 in queue.
[02:36:38] Completed 120000 out of 250000 steps (48%)
 
Are you using Langouste? What do its logs say? (you can find them in /tmp/langouste-$USER)

Can you check the output of 'df -h' to see whether you're running out of disk space anywhere?
 
Linux noob here. I'm not even sure what Langouste is, but I don't have a folder for it in temp, so I'm gonna guess no lol.

Here is the output from df -h

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   282G  5.1G  263G   2%    /
none        16G   220K  16G    1%    /dev
none        16G   176K  16G    1%    /dev/shm
none        16G   352K  16G    1%    /var/run
none        16G   0     16G    0%    /var/lock

Not sure if it helps, but here is the log from when it finished the WU and tried to upload.

[00:11:59] DynamicWrapper: Finished Work Unit: sleep=10000
[00:12:09]
[00:12:09] Finished Work Unit:
[00:12:09] - Reading up to 52713120 from "work/wudata_01.trr": Read 52713120
[00:12:09] trr file hash check passed.
[00:12:09] - Reading up to 47066080 from "work/wudata_01.xtc": Read 47066080
[00:12:10] xtc file hash check passed.
[00:12:10] edr file hash check passed.
[00:12:10] logfile size: 201224
[00:12:10] Leaving Run
[00:12:12] - Writing 100150372 bytes of core data to disk...
[00:12:13] ... Done.
[00:12:32] - Shutting down core
[00:12:32]
[00:12:32] Folding@home Core Shutdown: FINISHED_UNIT
[00:12:34] CoreStatus = 64 (100)
[00:12:34] Sending work to server
[00:12:34] Project: 6901 (Run 23, Clone 3, Gen 153)


[00:12:34] + Attempting to send results [March 4 00:12:34 UTC]
[00:29:33] - Couldn't send HTTP request to server
[00:29:33] + Could not connect to Work Server (results)
[00:29:33] (130.237.232.237:8080)
[00:29:33] + Retrying using alternative port
[00:47:36] - Couldn't send HTTP request to server
[00:47:36] + Could not connect to Work Server (results)
[00:47:36] (130.237.232.237:80)
[00:47:36] - Error: Could not transmit unit 01 (completed March 4) to work server.
[00:47:36] Keeping unit 01 in queue.
[00:47:36] Project: 6901 (Run 23, Clone 3, Gen 153)


[00:47:36] + Attempting to send results [March 4 00:47:36 UTC]
[01:06:04] - Couldn't send HTTP request to server
[01:06:04] + Could not connect to Work Server (results)
[01:06:04] (130.237.232.237:8080)
[01:06:04] + Retrying using alternative port
[01:24:33] - Couldn't send HTTP request to server
[01:24:33] + Could not connect to Work Server (results)
[01:24:33] (130.237.232.237:80)
[01:24:33] - Error: Could not transmit unit 01 (completed March 4) to work server.
[01:24:33] Keeping unit 01 in queue.
[01:24:33] - Preparing to get new work unit...
[01:24:33] Cleaning up work directory
[01:24:33] + Attempting to get work packet
[01:24:33] Passkey found
[01:24:33] - Connecting to assignment server
[01:24:34] - Successful: assigned to (130.237.232.237).
[01:24:34] + News From Folding@Home: Welcome to Folding@Home
[01:24:34] Loaded queue successfully.
[01:26:45] Project: 6901 (Run 23, Clone 3, Gen 153)


[01:26:45] + Attempting to send results [March 4 01:26:45 UTC]
[01:31:42] - Couldn't send HTTP request to server
[01:31:42] + Could not connect to Work Server (results)
[01:31:42] (130.237.232.237:8080)
[01:31:42] + Retrying using alternative port
[01:52:07] - Couldn't send HTTP request to server
[01:52:07] + Could not connect to Work Server (results)
[01:52:07] (130.237.232.237:80)
[01:52:07] - Error: Could not transmit unit 01 (completed March 4) to work server.
[01:52:07] Keeping unit 01 in queue.
[01:52:07] + Closed connections
[01:52:07]
[01:52:07] + Processing work unit
[01:52:07] Core required: FahCore_a5.exe
[01:52:07] Core found.
[01:52:07] Working on queue slot 02 [March 4 01:52:07 UTC]
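For a long log like the one above, it can help to tally how often each address and port failed. The following is a minimal Python sketch that assumes the log has exactly the line shapes pasted in this thread; the function name `summarize_upload_failures` is made up for illustration and is not part of the Folding@home client.

```python
import re

# "[HH:MM:SS] message" as it appears in the pasted client log.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
# "(a.b.c.d:port)", the line the client prints after a failed connection.
ENDPOINT_RE = re.compile(r"^\((\d{1,3}(?:\.\d{1,3}){3}):(\d+)\)$")


def summarize_upload_failures(log_text):
    """Return ({(ip, port): failed_connects}, failed_transmit_count)."""
    endpoints = {}
    transmit_errors = 0
    expecting_endpoint = False
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        message = match.group(2).strip()
        if "Could not connect to Work Server" in message:
            # The next line should name the address and port that failed.
            expecting_endpoint = True
            continue
        endpoint = ENDPOINT_RE.match(message) if expecting_endpoint else None
        expecting_endpoint = False
        if endpoint:
            key = (endpoint.group(1), int(endpoint.group(2)))
            endpoints[key] = endpoints.get(key, 0) + 1
        elif "Could not transmit unit" in message:
            transmit_errors += 1
    return endpoints, transmit_errors
```

To use it, read the client log into a string (the file name and location depend on your install) and pass it in, e.g. `summarize_upload_failures(open("FAHlog.txt").read())`, where `FAHlog.txt` is an assumed file name.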
 
Have you been able to return BA units from this machine before?

Is it a GUI installation?
If so, can you try opening http://130.237.232.237 and http://130.237.232.237:8080 in a web browser?
It should report "OK".

If not, try "downloading" them with wget, e.g. wget http://130.237.232.237:8080
and watch for any errors or timeouts. If connections succeed, you will see something
like this:
Code:
--2012-03-05 09:04:36--  http://130.237.232.237:8080/
Connecting to 130.237.232.237:8080... connected.
HTTP request sent, awaiting response... 200 HTTP_OK
Length: unspecified
Saving to: “index.html”

    [ <=>                                   ] 0           --.-K/s   in 0s      

2012-03-05 09:04:37 (0.00 B/s) - “index.html” saved [0]
(note 200 HTTP_OK).
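If neither a browser nor wget is handy, roughly the same reachability check can be scripted. This is a minimal Python sketch; `can_connect` and `report` are hypothetical helper names, and the address and ports are simply the ones from the client log. It only tests whether the TCP port opens, so a pass here does not prove that a large result upload would complete.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def report(host, ports, timeout=10.0):
    """Build one human-readable status line per port."""
    lines = []
    for port in ports:
        ok = can_connect(host, port, timeout)
        status = "reachable" if ok else "NOT reachable"
        lines.append(f"{host}:{port} is {status}")
    return lines
```

On the folding rig you would run something like `print("\n".join(report("130.237.232.237", (8080, 80))))` and watch for any port reported as not reachable.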
 
That WU server might be down now. That does happen from time to time. Give it a bit and see if it can transmit the WU later.
 
Both of those opened, and I tried pinging them just to be safe; that worked too.

This build has uploaded before. It stopped uploading work units a few weeks ago, right after I made an image of it, and rather than sorting it out I just deleted the work unit folder and queue. Then I shut the machine down while I waited for my new mobo to get here.

I'm pretty sure it's an issue on my end that I could solve with a reload but I just didn't want to lose my current work units.
 
If both URLs worked (from the folding rig, ofc.) it may have been an intermittent issue, per blown 402 c.i. ...
Let us know if you see it again.
 
I did plug the backup image drive into another machine, because my imaging software told me I would need to repair something after the image, which I was able to do with some online tutorials. I deleted the WU folder and queue.dat and downloaded a work unit just to make sure it worked (never finished the WU or tried to turn it in; I just wanted to make sure the image was ready to rock).

Is it possible that having two Linux machines with identical info on the same network messed something up? I know Windows machines will bitch about two machines with the same name being on a network, but they go back to working when you shut one down.

I don't think it's an intermittent issue, because I had the same problem before I shut the machine down weeks ago; rather than solving it, I just turned it off until my new parts got here. I'll let it finish up the 6903, I guess, and if it still doesn't work I'll just have to reload it. The machine's been trying to upload that unit for over a day now.
 
If both Linux machines have the same IP address and are on the same network/subnetwork, that can cause problems.
 
That machine was only up for like an hour and isn't up now, so unless Linux makes some weird changes when it detects two identical machine names on the network, I don't think that's the problem. IPs are set to DHCP, so they shouldn't have had the same IPs at any point. I wish I knew more about Linux tho, I could be way wrong.

I have a feeling I'm gonna have to flush 300,000 points down the drain to fix this lol QQ.
 
Do you have another machine running Linux that can send WUs? If so, I would try backing up the entire F@H folder, moving it to that machine, and then sending from there. If not, save the backup somewhere else and reinstall Linux. Then just copy the folder back.
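The backup step suggested above could be scripted roughly as follows. This is a minimal Python sketch, not an official procedure: `backup_client_dir` is a hypothetical helper, both paths are placeholders you would substitute, and it is probably safest to stop the client before copying so the work files are not changing mid-copy.

```python
import shutil
import time
from pathlib import Path


def backup_client_dir(client_dir, backup_root):
    """Copy the whole client folder into a new timestamped folder.

    Returns the path of the copy. The original folder is left untouched.
    """
    src = Path(client_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree creates dest (and missing parents) and fails if dest exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

For example, `backup_client_dir("/home/me/fah", "/mnt/usb")` would copy the folder to the USB drive, where both paths are made-up examples.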
 
Good plan, chelsea. I backed up the fah directory and will see if I can get it to upload after the reload.
 
Unfortunately it still didn't work with my old fah directory, even after a reload. Not sure what was up, but it's a fresh system now, so if I still have problems I'll post back.

Thanks for the input guys.
 
My laptop does this regularly, but they go through in the end. I did start a thread about it a few weeks ago.

Try ./fah -send all

See if that jollies them along :)
 
Tried it before I flushed the points and it still wouldn't go through QQ. Thanks for the advice tho.
 
Well, that was a waste. Finished a 6903 and it won't upload again. I think it may be the Internet at work. I will take the rig home and give it another shot. Would be a shame to lose out on free electricity tho.
 
I had uploaded work units here in the past. It's a really small business and we run a residential modem and gateway that I set up, so I didn't think it would be the problem lol.

But our internet connection does get flaky sometimes, so it could just be an issue with the cheap DSL.
 
Internet was the problem. Work unit uploaded no problem from home. Thanks for all the help, guys.
 