Problems sending -bigadv WUs

APOLLO

I have problems sending -bigadv WUs from one of my clients installed in an EVGA VM. The client attempts to upload results and, as expected, my connection is tied up for a long time, but the WU never actually uploads successfully. While this is happening, memory usage is pegged at full, and even restarting the VM won't reduce it one iota. Only stopping the VM releases the memory it is using. :confused:

The client will also delete the WU and remove it from the queue if the process is taking an undue length of time. I have lost a couple of -bigadv WUs to this problem in the past week. It seems something else might be the problem especially regarding the memory usage issue but cannot find exactly what. Could this be a VMware specific issue or is the client itself the problem? I also have tons of connectivity issues with this VM and deleting it for a fresh install does not help either. :confused:

Any ideas would be appreciated.
 
Wow... with so many parts...

Rebuild (re-download) the OS inside the VM
VMWare could need a reinstall
Windows hates you, and you need to show it who is boss (reinstall or something else I'm not sure/aware of)
 
OK, it appears I received the credit for the last WU today, so it uploaded fine despite never getting a confirmation message in the log. Very, very weird. Unfortunately, I am still not able to DL any new work in that VM and it is idle. Since I have no incomplete WU in this VM, I might just delete it and reinstall - again. I don't think that will solve the problem, though. I did that earlier this week and it didn't work. Everything was working fine for nearly two months before this problem appeared... :confused: :(
 
Now that my first -bigadv WU has 2% left to finish, I'm concerned. I just hope Stanford hurries up and releases -bigadv for the Windows SMP client, since I get nervous when I have to use an "emulator" type of program to get a program working. Emulators are never perfect, especially for people with bad computer karma like me.
 
Don't be. This is a weird thing. I've never had it happen in all my time running -bigadv
 
I've been running several -bigadv clients since October. This problem only occurred at the beginning of the week. You will be OK.

Can the VM still get on the Internet, like ping any addresses?
The VM has connected to Stanford. It's working on a -bigadv WU now but I bet it will have the exact same issue again once it completes. If you have any suggestions to test it I'm willing to try them.

I'm wondering what could be using the memory once the client completes the WU? All the memory allocated to the VM is in use even when the client attempts to upload results. I think this is directly related to the problem I'm having sending completed work, despite the apparent separateness of both issues.
 
Does the VM use the allocated memory all the time or just when folding? I just started it the other day, but didn't even look when I got it set up.

I have a VM appliance at work that does spam filtering and I want to say that it always uses the amount of memory allocated to it. I'll have to double check on Monday though.
 
Does the VM use the allocated memory all the time or just when folding?
The last time I had the problem it was continuously used. Get this, even when I restarted the VM it was still in use!! Only when I shut it down did it release its hold on the allocated memory. :eek: :confused:
 
Weird for sure.

Although it was done about 30 minutes ago, mine still says attempting to send. I'm tempted to shut it down and restart it to force it to send, but I'm not sure if it's doing any other clean-up type stuff. The connection is fast, 2.5Mb, so it shouldn't take long to send.

Funny, now it actually says couldn't send http request to server. Could not connect to work server (results). Apparently I might be having issues also...
 
Keep us posted on this issue. I would give it at least 30 minutes to attempt an upload. Even though you have a faster connection than I do, you never know Stanford might be experiencing slowdowns with their network.
 
I'm uploading 2 bigadvs right now :). So far I have connected to the server, and one leg has stated that it had posted data. Oh woot! VM1 just finished uploading and has downloaded a 2682. Cable is nice.
 
I restarted the VM...had the same issue as you did though. Doing a restart didn't free up the memory. Shutdown the VM and the memory was freed up.

Still haven't uploaded the work unit either. I'm trying to do it right now with the -send all command. It seems to be sending now, although previously in the attempts it would start out fine, then just stop transmitting. I changed the gateway to use the DSL line instead, so I'm not sure if it was that or it just decided to work.

I'm just running the Windows SMP client for now, until I can get this one sent.
 
Holy smoke, and I thought that mine was an isolated case. With both incidents, I was forced to use the -send all flag as well, but the first time it didn't seem to work as far as I could tell. I gave up and deleted the VM. Second incident it was fortunately successful.

I don't know what to make of this problem. I can't really blame VMware or the VM image. They worked fine for a long while and the problem remained even when I loaded a new VM. It must either be the client or a new series of WUs that are corrupt in some fashion. It's the WU that consumes the memory, not the VM, so something is causing the WU to keep a grip on the memory and this is creating upload problems. :confused: :mad:
 
It actually was able to send and I got points for the work unit. This morning I'm going to try running another one, but this time using the DSL circuit, to see if that makes a difference. The strange thing is that when I tried the send all on the other connection, it actually started out fine. It capped the connection at something like 350KB/sec for about 30 seconds or so, then dropped to basically nothing.

I think it might have been just coincidence that when I switched connections, it worked. I ran send all probably about 5 times on the ethernet line before switching it over to DSL.

I'll give an update when the unit is complete in about 3-3.5 days.
 
VMWare memory usage is tricky. If you assign it x GB of RAM, it will take ALL x GB of RAM. The memory is only given back to Windows when you shut down the VMWare instance that is running (either via player, server, or workstation). This is the nature of the beast.

This is why you never assign a VM more memory than you physically have on a box. We know not to do that, but others may not. You can only assign as much PHYSICAL RAM as you have available, keeping in mind that you need to save some for Windows.

Something to try if you are having send (or download issues for that matter) is to disconnect the network adapter (via the VMWare Control Panel) and then connect it.

If your home network has hiccups (which unfortunately mine has been having of late due to a possibly bad router), disconnecting and reconnecting will cause the VM to pick up / refresh its IP address and possibly rectify the issue.

You do not notice that it has a network issue until you are trying to do an upload/download because it doesn't need it until then. I believe (Linux gurus can jump in here cause I may be talking out of my a$$) there is a way in Linux to keep your network connection refreshed. I'm pretty sure I've seen it, but don't know where to begin looking. I would look for you, but I'm back in MSP and don't have a Linux VM running locally that I can take a look at and am about to jump into a 2 hour staff meeting.
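In the meantime, a rough way to notice a dead connection before an upload is due is to test it from inside the VM every so often. The sketch below is only an illustration, not anything the folding client does itself: `can_connect` is a made-up helper name, and the host and port you pass in are up to you (the work server address from your own client log would be the obvious choice).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout.

    Hypothetical helper for illustration only. It checks that the VM can
    reach the address at all; it does not speak the folding protocol and
    says nothing about whether a full upload would succeed.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts and unreachable networks
        return False


# Example use (address and port are placeholders, fill in your own):
# if not can_connect("example.invalid", 8080):
#     print("No route to the server, try reconnecting the VM's adapter")
```

Run from cron or a simple loop, something like this would at least tell you whether the adapter needs the disconnect/reconnect treatment before the WU finishes.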
 
My issue is a new problem that manifested last week. When I start my VM, the full amount of the memory that I have configured for it is not in use according to Task Manager. It steadily increases after I start the client. That is how it has always worked on my systems. Allocation of memory and memory use are not displayed the same in Task Manager. My VM in the other system is not having any problems, and it has been in operation longer with the exact same system specs except for processors. Both systems are running identical motherboards and RAM amounts.
 
I'm not sure if it's 100% a network issue within the VM. There is some kind of issue (I think), but it's not present all the time.

If the networking portion was disconnected the whole time, while the VM was working on the WU, you would notice it with Fahmon, HFM.net, etc.
 
As an update, the second one just finished sending without issue. I was up to 7GB used total between Windows and the VM, which is set for 5GB. It took about 17 minutes to complete everything, including the upload, after the WU was complete.

I just started another one, after restarting my pc to install some Windows updates.

I have the default gateway set again to the DSL circuit. Hopefully it was just the internet connection that was the issue, the first time.
 
Well, I continue to have problems. Last time was an hour ago. I had to stop the client, shut down/restart the VM and use the -send all flag. My total system memory is 6GB with 5400MB allocated to the VM. The WU uploaded, but I don't see why it's behaving this way, and why I need to use the -send all flag when I didn't have to do this before. My other nearly identically configured system is doing fine. :confused: :confused: :confused:
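For reference, the arithmetic on those numbers can be sketched in a few lines. This is only a back-of-the-envelope check and may have nothing to do with the upload problem; the function names and the 1024 MB reserve are assumptions picked for illustration, not a VMware guideline.

```python
def host_headroom_mb(host_ram_gb: float, vm_alloc_mb: int) -> float:
    """Memory left for the host OS once the VM holds its full allocation."""
    return host_ram_gb * 1024 - vm_alloc_mb


def allocation_is_tight(host_ram_gb: float, vm_alloc_mb: int,
                        host_reserve_mb: int = 1024) -> bool:
    """True if the host is left with less than host_reserve_mb.

    host_reserve_mb is an arbitrary rule-of-thumb value chosen for this
    sketch, not a documented requirement.
    """
    return host_headroom_mb(host_ram_gb, vm_alloc_mb) < host_reserve_mb


# 6 GB host with 5400 MB given to the VM: 6144 - 5400 = 744 MB left
# host_headroom_mb(6, 5400)      -> 744
# allocation_is_tight(6, 5400)   -> True with the assumed 1024 MB reserve
```

So once the VM has claimed everything it was given, Windows and anything else on the box share roughly 744 MB.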
 
Strange for sure.

Can you actually tell if it was using your connection or not?

What is troubling is this: when I had the issue sending, it would actually start out and send just fine for about 15-30 seconds or so. I restarted the client multiple times and it would do that every single time. It would start out normally, then just stop sending data. It sort of points to some kind of internet issue, like packet loss or something like that. Funny thing, I haven't had any issues on the A3 clients that are using that other connection (ethernet over T1).
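If anyone wants to pin down exactly when the transfer dies, the "starts fine, then stops" pattern can be picked out of a series of byte-counter readings. The sketch below assumes you already have (time, cumulative bytes sent) samples from whatever network monitor you use; `find_stall` and its thresholds are invented for this example.

```python
from typing import List, Optional, Tuple

# (seconds since start, cumulative bytes sent) as read from any monitor
Sample = Tuple[float, int]


def find_stall(samples: List[Sample],
               min_rate: float = 1024.0,
               min_duration: float = 10.0) -> Optional[float]:
    """Return the time the first stall begins, or None if there is none.

    A stall here means a run of consecutive sampling intervals, at least
    min_duration seconds long in total, where throughput stays below
    min_rate bytes/sec. Both thresholds are arbitrary example values.
    """
    stall_start: Optional[float] = None
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        if t1 <= t0:
            continue  # ignore duplicate or out-of-order timestamps
        rate = (b1 - b0) / (t1 - t0)
        if rate < min_rate:
            if stall_start is None:
                stall_start = t0  # first slow interval of this run
            if t1 - stall_start >= min_duration:
                return stall_start
        else:
            stall_start = None  # traffic resumed, reset the run
    return None
```

Fed with readings taken every 5 seconds from an upload that runs at roughly 350KB/sec for 30 seconds and then flatlines, this would report the stall as starting at the 30 second mark.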
 
Can you actually tell if it was using your connection or not?
If it's the connection, then it's only affecting this system. I can try installing a network card in there and see if it makes any difference.

What is troubling is this: when I had the issue sending, it would actually start out and send just fine for about 15-30 seconds or so. I restarted the client multiple times and it would do that every single time. It would start out normally, then just stop sending data. It sort of points to some kind of internet issue, like packet loss or something like that. Funny thing, I haven't had any issues on the A3 clients that are using that other connection (ethernet over T1).
I have two GPU clients in Win 7 on the same system that aren't having any problems at all. So, if it's a connectivity issue then it's either Linux related or VM related, or a combination of both. I do receive a packet message from the client when it's ready to upload telling me that the packet size will be reduced. Maybe this is related to the issue? Unfortunately, the log doesn't record it and I can't post the message.

Besides changing my network interface, I was thinking of installing another Linux distro with the possibility it could somehow be an issue with the EVGA VM. If everything else fails to fix the problem, I'll set up another VM with Ubuntu.
 
It would seem that it's not a connection issue, at least not on your actual internet connection. If it were, you'd have issues with everything, not just with this one VM being able to send back data.

I definitely didn't have that packet size message in my log. It just had the normal connecting to http://171.67.108.22:8080 message.

Do you recall if it had that reducing packet size message the other time it had an issue sending?
 
Yes, on every occasion I had this upload problem that packet message comes up. Because it's not recorded in the logs for some reason, I cannot go back to a time before this problem surfaced to determine if that message was also coming up in the past.
 
Have you tried reinstalling the VMWare player itself? Since the problem shouldn't persist across the fresh images, maybe somehow player is having an issue.
 
Could very well be. This is one of the options I'm evaluating right now. I just might do it in the next day or two, maybe after this WU uploads at the start of the weekend.
 
I'd probably do the reinstall first, since it's fairly quick and painless to do, before trying the network card. The downside is that you wouldn't find out for a few days whether it worked or not.

It's definitely a problem that's hard to troubleshoot, with all the parts where something can go wrong.
 