SMP FAH on VMware

Where are you at in the process?

Unhappy_mage, in general, has good documentation, but school is beating him down at the moment.
 
I think I got it to update F@H, but if I reboot the VM I get this screen. I still haven't even got to the screen to enter a team or user name yet.

Any idea what to do?

denied1.png
 
Log out and log in as:

login: root
Password: Fold@on

Linux is very finicky about permissions.
 
Hit Alt-F1 to get to the login screen.
Login: root
Password: Fold@on
Then type cd /home/fold to get back to the folding folder.
Then type ./fah5 to start it.
Config your client and you're good to go.

Just re-installed it, so it's easy the second time around.

Luck ............. :D
 
Tigerbiten said:
Hit Alt-F1 to get to the login screen.
Login: root
Password: Fold@on
Then type cd /home/fold to get back to the folding folder.
Now download the new client and install it.
Then type ./fah5 to start it.
Config your client and you're good to go.

Just re-installed it, so it's easy the second time around.

Luck ............. :D

Nice, thanks. I got it running somehow; not sure how, but I think Hito Bahadur gave me some clues :)

How would I check on PPD in the VM?

Here is what it looks like now:

http://x010.uploaderx.net/x/starting2.jpg
 
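In a bit it should start posting 1% done, 2% done, etc. Just take the time interval between those in minutes, multiply by 100 (minutes per WU), divide by 60 and then by 24 (days per WU), take 1/x (WUs per day), and multiply by 587 (points for this WU). That should give you your PPD.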
 
Time how long it takes to do a few frames.
Then from that work out your frames per hour.
Then if you've got a 587-pointer, it's 141 x FpH = PpD (587 points x 24 hours / 100 frames is about 141).

Luck ........... :D
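
For anyone who would rather not do the arithmetic by hand, here is a minimal shell sketch of the calculation described above. The function name ppd is made up for illustration; it assumes a work unit is 100 frames (one per 1% of progress) and that awk is available in the guest.

```shell
# ppd MINUTES_PER_FRAME POINTS_PER_WU
# Hypothetical helper: estimates points per day from the time one frame takes.
ppd() {
    awk -v m="$1" -v p="$2" 'BEGIN {
        days_per_wu = m * 100 / 60 / 24    # 100 frames per WU, minutes -> days
        printf "%.0f\n", p / days_per_wu   # WUs per day times points per WU
    }'
}

ppd 7 587    # 7 minutes per frame on a 587-point WU, prints 1208
```

The 7 minutes per frame is only an example input. The shortcut of 141 x frames per hour gives nearly the same answer, since 587 x 24 / 100 is about 141.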
 
I think the clock or something might be wrong in the VM. If I reboot it, it gives me an option to hit the Y key to do a file integrity check, but the seconds count down really slowly.

Also, the first 2 steps in FAH timed at about 2:17 according to the VM, but it was longer in real time.

Any ideas on this?
 
Here is a screenshot. In Task Manager, take about 1 minute, 20 seconds off for the time the VM took to load, but look at how long the VM says FAH took. Weird, huh?

Is there a way I can delete the work and start it over without starting a new VM?

I may have checked yes on the advanced option to allow endless work units or whatever. Would that make a difference?

timeoff1.jpg
 
No, it wouldn't. Let it run for a few frames. It might have been short because it was a partial time (there are waypoints). It should normalize on the next frame.
 
Hito Bahadur said:
Had already done that and moved the .vmx to the folder that was created as a result. Error was after that event.

edit: nm, I'm a tard. I had unpacked the 7z program and thought for some reason that was all I had to do.

I did the same thing at first :D I figured the 7z program was an auto-extractor
 
Here's one thing to watch out for: when you configure the SMP FAH client the first time, make sure you allow for big work packets. I didn't, and I couldn't upload the first workunit I finished because the packet size was too large. It's a known issue on the Stanford forums. The fix involves downloading something called qfix and running that, but since I'm such a Linux noob, I really couldn't figure out how to even do that. So I just deleted the whole thing and started over; this time I configured it correctly.

If any of you didn't enable bigpackets when you first started the SMP FAH, I'm not saying that you're automatically not going to be able to upload your results, but it's something to be aware of.

meisterbrau

edit - snake, I'm getting the exact same thing with my system clock. Got real excited at first when the first frame finished in about two minutes (or so I thought). :eek:
 
Well, I got everything loaded up finally. Taking around 7 min per %.

Due to VMware only utilizing ~80% of CPU cycles, I added another virtual machine and it seems to be cooperating. 100% CPU utilization now :cool: CPU usage alternates, but on average it is distributed about 50/50.
 
Hito Bahadur said:
Where are you at in the process?

Unhappy_mage, in general, has good documentation, but school is beating him down at the moment.
QFT :p

Can someone try a simple fix for me? Edit the .vmx file you've got, and on the last line change:
Code:
from:  tools.syncTime = "FALSE"
to:    tools.syncTime = "TRUE"
and see if you still get incorrect frame times, "lost ticks" messages, etc. If it's that simple I'll feel stupid...

whrswaldo: Running 2 VMs on a single machine will slow them down and may ultimately hurt your PPD. They're probably pounding memory hard and that's what keeps the CPU usage from hitting full blast. The SMP client on real machines only uses ~80%, too.
 
unhappy_mage said:
QFT :p

Can someone try a simple fix for me? Edit the .vmx file you've got, and on the last line change:
Code:
from:  tools.syncTime = "FALSE"
to:    tools.syncTime = "TRUE"
and see if you still get incorrect frame times, "lost ticks" messages, etc. If it's that simple I'll feel stupid....

Can you explain where and how to edit this? I will try if I know what to do.
 
the snake said:
Can you explain where and how to edit this? I will try if I know what to do.
Thanks for volunteering!

Power off the VM and close the VMware console. Right-click on "Red Hat Enterprise Linux 3 64-bit.vmx" (if you've got file extensions hidden, it's the 1 KB one), select Open With... and pick Notepad. Then change FALSE to TRUE on the last line as mentioned and save the file. Re-open the VM, power it back on, and see if it works better.

Another way to check would be to run the program "date" on the command line, write down the wall-clock time and the results of "date", come back in an hour and run "date" again and compare wall-clock time to the difference in "date". They should be the same to within a few seconds.
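
The comparison can be reduced to one number. A rough sketch, assuming the guest's date accepts +%s (seconds since 1970) and that you read the wall-clock times off a watch or the host; drift_percent is a made-up helper name:

```shell
# drift_percent GUEST_START GUEST_END WALL_START WALL_END   (all in seconds)
# Hypothetical helper: how far the guest clock gained (+) or lost (-)
# relative to real time, as a percentage of the real elapsed time.
drift_percent() {
    awk -v gs="$1" -v ge="$2" -v ws="$3" -v we="$4" 'BEGIN {
        guest = ge - gs
        wall  = we - ws
        printf "%.1f\n", (guest - wall) * 100 / wall
    }'
}

# In the VM: run "date +%s" now and again about an hour later, noting the
# real time on a watch both times. Made-up numbers for illustration:
drift_percent 0 2880 0 3600    # guest counted 48 min in a real hour, prints -20.0
```

A result near 0 means the guest clock is keeping up; a large negative number means the VM is losing time.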
 
unhappy_mage said:
VMware Server only goes to 2 processors, I think you need GSX (which is like $10k!) to get 4 processor virtualization.

Rats! Just when I was getting excited about the prospect of keeping my K8N-DL/opty 265 oc'd in XP, and running an SMP client in a VM. :eek: Oh well, guess I'll just continue folding SMP in ubuntu "live"; a little slower than an hdd install, but it works. Every attempt I've made (with an hdd install) to install the AMD64 SMP kernel ends up in a "kernel panic". :(
 
Well, you could run 2 instances of the VM. That would use all 4 cores.
 
Tried changing from "tools.syncTime = FALSE" to "tools.syncTime = TRUE".
Both on an existing install and on a new one.
Made no difference to the FAH timer.

If you need to shut the VM down, take a snapshot of it.
When you restart the VM, load the snapshot and it picks up straight where it left off.
No need to load the FAH client.
You also skip any permissions errors.

Luck ............ :D
 
Tigerbiten said:
Tried changing from "tools.syncTime = FALSE" to "tools.syncTime = TRUE".
Both on an existing install and on a new one.
Made no difference to the FAH timer.


I made this edit and it seemed to do the trick for me. I can't tell if the timer is exactly right - I'll try to check tonight, but the timer is now reporting a frame every 8 or 9 minutes (it varies) which looks like it matches up with real time. The timer was showing 1 or 2 minutes per frame before.
 
unhappy_mage said:
QFT :p

Can someone try a simple fix for me? Edit the .vmx file you've got, and on the last line change:
Code:
from:  tools.syncTime = "FALSE"
to:    tools.syncTime = "TRUE"
and see if you still get incorrect frame times, "lost ticks" messages, etc. If it's that simple I'll feel stupid...

whrswaldo: Running 2 VMs on a single machine will slow them down and may ultimately hurt your PPD. They're probably pounding memory hard and that's what keeps the CPU usage from hitting full blast. The SMP client on real machines only uses ~80%, too.

I tried the time config edit, and as near as I can tell time progress matches up, although it is still not synced to my system time.

I turned in a WU this morning, so I guess everything is working.

Ughh now I get to lock myself in the library until tomorrow, I hate tests.
 
What the hell are you guys doing to this poor thing? It's a virtual machine, "Just Works" TMM (trade mark mage) should be the case.

These things are easy!!!
 
unhappy_mage said:
Thanks for volunteering!

Power off the VM and close the VMware console. Right-click on "Red Hat Enterprise Linux 3 64-bit.vmx" (if you've got file extensions hidden, it's the 1 KB one), select Open With... and pick Notepad. Then change FALSE to TRUE on the last line as mentioned and save the file. Re-open the VM, power it back on, and see if it works better.

Another way to check would be to run the program "date" on the command line, write down the wall-clock time and the results of "date", come back in an hour and run "date" again and compare wall-clock time to the difference in "date". They should be the same to within a few seconds.

I tried editing that in Notepad; it didn't change anything in the FAH time. I even started a totally new VM with it edited, but it still gives me the wrong time. I can live with that, I guess.

I did finish a #3020 work unit this morning, but it seems it will not send the results. When I restarted FAH in the VM it said "length of work/wuresults _01.dat exceeds packet limit set (5241856)" or something like that.

Does anyone know how to force it to upload?

How can I stop FAH and then restart it?
 
marty9876 said:
What the hell are you guys doing to this poor thing? It's a virtual machine, "Just Works" TMM (trade mark mage) should be the case.

These things are easy!!!

Yeah, I'm OK with the clock. I made the change as mage recommended, and it's a lot better, but it still seems to lag real time. Maybe by about 20%: I compared the clocks before I went to work this morning and when I just got back. No biggie.
 
the snake said:
I tried editing that in Notepad; it didn't change anything in the FAH time. I even started a totally new VM with it edited, but it still gives me the wrong time. I can live with that, I guess.

I did finish a #3020 work unit this morning, but it seems it will not send the results. When I restarted FAH in the VM it said "length of work/wuresults _01.dat exceeds packet limit set (5241856)" or something like that.

Does anyone know how to force it to upload?

How can I stop FAH and then restart it?
To allow the WU to upload, you'll need to turn on "Big Packets" so the results can be uploaded back to Stanford. To restart FAH, hit alt-F1, log in as "root" with password "Fold@on", and run "service folding restart". I don't know if it'll allow the upload or remember that you had bigpackets off when the WU finished. Try it out, and if no luck I'll look into qfix.
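
To see whether a finished result would trip that limit before the client tries to send it, a size check along these lines might help. This is only a sketch: exceeds_limit is a made-up helper, and the 5241856 figure is simply the number from the error message quoted above.

```shell
# exceeds_limit FILE LIMIT_BYTES
# Hypothetical helper: succeeds (exit 0) if FILE is larger than LIMIT_BYTES.
exceeds_limit() {
    size=$(wc -c < "$1")
    [ "$size" -gt "$2" ]
}

# Demonstration on a scratch file rather than a real work unit result.
scratch=$(mktemp)
printf 'twelve bytes' > "$scratch"
if exceeds_limit "$scratch" 10; then
    echo "too big for a 10-byte limit"
fi
rm -f "$scratch"
```

On the VM you would point it at the results file named in the error message and pass 5241856 as the limit.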
 
Worked out how to get it to auto-start after a reboot.

Now I need to solve the permission error I get when trying to start FAH.
When it first starts up I get "/home/fold/startfah: line 3: ./fah5: Permission denied".
If I Alt-F1, log in with root & Fold@on, cd /home/fold, ./fah5 then it starts.
The line looks like .... [root@localhost fold]# ./fah5
Anything else I try like "service folding start" will give me "/home/fold/startfah: line 3: ./fah5: Permission denied" again.

Also.
When running this with the GPU client how badly is either one slowed down ?
This gives me ~600 PpD but not running anything else.
The GPU solo would give me ~800 PpD but I could add a CPU client for an extra 120-250 PpD.

Luck .............. :D
 
Can someone please help with this darn error message. The VMware installed fine on one of my dedicated folding machines but on my main rig it gives this damn error message, what's up with this?

 
chemist_slime said:
Can someone please help with this darn error message. The VMware installed fine on one of my dedicated folding machines but on my main rig it gives this damn error message, what's up with this?

Looks like a networking problem.
Look here
Hope that gives you a clue to what's going wrong.

If not look up "error 10013" in the VMTN Knowledge Base.

Luck .............. :D
 
When you start it make sure you tell it to connect to the local client, not a networked one - I think that's what is causing your problem. If you want to connect to one of the dedicated boxen to run off that VMware, it may be a firewall issue on the other machine. I'd just run local though.

the black knight always triumphs!
 
Tigerbiten said:
Now I need to solve the permission error I get when trying to start FAH.
When it first starts up I get "/home/fold/startfah: line 3: ./fah5: Permission denied".
If I Alt-F1, log in with root & Fold@on, cd /home/fold, ./fah5 then it starts.
The line looks like .... [root@localhost fold]# ./fah5
Anything else I try like "service folding start" will give me "/home/fold/startfah: line 3: ./fah5: Permission denied" again.
Can you show me the output of "cd /home/fold; ls -l startfah fah5"? It should give a bunch of letters at the beginning, like "rwxr-xr--" or something, and then the owner of the file ("root" or "fold"), and then the group of the file.

My guess is running "chmod a+x fah5; chown fold:fold fah5" will fix things. Chmod means change mode; it'll make permissions for All to eXecute. Chown means change owner; it'll make fold the owner and fold the group on the fah5 file. I should probably make that part of the startup script, to prevent further confusion.
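
To see what chmod a+x actually changes without touching the real client, here is a small sketch on a scratch file. mode_of is a made-up helper that just prints the first ten characters of ls -l, and the scratch directory stands in for /home/fold.

```shell
# mode_of FILE: print the ten-character mode field that "ls -l" shows first
mode_of() {
    ls -l "$1" | cut -c1-10
}

demo=$(mktemp -d)          # scratch directory standing in for /home/fold
: > "$demo/fah5"           # empty placeholder for the client binary
chmod 700 "$demo/fah5"     # owner-only permissions
mode_of "$demo/fah5"       # prints -rwx------
chmod a+x "$demo/fah5"     # add execute for owner, group and everyone else
mode_of "$demo/fah5"       # prints -rwx--x--x
rm -rf "$demo"
```

If the file is owned by root and has the first mode, the fold user cannot run it, which would explain the "Permission denied" message. chown needs root, so it is left out of the sketch.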
 
unhappy_mage said:
Can you show me the output of "cd /home/fold; ls -l startfah fah5"? It should give a bunch of letters at the beginning, like "rwxr-xr--" or something, and then the owner of the file ("root" or "fold"), and then the group of the file.

My guess is running "chmod a+x fah5; chown fold:fold fah5" will fix things. Chmod means change mode; it'll make permissions for All to eXecute. Chown means change owner; it'll make fold the owner and fold the group on the fah5 file. I should probably make that part of the startup script, to prevent further confusion.

I had to do a chmod on the executable, as i downloaded the new client as root.

Everything working great now.
 
It looks like .........

[root@localhost fold]# cd /home/fold; ls -l startfah fah5
-rwx-------- 1 8372 rpm 249684 nov 27 13:36 fah5
-rwxr-xr-x 1 fold fold 37 nov 24 15:59 startfah

So I did the "chmod a+x fah5; chown fold:fold fah5"
And now it looks like .....................

[root@localhost fold]# cd /home/fold; ls -l startfah fah5
-rwx--x--x 1 8372 rpm 249684 nov 27 13:36 fah5
-rwxr-xr-x 1 fold fold 37 nov 24 15:59 startfah

It starts BUT goes straight into the config section, which just loops.
./fah5 starts it correctly after logging in with root etc, etc.
"service folding start" goes into a config loop.

Luck ............ :D
 
Okay. My guess is the client.cfg file is also owned by root, and thus "fold" doesn't have permission to access it, so it goes off looping. Try "chown -R fold:fold ~fold". ~username means the home directory of username, so that'll make fold own everything again. Then it should work under "service mode".
 
"chown -R fold:fold ~fold" = :D

That's the line that's needed for my folders to work.
Next time they are shut down I will give them a good 5-10 mins before monitoring and see if they have started running in the background.
And if so, at what speed.

Now it's a matter of seeing what will happen when a GPU client is added to this box.
Getting new Vid cards on monday.

Thanks U-M.
Luck ............... :D
 
Hey UM, guess you got chowned again, huh...

(Had to say it...)

chemist_slime: Has VMware started working for you yet? If it hasn't I'll grab it and see if I can replicate that specific error. I am lacking both in SMP and 64-bit so that's as far as I can go, but I think I know where your problem is coming from.

the black knight always triumphs!
 
Okay, I am trying this once more. This time I allowed large work units so it can upload when it is done; hopefully that works.

Now about the CPU time usage: I did a fresh VM install about 5 hours ago and just checked the CPU time in Task Manager. It says 8 hours. What's that all about?

This is on my 170 opty @ 2.52.

5hours.png
 