Questions about the EVGA VM

APOLLO

How many people have used the EVGA VM for processing regular SMP units? I tried this VM on several systems and all but one showed less performance than my regular distro. In some cases, the performance drop is huge. Also, the amount of memory consumed by the OS seems to be more than my regular distro even though there's no GUI. I find both of these observations very odd. Anyone else noticed the same thing?
 
I'm interested as well. I tried it out for a few minutes and it didn't work out for me, but I didn't have a lot of time to play around with it so I'd like to see if anyone has any suggestions on how to configure it properly.
 
Yeah, I haven't done any proper benchmarks, but I estimate it takes about 60% longer per frame. It's a lot more detailed, but I think I would rather sacrifice that for much faster production.
 
I'm interested as well. I tried it out for a few minutes and it didn't work out for me, but I didn't have a lot of time to play around with it so I'd like to see if anyone has any suggestions on how to configure it properly.
The VM seems to get updated quite regularly. Maybe the newest version will work better.

Yeah, I haven't done any proper benchmarks, but I estimate it takes about 60% longer per frame. It's a lot more detailed, but I think I would rather sacrifice that for much faster production.
I would too, and I'm wondering whether it would be better to run a lighter distro of one's choice, even for -bigadv units, instead of the EVGA VM. I'm quite astonished at some of the poor performance results I was seeing with standard WUs on this custom VM. :confused:
 
I ran a few 1920 pt WUs with the 0.4 version and was hoping for better. My ppd averaged 8600-8800 depending on the WU, using -smp 8, 5200 MB RAM for the VM and both GPUs running.
 
I would too, and I'm wondering whether it would be better to run a lighter distro of one's choice, even for -bigadv units, instead of the EVGA VM. I'm quite astonished at some of the poor performance results I was seeing with standard WUs on this custom VM.
To be honest, I think NotFred's would be perfect if it had even a CLI. I think the ability to specify flags, settings, etc. would be really useful. It's really interesting, though, since the EVGA VM doesn't seem to be that much heavier, and yet performance is so much worse. Is this a kernel thing?
 
Is this a kernel thing?
I wish I knew enough about Linux to answer without a doubt, but I'm certain that kernel version didn't seem to make much of a difference with standard SMP and 4 threads, from what I saw. I don't know if it makes a difference with 8 threads. I wish Alan was here to answer at least a few of the Linux questions that seem to be popping up lately. Alas... :(

Regarding -bigadv, kernel version has been reported by a lot of people on the FF to make a significant difference, because there's less efficient scaling beyond 4 threads with specific kernel versions. IIRC, we're talking several minutes TPF difference in the -bigadv units between less and more efficient kernels. I haven't checked to see which kernel version the EVGA VM is using. I don't even know which Linux distro it's based on, but I'm sure all the information is on the EVGA forum.
 
I don't even know which Linux distro it's based on, but I'm sure all the information is on the EVGA forum.
It's likely built from scratch. Small, minimal, Linux images like this usually are. His kernel is definitely optimized for bigadv processing. Apollo, I am curious to know what CPU architectures you are trying this on with regular SMP units?

Edit: The kernel he uses in v0.7 is 2.6.32.2 which should be fine. I'm not sure about the earlier versions.
 
Apollo, I am curious to know what CPU architectures you are trying this on with regular SMP units?
Dual-core Opterons and dual-core Xeons (Woodcrest) both in dual socket systems (4 cores total each).
 
Dual-core Opterons and dual-core Xeons (Woodcrest) both in dual socket systems (4 cores total each).
I would definitely try v0.7 which has the latest kernel. The kernel, at least on the Intel side, was compiled specifically for Core2 and newer Xeons (this implies all processors after Core2's). Looking at the kernel .config, there are some other performance adjustments made but nothing that should hinder performance on dual cores. I know nothing about his AMD kernel .config, however. I presume you are running the FAH client with "-smp 4" and not just a blank "-smp" to let the client determine the number of threads?
 
I would definitely try v0.7 which has the latest kernel. The kernel, at least on the Intel side, was compiled specifically for Core2 and newer Xeons (this implies all processors after Core2's). Looking at the kernel .config, there are some other performance adjustments made but nothing that should hinder performance on dual cores.
I tried v0.7 as well as the v0.06 prior to it. The latest build is still in the developmental phase from what I read. In any case, I also tried the earlier version on one of my octal-core systems (Clovertown) and it also showed relatively poor performance. I don't know, maybe my regular distro is very, very light even with a GUI.

I know nothing about his AMD kernel .config, however. I presume you are running the FAH client with "-smp 4" and not just a blank "-smp" to let the client determine the number of threads?
I didn't try the -smp 4 flag on any system before since the client automatically launches 4 threads with the -smp flag. Should I add the 4 as well?
 
In any case, I also tried the earlier version on one of my octal-core systems (Clovertown) and it also showed relatively poor performance.
You did select the -amd specific kernel when you configured the image, correct?

I don't know, maybe my regular distro is very, very light even with a GUI.
Possibly but unlikely. 90% of your Linux performance is kernel based and how it was compiled and configured. linuxrouter's image should be very minimal. What is your regular distro?

I didn't try the -smp 4 flag on any system before since the client automatically launches 4 threads with the -smp flag. Should I add the 4 as well?
Naww, if it is launching four threads, it is working as if you explicitly stated "-smp 4". I always explicitly state values like this, but I am anal and like to be as explicit as possible. :rolleyes:

In any event, I've initiated a discussion with linuxrouter about this and am awaiting his reply. I was already talking to him about a couple of other things anyway.
 
Apollo, I heard back from linuxrouter.. first my message to him, followed by his response:

Hey man. Would there be anything in your kernel config that would cause poorer performance, compared to NotFred or a native Ubuntu install, when running as "-smp 4" on quad core systems (or 2xDual Core servers) with regular A2 WU's? A friend of mine on my team is reporting definite slow downs when he has tried your image.

Thanks for your time.

I have not had the chance yet to test out the VM image on an older quad system since my main Windows system is an i7. I did do a comparison recently with Ubuntu 9.10 and Notfred using the i7 and -smp 8. The frame times were close. These were the frame times that I saw for project 2662:

My image (2.6.32-2) - 2:58-2:59
Ubuntu (2.6.31-16-generic) - 2:59-3:01
Notfred (2.6.30.1) - 3:00-3:01

I have been running a similar kernel on a 3.0 GHz Q6600 (-smp 4) natively. With project 2667, this system has been getting TPF of 5:11 although I do not have anything to compare it to at the moment.
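To put the frame times quoted above on a common scale, here is a minimal Python sketch that converts each reported range to seconds and expresses it relative to linuxrouter's image. The function names and the choice of the range midpoint as the representative value are assumptions made for illustration; they are not part of anyone's actual benchmarking method.

```python
def tpf_seconds(tpf: str) -> int:
    """Convert an 'M:SS' time-per-frame string to whole seconds."""
    minutes, seconds = tpf.strip().split(":")
    return int(minutes) * 60 + int(seconds)


def midpoint(tpf_range: str) -> float:
    """Midpoint, in seconds, of a range such as '2:58-2:59'."""
    low, high = (tpf_seconds(part) for part in tpf_range.split("-"))
    return (low + high) / 2


def relative_to(baseline: str, times: dict) -> dict:
    """Percent difference of each midpoint against the baseline entry."""
    base = midpoint(times[baseline])
    return {name: (midpoint(r) / base - 1) * 100 for name, r in times.items()}


# Frame times as reported in the thread for project 2662 (i7, -smp 8).
frame_times = {
    "linuxrouter image": "2:58-2:59",
    "Ubuntu 9.10": "2:59-3:01",
    "Notfred": "3:00 - 3:01",
}

if __name__ == "__main__":
    diffs = relative_to("linuxrouter image", frame_times)
    for name, pct in diffs.items():
        secs = midpoint(frame_times[name])
        print(f"{name}: {secs:.1f} s/frame ({pct:+.2f}%)")
```

On these numbers the three setups sit within roughly one percent of each other, which matches the "frame times were close" remark. With ranges this narrow, run-to-run variation could easily account for the gap.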

One thing I did notice in Virtualbox was that the framebuffer which allows for higher resolution in the console was slowing things down a bit. When I booted with vga set to normal, folding performance was better for some reason. This change did not seem to make as much of a difference in VMware, but it might be worth a try. This would be the command to enter on boot:

2.6.32.2-intel vga=normal

I will look over the kernel config to see if there might be something that would slow down the older quad cores and dual socket systems.
 
You did select the -amd specific kernel when you configured the image, correct?
I believe so, but I will double-check for my Opteron system. However, even if I did choose the wrong kernel for that system, it still wouldn't explain why nearly all my Xeon systems also saw a decrease in performance. :confused:

Possibly but unlikely. 90% of your Linux performance is kernel based and how it was compiled and configured. linuxrouter's image should be very minimal. What is your regular distro?
This one here: http://distrowatch.com/table.php?distribution=fluxbuntu

In any event, I've initiated a discussion with linuxrouter about this and am awaiting his reply. I was already talking to him about a couple of other things anyway.
Thank you for raising my issues directly with linuxrouter. I will try his recommendations. One thing I want to mention again is that the performance inefficiency compared to my regular distro was also observed on two of my octal-core Clovertown systems, so it doesn't seem to be confined to dual-core processors. This is again with standard SMP WUs but running the -smp 8 flag. There aren't sufficient resources on these systems to run -bigadv, unfortunately.
 
One thing I want to mention again is that the performance inefficiency compared to my regular distro was also observed on two of my octal-core Clovertown systems, so it doesn't seem to be confined to dual-core processors.
Yeah, I probably should have mentioned that. The two kernels are totally different, figured I'd hit him with the Intel problem first. :rolleyes:
 
Apollo, I heard back from linuxrouter.. first my message to him, followed by his response:
Thanks for your effort and the communique from LR. I tried his suggestion of restricting the resolution but, like he mentioned, it didn't seem to make much of a difference in VMware. However, I like it better because it makes the VM window much smaller, and I thank him for the suggestion. Also, he tested an Ubuntu distro built upon a Linux kernel that might be poorer in standard SMP performance. See this page for additional info: http://en.fah-addict.net/articles/articles-1-23+comparison-between-linux-kernels.php
 
Apollo, here is another message from linuxrouter, definitely give this a try:

I built a couple older kernels to see if these might help with -smp 4 performance. It can be installed via v0.6-0.7 using this command in the console:

upkernel http://www.linuxforge.net/fah/img/kernels-vm.tar.bz2

It includes 2.6.28 and 2.6.31. I disabled the SMT scheduler for processors that do not support HT and enabled NUMA and ScaleMP support. These also have the 100 Hz timer freq and no-preempt like the 2.6.32 build. The kernel can be selected on boot.
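For anyone who wants to confirm which of these settings a given kernel was actually built with, here is a rough Python sketch that parses kernel `.config` text and compares it against the options described above. The Kconfig symbol names (`CONFIG_HZ_100`, `CONFIG_PREEMPT_NONE`, `CONFIG_NUMA`, `CONFIG_X86_VSMP`, `CONFIG_SCHED_SMT`) are my best mapping of linuxrouter's description onto standard kernel option names. His real config was not posted in this thread, so treat the mapping and the sample text as assumptions.

```python
import re

# Assumed mapping from the description above to Kconfig symbols.
# "n" means the option is expected to be disabled or absent.
EXPECTED = {
    "CONFIG_HZ_100": "y",        # 100 Hz timer frequency
    "CONFIG_PREEMPT_NONE": "y",  # no forced preemption
    "CONFIG_NUMA": "y",          # NUMA support
    "CONFIG_X86_VSMP": "y",      # ScaleMP vSMP support
    "CONFIG_SCHED_SMT": "n",     # SMT scheduler disabled
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text: str) -> dict:
    """Parse kernel .config text into {symbol: value}; unset symbols map to 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def mismatches(options: dict, expected: dict = EXPECTED) -> dict:
    """Return {symbol: (expected, found)} for every option that differs."""
    return {
        symbol: (want, options.get(symbol, "n"))
        for symbol, want in expected.items()
        if options.get(symbol, "n") != want
    }


# Illustrative sample only, not linuxrouter's actual config.
SAMPLE = """\
CONFIG_HZ_100=y
CONFIG_HZ=100
CONFIG_PREEMPT_NONE=y
CONFIG_NUMA=y
CONFIG_X86_VSMP=y
# CONFIG_SCHED_SMT is not set
"""

if __name__ == "__main__":
    print(mismatches(parse_kconfig(SAMPLE)) or "all expected options match")
```

On many distros the running kernel's config can be read from `/boot/config-<release>`, or from `/proc/config.gz` when the kernel was built to expose it. Whether the EVGA VM image ships either file is something I have not checked.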
 
Wow, I will definitely check that out. Thanks Tobit! :cool:
 
Apollo, have you tested any of these kernels yet?
 
Apollo, have you tested any of these kernels yet?
Hi Tobit, I haven't had time to test the kernels yet. One of the reasons is that the target EVGA VM on which I want to test the different kernels has been inundated with huge A1 WUs since last week. I'm speaking of the P5100-series. They seem to take forever to complete. I'm going to switch kernels as soon as I see this current WU finish. It will be hard to gauge against the A2 WUs if I continue to receive A1s, though. I'll let you know either way.
 
OK, I conducted some preliminary testing on my long-in-the-tooth Opteron system (dual 280s @2.8GHz). Unfortunately, the EVGA VMs seem to be attracting A1 WUs and no matter what client configuration or flags I used, that's all I seem to be getting from Stanford lately. With the current P5101 WU, the original kernel seems to be the best performer. I'm getting ~960 PPD with that kernel, about 950 PPD with the 2.6.31 kernel and the 2.6.28 kernel was only producing in the mid 700s PPD. I will wait to see if I'll receive an A2 WU and continue testing on that system. This is an XP 32-bit host machine.
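As a quick way to read those PPD figures, here is a small Python sketch that expresses each kernel's result relative to the original kernel. It assumes the WU awards a fixed number of points, so PPD is inversely proportional to time per frame. The value 750 is only a stand-in for "mid 700s", and the function names are made up for illustration.

```python
def relative_change(value: float, baseline: float) -> float:
    """Percent change of value against baseline."""
    return (value / baseline - 1) * 100


def tpf_ratio_from_ppd(ppd: float, baseline_ppd: float) -> float:
    """How many times longer a frame takes, assuming fixed points per WU."""
    return baseline_ppd / ppd


# Approximate PPD on the P5101 WU; 750 is an assumed value for "mid 700s".
ppd = {"original": 960, "2.6.31": 950, "2.6.28": 750}

if __name__ == "__main__":
    base = ppd["original"]
    for kernel, value in ppd.items():
        change = relative_change(value, base)
        ratio = tpf_ratio_from_ppd(value, base)
        print(f"{kernel}: {change:+.1f}% PPD, frame time x{ratio:.2f}")
```

Under those assumptions the 2.6.31 kernel is about 1% behind the original, which is probably within measurement noise, while the 2.6.28 kernel is over 20% behind.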

Regarding memory settings, I don't believe I have NUMA enabled in the BIOS. Last time I tried that, I didn't notice any improvement with SMP WUs, but for some reason XP detected a lot less total system memory, so I disabled it. Whatever benefits I'd see with NUMA were offset by the reduction in detectable RAM. I intend to install Win 7 x64 on this system, and I'll try fooling with the NUMA settings again after I install it. I've got a feeling that -bigadv might see a change with this but not regular SMP. Standard SMP is nowhere near as memory intensive.

Then there's the E5100-series dual dual-core Xeon system I have. I haven't tried the different kernels on that one yet. I'll probably get around to doing it sometime this weekend and let you know. Even if this exercise doesn't produce any improvements in performance over what I have now, it might reveal interesting performance aspects of various setups, and that could prove useful in the future. Being able to select between kernels is an advantage in itself, and I wish I could arrange my regular distro this way. Alas, I don't know enough about Linux to be able to install and select between various kernels.
 