Since people asked, I'm posting the test application I made for measuring the speed of my multithreaded Marching Cubes algorithm.
(Don't bother asking for source, I'm not at liberty to release it. This test program is based on much more than some simple testing code, and its sole purpose is to examine performance on various systems and to help avoid bottlenecks.)
I've made a few different versions. One of them uses shared memory, which is a good way to test the speed of inter-core communication. As we found out on another forum, the HT (HyperTransport) speed on Athlon64 affects this version, which was a bit of a surprise: I've always heard people say that AMD uses a fast internal crossbar for inter-core communication, which would sit BEHIND the HT logic and so not depend on it. But the test results look more like the crossbar *is* the HT logic itself, with communication between cores handled much as it would be in a system with two physical sockets/CPUs.
Would be interesting to see how a real dual-CPU Opteron system performs on this code.
The other version tries to run each thread as independently as possible, which turned out to be considerably faster, even on a system with a shared cache.
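To make the difference between the two approaches concrete, here is a minimal sketch in C++. It is NOT code from the test program (that source isn't released); the function names `count_shared` and `count_independent` are made up for illustration, and a trivial counter stands in for the real per-thread work. The first function has every thread hammering one shared variable, the second lets each thread work on its own local data and only publish a result once at the end.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical illustration only; not the Fire test program's code.

// Shared-memory style: every thread updates the same counter, so the
// cache line holding it has to travel between cores on each increment.
std::uint64_t count_shared(unsigned num_threads, std::uint64_t iters_per_thread) {
    std::atomic<std::uint64_t> total{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&total, iters_per_thread] {
            for (std::uint64_t i = 0; i < iters_per_thread; ++i) {
                total.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return total.load();
}

// Independent style: each thread accumulates in a local variable and
// writes to shared storage exactly once, when it is done.
std::uint64_t count_independent(unsigned num_threads, std::uint64_t iters_per_thread) {
    std::vector<std::uint64_t> results(num_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&results, t, iters_per_thread] {
            std::uint64_t local = 0;  // private to this thread
            for (std::uint64_t i = 0; i < iters_per_thread; ++i) {
                local += 1;
            }
            results[t] = local;  // single write per thread
        });
    }
    for (auto& w : workers) w.join();

    // Combine the per-thread results after all threads have finished.
    std::uint64_t total = 0;
    for (std::uint64_t r : results) total += r;
    return total;
}
```

Both functions return the same total; only the amount of traffic between cores differs. To see an actual timing difference you'd have to replace the counter with real work and time both calls on your own system, since a compiler may well collapse the local loop into a single addition.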
Anyway, here's the first version, which is nice for measuring core-communication:
http://scali.eu.org/~bohemiq/Fire.rar
Fire.exe is the old single-threaded version.
Fire-Multithread.exe does quite a bit of shared memory processing.
Fire-Multithread2.exe is a first version that avoids the shared memory as much as possible.
And this is a later version, where I optimized the second version even more, and also put in a control to choose the number of threads. This one also contains a 64-bit version. I've not yet found anyone who could run it for me on a 64-bit AMD system.
http://scali.eu.org/~bohemiq/FireNew.rar