Just how fast is VMXNET3?

Zarathustra[H]

So,

I was under the impression that VMware's E1000 emulated gigabit adapter is just that, gigabit, and that VMXNET3 is a virtual 10 GbE adapter.


But then I just ran some iperf tests whose results I can't wrap my head around:

Code:
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local xxx.xxx.xxx.xxx port 5001 connected with xxx.xxx.xxx.xxx port 42984
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  15.4 GBytes  13.3 Gbits/sec
[  5] local xxx.xxx.xxx.xxx port 5001 connected with xxx.xxx.xxx.xxx port 27820
[  5]  0.0-10.0 sec  21.4 GBytes  18.4 Gbits/sec

Configuration is two guests on the same host:

Guest1 ---VMXNET3---> vSwitch ---VMXNET3---> Guest2

Is VMXNET3 not strictly tied to any particular speed? If that is the case, why does it stop at 13.3 or 18.4 Gbit/s and not go higher? CPU limitation?

This is on a dual 6-core LGA1366 Xeon system (12 cores + HT @ 2.25 GHz, turbo to 2.8 GHz).

Appreciate your thoughts.

--Matt
 
The results seem quite good given what you already have. You can play with the settings a bit more until you've hit a sweet spot, but it's hard to tell. Try -P 10 and -w 65536 for more parallel sessions and an increased TCP window size.
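For example, something like this (the xxx.xxx.xxx.xxx address is a placeholder for the server VM's IP):

Code:
# on the receiving VM
iperf -s -w 65536

# on the sending VM: 10 parallel streams, 64 KB TCP window
iperf -c xxx.xxx.xxx.xxx -P 10 -w 65536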

Also make sure the VMs you use for testing have enough vCPUs to handle the load you're pushing through the virtual wires. Jumbo frames might help too.

The 10 Gbit speed of VMXNET3 is not in any way guaranteed, even between VMs on the same ESXi host. Both the host CPU performance and the guest TCP stack implementation set the limit. What OS: Linux or Windows?
 
The results seem quite good given what you already have. You can play with the settings a bit more until you've hit a sweet spot, but it's hard to tell. Try -P 10 and -w 65536 for more parallel sessions and an increased TCP window size.

Also make sure the VMs you use for testing have enough vCPUs to handle the load you're pushing through the virtual wires. Jumbo frames might help too.

The 10 Gbit speed of VMXNET3 is not in any way guaranteed, even between VMs on the same ESXi host. Both the host CPU performance and the guest TCP stack implementation set the limit. What OS: Linux or Windows?

These were run between a BSD install and a Linux install.

I guess what struck me was that they are rated at (I thought) 10 Gbit/s, but I am benching them at 13 and 18 Gbit/s respectively. Is that some sort of error? How can I get results above the rated speeds?
 
1 Gb or 10 Gb is an Ethernet media limit.
Virtual NICs (transfers from vNICs to the vSwitch) are pure software and are limited only by hardware performance and software efficiency.
 
1 Gb or 10 Gb is an Ethernet media limit.
Virtual NICs (transfers from vNICs to the vSwitch) are pure software and are limited only by hardware performance and software efficiency.

This is what I've seen / heard. I myself have gotten over 1 Gb/s with E1000. I use VMXNET3 for my FreeNAS VM and get 600-700 MB/s reads over iSCSI to other VMs.
 
This is what I've seen / heard. I myself have gotten over 1 Gb/s with E1000. I use VMXNET3 for my FreeNAS VM and get 600-700 MB/s reads over iSCSI to other VMs.

Nice.

I don't use iSCSI myself, as I prefer the flexibility of being able to access the content from multiple sources, so I use mostly NFS for my server stuff, and mostly SMB for my client stuff.

I just did some basic read benches with dd, from a 33 GB VMware disk image on FreeNAS to /dev/null on my Ubuntu Linux guest.

Benched read speeds were all over the place (presumably because I am not controlling for what else is running at the same time; it was just a quick test via ssh from work), but the highest I got was 797 MB/s.

Code:
$ dd if=./Windows\ 7\ x64.vmdk of=/dev/null bs=1024k count=50k
31893+1 records in
31893+1 records out
33442693120 bytes (33 GB) copied, 41.96 s, 797 MB/s

Not too shabby.

Hopefully when I get my transceivers and fiber for my Brocade BR-1020 adapters (they are on the slow boat from Shenzhen) they will be able to keep up, and provide this type of speed straight to my workstation.
 
Zarathustra[H];1041475845 said:
I don't use iSCSI myself, as I prefer the flexibility of being able to access the content from multiple sources, so I use mostly NFS for my server stuff, and mostly SMB for my client stuff.
I use iSCSI for VMs on the same host as the FreeNAS VM b/c using NFS w/ ESXi makes all writes sync writes and I don't have a dedicated SLOG

This explains it much better than I can:
https://forums.freenas.org/index.ph...xi-nfs-so-slow-and-why-is-iscsi-faster.12506/

Zarathustra[H];1041475845 said:
Benched read speeds were all over the place (presumably, because I am not controlling for what else is running at the same time, just a quick test via ssh from work) but the highest I got was 797 MB/s
That's real nice speed.
I need to try from a Linux VM as well. I think I got my numbers from running CrystalMark on a Win Server 2012 VM.

Zarathustra[H];1041475845 said:
Hopefully when I get my transceivers and fiber for my Brocade BR-1020 adapters (they are on the slow boat from Shenzhen) they will be able to keep up, and provide me this type of speed straight to my workstation.

I'm interested in this. What kind of cost is a 10 gig fiber setup? I was looking at linking my ESXi hosts with 10 gig InfiniBand, which is surprisingly cheap ($35-70 per adapter, $150-250 for a switch). Then I'd have 10 Gb/s connectivity on my desktop/workstation (which is a VM). I haven't looked into fiber at all, as I just assumed it was as much if not more than 10 gig Ethernet.

Edit: looking closer at your dd test... isn't your FreeNAS copying that to its local /dev/null? How is the data crossing the virtual network?
 
I'm interested in this. What kind of cost is a 10 gig fiber setup? I was looking at linking my ESXi hosts with 10 gig InfiniBand, which is surprisingly cheap ($35-70 per adapter, $150-250 for a switch). Then I'd have 10 Gb/s connectivity on my desktop/workstation (which is a VM). I haven't looked into fiber at all, as I just assumed it was as much if not more than 10 gig Ethernet.

Generally the cost of fiber is pretty high, not because of the NICs or the fiber itself, but because of the transceivers. A lot of hardware refuses to play nice unless you use transceivers of the same brand (presumably this is intentional, to corner you into buying their products. You know, the "inkjet cartridge" model).

There are some exceptions though.

Brocade BR-1020 adapters seem to be the best deal going right now. If you are patient, you can usually find a dual-port BR-1020 adapter for under $40 apiece (mine were the single-port BR-1010 versions).

Then you need to figure out what you want to do for cabling. For short runs, you don't have to use transceivers and fiber at all. They have what are called "Direct Attach Copper" (DAC) or Twinax cables. These are typically for 5 m (16 ft) or shorter lengths, though (some go up to 7 m).

Short ones are cheaper than getting transceivers and fiber; longer ones can be more expensive.

If you go the fiber route and decide to buy branded transceivers, you'll spend a lot on them. There are, however, third-party sellers that guarantee model-specific compatibility. Fiberstore.com does this, but they ship straight from China, so you may have to wait a bit. They have a pretty good reputation on the ServeTheHome forums, they take PayPal, and they are one of those Google-guaranteed stores, so there is some protection.

I needed a longer line, so I ordered fiber from them. Transceivers were $18 each (you need one on each end of the fiber), and the LC duplex OM3 fiber cables I bought were $6 each for 15 m (50 ft) lengths.

As long as you don't need a 10 GbE switch you should be fine. Those are still rather expensive. The cheapest I have found is the 24-port XGS1910-24: 24 gigabit Ethernet ports, with the last 2 being dual-personality ports for gigabit fiber, plus two dedicated 10 gig uplink ports. These run about $500 (+ transceivers), and there are a few users on these forums who speak well of them. If you want something like this from Cisco or HP, though, it's going to cost in the $1,200 range.

This is why I am just running a direct line between two boxes. I can't justify spending more on a switch right now. If I didn't already have my ProCurve 1810G-24, which I spent $300 on a few years back, I might have done it. Instead I am just running a team of four gig-E lines from the server to the switch.

Still waiting for mine. Will report back when I get them.
 
I use iSCSI for VMs on the same host as the FreeNAS VM b/c using NFS w/ ESXi makes all writes sync writes and I don't have a dedicated SLOG

This explains it much better than I can:
https://forums.freenas.org/index.ph...xi-nfs-so-slow-and-why-is-iscsi-faster.12506/

Yeah, I'm familiar with this. Did you know that these are only the default settings though? By default, NFS obeys the sync settings of the server, and by default iSCSI ignores them and just runs sync=off.

If you don't want sync with NFS, you can either tell it (in the mount options) to do the same thing iSCSI does (ignore the server's sync settings and just do async writes), or, the better option, create datasets for different types of data and individually set sync off or on depending on the importance of the data in each set.
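The per-dataset approach is just a ZFS property on the FreeNAS side; something like this, with made-up dataset names for illustration:

Code:
zfs set sync=disabled tank/scratch    # speed over safety for throwaway data
zfs set sync=always   tank/critical   # force every write to be synchronous
zfs set sync=standard tank/general    # honor whatever the client requests
zfs get -r sync tank                  # check the current settings per dataset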

I have a mirrored pair of Intel 100 GB S3700 SSDs I use for my SLOG, so I don't have as big a performance impact from using sync writes (but there is still some).

I have a few different datasets configured. The ones where performance outweighs reliability have sync set to off. The ones where reliability is key have sync set to on.

I would be concerned running guests off of iSCSI without sync (even though I know it is common in industry), as even a single write failure can render the entire disk image inoperable and unrecoverable. At least with individual files, a write error like that will only impact the currently open file, but with an iSCSI disk image, the currently open file is the entire image. Frequent backups would be a good idea, I guess.

I played around with iSCSI when I first set up my server, but determined it wasn't for me. The lack of flexibility due to the image being owned by one initiator at a time was a bummer, as was the waste of disk space from pre-allocating all of it to the image. (I know you can create a sparse image file, but I read this has reliability issues.) The lack of sync writes sealed the deal for me. I put up with the poor performance for a while, until I got the SLOGs.

I don't quite understand why iSCSI is so popular in enterprise use. To me it seems it has so many drawbacks. Maybe it just gets put into place because that's what people are used to?
 
We mostly use iSCSI to present central storage to VMs so we don't have to mess with RDMs. Anything physical is hooked up to the fiber network.
 
Zarathustra[H];1041477589 said:
Yeah, I'm familiar with this. Did you know that these are only the default settings though? By default, NFS obeys the sync settings of the server, and by default iSCSI ignores them and just runs sync=off.

I don't quite understand why iSCSI is so popular in enterprise use. To me it seems it has so many drawbacks. Maybe it just gets put into place because that's what people are used to?

From the link:
"iSCSI by default does not implement sync writes. However, your VM data is being written async, which is hazardous to your VM's. On the other hand, the ZFS filesystem and pool metadata are being written synchronously, which is a good thing."

The big thing for me is that iSCSI keeps the pool as a whole safe. The lack of sync writes on the VMs' data is fine b/c I have frequent backups and it's nothing to roll back.

NFS with sync writes turned off can put the entire pool itself in jeopardy.

I use CIFS for all my client transfers (Windows and Linux).

Edit: also, when I set up iSCSI I made a 600 GB extent, but my pool's reported free space changes based on the actual data used of the 600 available. I think I'm only using around 150 GB atm.
 
From the link:
"iSCSI by default does not implement sync writes. However, your VM data is being written async, which is hazardous to your VM's. On the other hand, the ZFS filesystem and pool metadata are being written synchronously, which is a good thing."

The big thing for me is that iSCSI keeps the pool as a whole safe. The lack of sync writes on the VMs' data is fine b/c I have frequent backups and it's nothing to roll back.

NFS with sync writes turned off can put the entire pool itself in jeopardy.

I use CIFS for all my client transfers (Windows and Linux).

Edit: also, when I set up iSCSI I made a 600 GB extent, but my pool's reported free space changes based on the actual data used of the 600 available. I think I'm only using around 150 GB atm.

I'm not entirely certain, but I believe writing async via NFS behaves similarly to iSCSI, with the filesystem and metadata still being written synchronously.

I have been playing with ZFS and FreeNAS for 6 years now, and have been an active member of their forums. I've heard many stories of individual corrupted files due to async NFS writes, but never of anyone bringing down their entire pool this way. I have heard of plenty of people losing iSCSI images, though.

Personally, I don't use ZFS for my guest OS drives though. I found that all of my 10 guest installs are pretty small. Some are only 2 GB; all are under 10 GB. (This is what happens when you use Linux and BSD guests without GUI environments :p )

For me it made more sense to mirror a set of SSDs and use a local datastore. I find performance is much higher this way, and it is less complex, as I don't need to worry about my FreeNAS guest coming up before the other guests can boot. I periodically back up the vmdk files via NFS to my ZFS pool for all but my FreeNAS install, whose backup I store on a separate drive. (It makes little sense to store my FreeNAS backup in a way that it can only be accessed when FreeNAS is working :p )

The ZFS pool is mostly just mass data storage: a fast file server. The guests that need to store lots of data (like my MythTV backend) just mount the ZFS pool via NFS and write their data directly to the pool.
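On the guest side that is just an ordinary NFS mount; a minimal /etc/fstab line would look something like this (the hostname and paths here are examples, not my actual layout):

Code:
# /etc/fstab on the guest -- hostname and paths are placeholders
freenas.local:/mnt/tank/media  /mnt/media  nfs  rw,hard,vers=3  0  0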
 
Zarathustra[H];1041477683 said:
I'm not entirely certain, but I believe writing async via NFS behaves similarly to iSCSI, with the filesystem and metadata still being written synchronously.
If this is true, the FreeNAS post I referenced by jgreco is completely wrong (see his option 2). I'm not knowledgeable enough about the inner workings of ZFS to say who is right or wrong, but upon further reading of that post it looks like other people disagreed about the dangers of running NFS w/ sync writes off. So you may very well be right on this.

Zarathustra[H];1041477683 said:
Personally, I don't use ZFS for my guest OS drives though. I found that all of my 10 guest installs are pretty small. Some are only 2 GB; all are under 10 GB. (This is what happens when you use Linux and BSD guests without GUI environments :p )
I did this at first, but my entire setup is dual-purpose. The primary purpose is home use (big media lib, Plex, SickRage, Sabnzbd, CouchPotato, ruTorrent, ESXi-hosted desktop).

The other use is lab work and just familiarizing myself with as many enterprise technologies and practices as possible as I work toward various certs.

I could make my set-up a lot easier to manage and run, but I also look for opportunities to mirror a setup more similar to a corporate environment when I can. So I use iSCSI and back up VMs w/ Veeam. My whole network is divided into 5 VLANs even though I only have about 20-25 network devices. I have Windows DCs even though I only have one non-ESXi PC on the LAN.
 
So, I don't know what happened, but I installed the in-GUI system updates for FreeNAS 9.3 earlier in the week, and my VMXNET3 speeds have absolutely plummeted!

They have gone from the ~18 Gbit/s above to this: :eek:

Code:
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 36620 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.46 GBytes  2.97 Gbits/sec

Trying to troubleshoot now, but if the bug/security fixes in the last updates weren't too significant I may just roll back to the copy of the .vmdk I made prior to updating.

(Update problems in the past have made me paranoid; these days I always power down a guest, copy its vmdk, and then resume it again before doing an update.)
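For reference, the copy itself is just a disk clone on the ESXi datastore; something along these lines with the guest powered off (the datastore and file names are placeholders, not my real paths):

Code:
# run on the ESXi host: clone the virtual disk before updating
vmkfstools -i /vmfs/volumes/datastore1/freenas/freenas.vmdk \
              /vmfs/volumes/datastore1/backups/freenas-pre-update.vmdk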
 
Test the other direction. I get around 16-20 Gbps if I run iperf from FreeNAS as client to Ubuntu as server. Going from Ubuntu as client to FreeNAS as server, I get 3-4 Gbps, which to me is really weird. I didn't test until after the 2nd most recent update though, so I can't compare to before.
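An easy way to check both directions from one side is iperf's tradeoff mode, which runs the normal test and then reverses the roles (IP is a placeholder):

Code:
# FreeNAS -> Ubuntu first, then Ubuntu -> FreeNAS
iperf -c xxx.xxx.xxx.xxx -r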
 
Test the other direction. I get around 16-20 Gbps if I run iperf from FreeNAS as client to Ubuntu as server. Going from Ubuntu as client to FreeNAS as server, I get 3-4 Gbps, which to me is really weird. I didn't test until after the 2nd most recent update though, so I can't compare to before.

I independently just made this very same discovery before I read your post.

:confused: :confused: :confused: :confused:

~21 Gbit/s when Ubuntu is the server and FreeNAS is the client
~3.2 Gbit/s when FreeNAS is the server and Ubuntu is the client.


Crazy.

I'd like to see this one explained...


Two iperf tests with FreeNAS as the server (iperf -s) and Ubuntu Server as the client (iperf -c):

Code:
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 33485 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.79 GBytes  3.25 Gbits/sec

------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 43405 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.69 GBytes  3.17 Gbits/sec

Two iperf tests with Ubuntu Server as the server (iperf -s) and FreeNAS as the client (iperf -c):

Code:
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 53941 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  24.9 GBytes  21.4 Gbits/sec

------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 64555 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  25.5 GBytes  21.9 Gbits/sec
 
And what do you know, the same is true over the Brocade adapters with transceivers and fiber.

When my Linux workstation is the server, I practically max out the 10GBASE-SR link.

Code:
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 32.5 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 62948 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  10.8 GBytes  9.30 Gbits/sec

But when FreeNAS is the server I'm back down to slow speeds.

Code:
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local xxx.xxx.xxx.xxx port 40512 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  3.64 GBytes  3.13 Gbits/sec

I wonder if this is an iperf anomaly, or if the same is evident with file sharing protocols...
 
Zarathustra[H];1041497752 said:
I wonder if this is an iperf anomaly, or if the same is evident with file sharing protocols...

Hmm. And this one is unclear. I get similar reads (from FreeNAS) as I do writes (to FreeNAS) from my desktop, but it may be a drive speed limitation. I'd have to create and share a ramdisk to test that theory, which I am not feeling sufficiently motivated to do right now.
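If anyone wants to try it, a rough (untested here) sketch of a RAM-backed share on the FreeNAS/FreeBSD side would be something like this, with the size and paths picked arbitrarily:

Code:
# create and mount a 4 GB memory-backed disk on FreeBSD/FreeNAS
mdconfig -a -t swap -s 4g        # prints the device name, e.g. md0
newfs -U /dev/md0
mkdir -p /mnt/ramdisk
mount /dev/md0 /mnt/ramdisk
# then export /mnt/ramdisk over NFS or SMB and benchmark against it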
 
Yep, the exact same behavior as my set-up... weird indeed. I'm working on some other stuff atm, but if there are any tests you want me to try and duplicate, let me know. Have you tried booting into one of the pre-update FreeNAS versions? From my understanding the iperf client is the one doing the sending (http://openmaniak.com/iperf.php), so is Ubuntu sending slow, or is FreeNAS receiving slow?
 