Myri-10G
10-Gigabit Ethernet
Performance Measurements

We report performance measurements from netperf, from ntttcps and ntttcpr (from the Windows 2003 DDK), and from iperf, using 9000-byte (jumbo) frames and 1500-byte (standard) frames. Bandwidth (BW) is reported in megabits per second. For the Linux and FreeBSD tests, we used dual-processor machines with quad-core 2.66GHz Intel Xeon X5355 CPUs on the Supermicro X7DB8 motherboard. For the Windows tests, the sender was a dual-processor machine with single-core 2.6GHz AMD Opteron CPUs on the Tyan S2895 motherboard, and the receiver was a Dell PowerEdge 2950 machine. For the MacOSX tests, we used MacPro machines with dual-processor, dual-core 2.6GHz Intel Xeons. For the Solaris tests, we used a pair of Dell PowerEdge 2950 machines with dual-processor, dual-core 3.0GHz Intel Xeons.

Linux | Windows | Solaris GLDv2 | Solaris GLDv3 | MacOSX | FreeBSD

Linux

The Linux tests were run using netperf version 2.4.3 with the Ubuntu 7.04 x86_64 2.6.20-16-server kernel and the Myri10GE version 1.4.2 driver. The Intel I/OAT driver (ioatdma-2.15) was loaded, and the Myri10GE driver was configured to support Intel DCA. I/OAT DMA copy offload was disabled, TCP buffer sizes were increased and TCP timestamps were disabled as recommended in the Performance Tuning section of the Linux Myri10GE README. TCP Segmentation Offload (TSO) and Large Receive Offload (LRO) were enabled, and the default interrupt coalescing setting of 75µs was used. The netserver was run without options. The two hosts were connected without a switch (point-to-point).

Netperf Results, MTU 9000

Commands:
     $ netperf -H dust01-m -t TCP_STREAM -C -c -l 60
     $ netperf -H dust01-m -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinux
     $ netperf -H dust01-m -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 1M -S 1M
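
The UDP_STREAM message size of 8972 bytes is consistent with the largest UDP payload that fits in one unfragmented 9000-byte IP packet; the MTU 1500 runs use 1472 for the same reason. A minimal sketch of that arithmetic, assuming IPv4 with a 20-byte header (no options) and an 8-byte UDP header; the function name is hypothetical:

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
UDP_HEADER_BYTES = 8


def max_udp_payload(mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


print(max_udp_payload(9000))  # 8972
print(max_udp_payload(1500))  # 1472
```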
  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9910.26  11.32       5.89
     TCP_SENDFILE   9000   9894.46   3.37       5.91
     UDP_STREAM_TX  9000   9865.90  12.50       0.00
     UDP_STREAM_RX  9000   9865.90   0.00       7.02
  

Netperf Results, MTU 1500

Commands:
     $ netperf -H dust01-m -t TCP_STREAM -C -c -l 60
     $ netperf -H dust01-m -t TCP_SENDFILE -l 60 -C -c -F /boot/vmlinux
     $ netperf -H dust01-m -t UDP_STREAM -l 60 -C -c -- -m 1472 -s 1M -S 1M
  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   9463.06  10.54       8.71
     TCP_SENDFILE   1500   9354.66   2.75       8.67
     UDP_STREAM_TX  1500   6052.60  15.19       0.00
     UDP_STREAM_RX  1500   6052.60   0.00       9.96
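
As a rough sanity check on the Linux TCP_STREAM figures, the sketch below estimates the highest TCP payload rate that a 10 Gbit/s Ethernet link can carry at each MTU. It assumes IPv4 and TCP headers of 20 bytes each with no options (TCP timestamps were disabled for these runs) and 38 bytes of per-frame Ethernet overhead. The function and constant names are illustrative only.

```python
LINK_MBITS = 10_000        # nominal 10-Gigabit Ethernet line rate
ETH_OVERHEAD_BYTES = 38    # 8 preamble/SFD + 14 header + 4 FCS + 12 inter-frame gap
IP_TCP_HEADER_BYTES = 40   # assumption: 20 IPv4 + 20 TCP, no options


def max_tcp_goodput(mtu: int) -> float:
    """Upper bound on TCP payload throughput in Mbit/s for a given MTU."""
    payload = mtu - IP_TCP_HEADER_BYTES
    bytes_on_wire = mtu + ETH_OVERHEAD_BYTES
    return LINK_MBITS * payload / bytes_on_wire


# Measured values are the Linux TCP_STREAM results reported above.
for mtu, measured in [(9000, 9910.26), (1500, 9463.06)]:
    limit = max_tcp_goodput(mtu)
    print(f"MTU {mtu}: limit {limit:.1f} Mbit/s, measured/limit = {measured / limit:.4f}")
```

Under these assumptions the limits come out near 9914 Mbit/s at MTU 9000 and 9493 Mbit/s at MTU 1500, so both measured TCP_STREAM values are within about 0.5% of the estimate.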
  

Notes:


Windows

The Windows tests were done with the Myri10GE AMD64 1.0.1 driver and Windows Server 2003 x64 SP1 Edition. TCP Segmentation Offload (TSO) was enabled, checksum offload was enabled, an interrupt coalescing delay of 25µs was used, and flow control was enabled. No registry entries were added to the Windows 2003-based machines.

A single ntttcps process on one Windows host sent data to a single ntttcpr process on a second Windows host. The two hosts were connected without a switch (point-to-point).

Ntttcp Results, MTU 9000

Commands:
    Sender: ntttcps -m 1,1,10.0.130.50 -l 1048576 -n 100000 -w -v -a 8
    Receiver: ntttcpr -m 1,1,10.0.130.50 -l 1048576 -rb 2097152 -n 1000000 -w -v -a 8
Results on the Sender:
 
-----------------------------------------------------------------
|     Estimated Time to Complete Test at line speed (seconds)   |
-----------------------------------------------------------------

1000 Base-T  622 OC-12(ATM)  155 OC-3(ATM)  100 Base-T  10 Base-T
===========  ==============  =============  ==========  =========

        419             369           1408        2128       25000



------------------------------------------------------
|                   Output Summary                   |
------------------------------------------------------

Thread Realtime(s) Throughput(KB/s) Throughput(Mbit/s)
====== =========== ================ ==================

     0      85.500        1226404.678           9811.237


Total Bytes(MEG) Realtime(s) Average Frame Size Total Throughput(Mbit/s)
================ =========== ================== ========================

   104857.600000      85.500          60667.263                 9811.237


Total Buffers Throughput(Buffers/s) Pkts(sent/intr) Intr(count/s) Cycles/Byte
============= ===================== =============== ============= ===========

   100000.000              1169.591               1      23467.10         0.5


Packets Sent Packets Received Total Retransmits Total Errors Avg. CPU %
============ ================ ================= ============ ==========

     1728405           281845                 2            0      10.70
  
Results on the Receiver:
 
-----------------------------------------------------------------
|     Estimated Time to Complete Test at line speed (seconds)   |
-----------------------------------------------------------------

1000 Base-T  622 OC-12(ATM)  155 OC-3(ATM)  100 Base-T  10 Base-T
===========  ==============  =============  ==========  =========

        419             369           1408        2128       25000



------------------------------------------------------
|                   Output Summary                   |
------------------------------------------------------

Thread Realtime(s) Throughput(KB/s) Throughput(Mbit/s)
====== =========== ================ ==================

     0      85.735        1223043.098           9784.345


Total Bytes(MEG) Realtime(s) Average Frame Size Total Throughput(Mbit/s)
================ =========== ================== ========================

   104857.600000      85.735           8959.587                 9784.345


Total Buffers Throughput(Buffers/s) Pkts(recv/intr) Intr(count/s) Cycles/Byte
============= ===================== =============== ============= ===========

   100000.000              1166.385              29       4610.68         2.7


Packets Sent Packets Received Total Retransmits Total Errors Avg. CPU %
============ ================ ================= ============ ==========

      281837         11703396                 0            0      27.27
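
The summary fields in the two ntttcp reports can be cross-checked against each other. The sketch below recomputes them from the buffer size, buffer count, elapsed time, and packet counts shown above. It assumes that ntttcp's "KB" and "MEG" mean 10^3 and 10^6 bytes, which is an inference from the printed values rather than documented behavior; the helper name is hypothetical.

```python
BUFFER_BYTES = 1_048_576   # from the -l 1048576 option
BUFFERS = 100_000          # "Total Buffers" in both reports

total_bytes = BUFFER_BYTES * BUFFERS


def summarize(realtime_s: float, packets: int) -> dict:
    """Recompute ntttcp summary fields from elapsed time and a packet count."""
    bytes_per_s = total_bytes / realtime_s
    return {
        "total_meg": total_bytes / 1e6,          # assumption: MEG = 10^6 bytes
        "kb_per_s": bytes_per_s / 1e3,           # assumption: KB = 10^3 bytes
        "mbit_per_s": bytes_per_s * 8 / 1e6,
        "buffers_per_s": BUFFERS / realtime_s,
        "avg_frame_bytes": total_bytes / packets,
    }


sender = summarize(85.500, 1_728_405)      # packets sent by the sender
receiver = summarize(85.735, 11_703_396)   # packets received by the receiver
for name, row in (("sender", sender), ("receiver", receiver)):
    print(name, {key: round(value, 3) for key, value in row.items()})
```

The receiver's average frame size of about 8960 bytes is consistent with a 9000-byte MTU minus 40 bytes of IPv4 and TCP headers. The sender's average of roughly 60 KB is consistent with TSO handing large segments to the adapter rather than MTU-sized packets.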
  

Notes:


Solaris GLDv2

The Solaris GLDv2 tests were done using netperf version 2.4.3 with the Myri10GE AMD64 1.1.3 driver with Solaris 10 5/08 (S10 U5). Solaris's GLDv2 driver ABI does not support TCP Segmentation Offload (TSO). LRO was enabled. The default interrupt coalescing setting of 30µs was used. The netserver was run without options. The two hosts were connected without a switch (point-to-point).

Netperf Results, MTU 9000

Commands:
     $ netperf -H dell2950a-m -t TCP_STREAM -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t UDP_STREAM -l 60 -C -c  -T loc,remote -- -m 8972 -s 1M -S 1M 

  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9844.57  28.59      22.74
     TCP_SENDFILE   9000   9550.23  29.78      22.14
     UDP_STREAM_TX  9000   8610.00  26.88      00.00
     UDP_STREAM_RX  9000   8610.00  00.00      32.35
  

Netperf Results, MTU 1500

Commands:
     $ netperf -H dell2950a-m -t TCP_STREAM -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t UDP_STREAM -l 60 -C -c  -T loc,remote -- -m 1472 -s 1M -S 1M 

  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   5477.59  33.97      35.46
     TCP_SENDFILE   1500   4761.17  41.52      31.34
     UDP_STREAM_TX  1500   4222.20  30.31      00.00
     UDP_STREAM_RX  1500   4222.20  00.00      45.37
  

Notes:


Solaris GLDv3

The Solaris GLDv3 tests were done using netperf version 2.4.3 with the Myri10GE AMD64 0.0.1gldv3 driver with Solaris 10 5/08 (S10 U5). TCP Segmentation Offload (TSO) and Large Receive Offload (LRO) were enabled, and the default interrupt coalescing setting of 125µs was used. The netserver was run without options. The two hosts were connected without a switch (point-to-point).

Netperf Results, MTU 9000

Commands:
     $ netperf -H dell2950a-m -t TCP_STREAM -C -c -l 60 -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t UDP_STREAM -l 60 -C -c  -T loc,remote -- -m 8972 -s 1M -S 1M 

  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9630.59  16.44      21.41
     TCP_SENDFILE   9000   9427.31  26.36      21.13
     UDP_STREAM_TX  9000   8501.60  26.57      00.00
     UDP_STREAM_RX  9000   8501.60  00.00      26.60
  

Netperf Results, MTU 1500

Commands:
     $ netperf -H dell2950a-m -t TCP_STREAM -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t TCP_SENDFILE -F/var/tmp/scratch -C -c -l 60  -T loc,remote -- -s 1M -S 1M 
     $ netperf -H dell2950a-m -t UDP_STREAM -l 60 -C -c  -T loc,remote -- -m 1472 -s 1M -S 1M 

  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   9025.92  16.94      43.39
     TCP_SENDFILE   1500   8828.88  31.15      43.10
     UDP_STREAM_TX  1500   4271.60  29.80      00.00
     UDP_STREAM_RX  1500   4271.60  00.00      41.64
  

Notes:


MacOSX

The MacOSX tests were run using netperf version 2.4.3 and iperf version 2.0.2 with MacOSX 10.5 and the Myri10GE version 1.1.0 driver. MacOSX does not support TCP Segmentation Offload (TSO). LRO was enabled as recommended in the Performance Tuning section of the MacOSX Myri10GE README. The default interrupt coalescing setting of 75µs was used. The netserver was run without options. The iperf server was run with the same window (-w) and buffer length (-l) arguments as the client. The two hosts were connected without a switch (point-to-point).

Netperf Results, MTU 9000

Commands:
     $ netperf -H macpro01-m -t TCP_STREAM -C -c -l 60 -- -s 768K -S 768K -m 256K
     $ netperf -H macpro01-m -t UDP_STREAM -l 60 -C -c -- -m 32K -s 512K -S512K
     $ iperf -c macpro01-m -w 768k -l 256k -P 2 -f m -t 60
  
Single-Stream Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9661.82  41.38      36.74
     UDP_STREAM_TX  9000   6867.00  28.08      00.00
     UDP_STREAM_RX  9000   6867.00  00.00      39.26
  
Dual-Stream TCP Results (2 netperf processes):
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9692.00  54.72      47.36
  
Dual-Stream TCP Results (2 iperf threads):
     Test           MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     iperf          9000   9825.00  65         58
  

Netperf Results, MTU 1500

Commands:
     $ netperf -H macpro01-m -t TCP_STREAM -C -c -l 60 -- -s 768K -S 768K -m 256K
     $ netperf -H macpro01-m -t UDP_STREAM -l 60 -C -c -- -m 32K -s 512K -S512K
     $ iperf -c macpro01-m -w 512k -l 256k -P 2 -f m -t 60
  
Single-Stream Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   4782.15  41.70      39.15
     UDP_STREAM_TX  1500   3310.40  27.85      00.00
     UDP_STREAM_RX  1500   3310.40  00.00      39.24
  
Dual-Stream TCP Results (2 netperf processes):
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   4367.00  42.29      43.75
  
Dual-Stream TCP Results (2 iperf threads):
     Test           MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     iperf          1500   6417.00  76         65
  

Notes:


FreeBSD

The FreeBSD tests were run using netperf version 2.4.3 with the if_mxge driver found in FreeBSD 7.0-current. We used FreeBSD/amd64 built from sources dated June 11th, 2007. The kernel was configured without the WITNESS or INVARIANTS options, and with the SCHED_ULE scheduler. The kern.ipc.maxsockbuf tunable was increased to 8388608. TCP Segmentation Offload (TSO) and Large Receive Offload (LRO) were enabled, and the default interrupt coalescing setting of 30µs was used. The netserver was run without options. The two hosts were connected without a switch (point-to-point).
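
The socket buffer size bounds the TCP window, and a window can keep a link full only if it covers the bandwidth-delay product. As a rough illustration, the sketch below computes the round-trip time up to which a 768K buffer (the size passed with -s and -S in the TCP commands in this section) could sustain 10 Gbit/s. It assumes that netperf's "K" suffix means 1024 bytes and that the full buffer is usable as window; the function name is hypothetical.

```python
def max_rtt_at_line_rate(window_bytes: int, link_bits_per_s: float) -> float:
    """Round-trip time (seconds) up to which one window keeps the link full."""
    return window_bytes * 8 / link_bits_per_s


window = 768 * 1024   # assumption: "768K" means 768 * 1024 bytes
rtt = max_rtt_at_line_rate(window, 10e9)
print(f"{rtt * 1e6:.0f} microseconds")
```

Under these assumptions the result is a little over 600 microseconds, far longer than the round-trip time expected on a point-to-point link, and the 8388608-byte kern.ipc.maxsockbuf limit leaves room for buffers of this size.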

Netperf Results, MTU 9000

Commands:
     $ netperf -H dust01-m -t TCP_STREAM -C -c -l 60 -- -S768K -s768K
     $ netperf -H dust01-m -t TCP_SENDFILE -l 60 -C -c -F /boot/kernel/kernel -- -S768K -s768K
     $ netperf -H dust01-m -t UDP_STREAM -l 60 -C -c -- -m 8972 -s 512K -S 512K
  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     9000   9894.99  13.59      10.95
     TCP_SENDFILE   9000   9892.65   5.66      11.04
     UDP_STREAM_TX  9000   9924.70  15.28       0.00
     UDP_STREAM_RX  9000   9924.70   0.00       9.49
  

Netperf Results, MTU 1500

Commands:
     $ netperf -H dust01-m -t TCP_STREAM -C -c -l 60 -- -S768K -s768K
     $ netperf -H dust01-m -t TCP_SENDFILE -l 60 -C -c -F /boot/kernel/kernel -- -S768K -s768K
     $ netperf -H dust01-m -t UDP_STREAM -l 60 -C -c -- -m 16K -s 512K -S 512K
  
Results:
     Netperf Test   MTU    BW       TX_CPU %   RX_CPU %
     ------------   ----   -------  --------   --------
     TCP_STREAM     1500   9305.14   9.60      17.21
     TCP_SENDFILE   1500   9355.76   4.95      17.34
     UDP_STREAM_TX  1500   5557.00  14.21       0.00
     UDP_STREAM_RX  1500   5553.10   0.00      11.44
  

Notes:


Last updated: 08 August 2008