You might have heard the phrase, “He’s forgotten more than you will ever know.” When dealing with somebody who is quietly competent, that is almost always the case.
I was there at the start of the Internet. I watched our campus get X.25 networking. Later, BITNET. I watched email get dumped into the UUCP queues and saw the magic happen as it dropped into a black hole and reappeared where it was supposed to. The magic of ARPANET, later to become the Internet.
I was part of the team that transitioned the Internet from host tables to the Domain Name System. I watched as we moved from vampire taps on 10Base5 to BNC bayonet connectors on 10Base2. Having to explain over and over that you can’t plug the cable into your computer; you plug the cable into a T and terminate the T. The T then connects to your computer.
The magic of 10BaseT with low-cost hubs instead of the expensive switches that “real” networks demanded.
Listening to the stories of Ethernet cards costing “only” $10K because they had bought so many of them.
Today I installed another new NIC into one of my nodes. This NIC cost me $33. The SFP+ module was another $15, call it $48 total. This gives me an MMF fiber connection, good for up to 300 meters at 10 gigabits per second.
This makes three nodes connected at 10 Gbit, one node at 2.5 Gbit, and the rest still at 1.0 Gbit. When I have finished upgrading, each node will have a 10 Gbit NIC with either an MMF LC fiber connector or a 10 Gbit RJ45 copper connector.
The only reason for the RJ45 copper is that I need to add some more SFP+ routers with extra ports.
What I Forgot
When we installed our first 100BaseT NICs, we did some testing to see what the throughput was and how it affected the host computer.
What we found was that the interrupt count went through the roof, bogging the computer down. At full speed, more than 75% of the CPU was dedicated to network traffic.
The cure for this was to increase the packet size. At the time, this was a big issue. Most networking devices only accepted 1500-byte Ethernet frames. If an incoming packet is larger than the MTU of the egress port, the packet gets fragmented, and IP fragmentation brings its own problems.
Newer equipment began to support jumbo frames. The usual size of a jumbo frame is 9000 bytes.
Once we did the upgrade, everything got faster. We actually had network-attached drives that were faster than the physically attached drives.
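If you want to verify that jumbo frames actually make it across a path, a quick test is to ping with the Don’t Fragment bit set and a payload sized for a 9000-byte MTU. Here is a small Python sketch of that check; the peer address is just a placeholder for one of your own hosts.

```python
#!/usr/bin/env python3
"""Quick check that jumbo frames survive a path end to end.

Assumes a Linux host with iputils ping; 10.0.0.42 is a placeholder
for whichever peer you want to test against.
"""
import subprocess

MTU = 9000                # target jumbo MTU
OVERHEAD = 20 + 8         # IPv4 header + ICMP header
PAYLOAD = MTU - OVERHEAD  # 8972 bytes of ICMP payload

# -M do sets the Don't Fragment bit, so an undersized link in the path
# (or a host still at MTU 1500) makes the ping fail instead of silently
# fragmenting.
result = subprocess.run(
    ["ping", "-M", "do", "-c", "3", "-s", str(PAYLOAD), "10.0.0.42"],
    capture_output=True, text=True,
)
print(result.stdout)
print("jumbo path OK" if result.returncode == 0 else "path or host not at 9000 MTU")
```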
When setting up a VPN, you need to set the packet size going into the VPN to be smaller than the MTU of the physical network. The VPN encapsulates packets before they are transmitted, which makes each packet larger. If you send 1500-byte packets through the VPN onto a physical network with an MTU of 1500, every full-size packet gets pushed over the limit by the encapsulation and is fragmented.
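To put numbers on that: the MTU you configure inside the tunnel is simply the physical MTU minus the encapsulation overhead. Here is a rough sketch; the overhead figures in the comments are typical published values for illustration, not something I measured on my own setup.

```python
def inner_mtu(physical_mtu: int, encap_overhead: int) -> int:
    """MTU to configure on the tunnel/VPN interface so that an
    encapsulated packet still fits on the physical network."""
    return physical_mtu - encap_overhead

# Typical per-packet encapsulation overheads over outer IPv4, as rough
# illustrations only -- check your own tunnel's numbers:
#   WireGuard: 20 (IP) + 8 (UDP) + 32 (WireGuard)            = 60 bytes
#   Geneve as used by OVN: 20 (IP) + 8 (UDP) + header/options ~ 58 bytes
print(inner_mtu(1500, 60))   # 1440 -- safe inner MTU on a 1500-byte network
print(inner_mtu(9000, 60))   # 8940 -- plenty of headroom once jumbo frames are on
```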
I have been slowly bringing up an OVN/Open vSwitch configuration. This allows a virtual machine or a container to move from host to host, maintaining the same IP address and routing path.
I’ve done a couple of live migrations now. The perceived downtime is less than 15 seconds. There were no dropped packets during the migration. Just amazing.
The OVN setup is complex because there are many options to configure, and there are tools that will do all of it for you. Unfortunately, the overhead of installing and learning OpenStack is something I’m not ready to take on. So I’m doing each step by hand.
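For the curious, the by-hand steps boil down to a handful of ovs-vsctl and ovn-nbctl calls per hypervisor and per VM. This is only an illustrative sketch, not my actual configuration; the addresses, switch name, and port names below are made up.

```python
#!/usr/bin/env python3
"""A minimal sketch of the by-hand OVN steps: register one chassis with
the central databases and create one logical switch port for a VM.
All names, MACs, and IPs are placeholders for illustration."""
import subprocess

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Point this hypervisor's ovn-controller at the OVN southbound DB
#    and tell it how to build Geneve tunnels to the other chassis.
run("ovs-vsctl", "set", "Open_vSwitch", ".",
    "external-ids:ovn-remote=tcp:10.0.0.1:6642",
    "external-ids:ovn-encap-type=geneve",
    "external-ids:ovn-encap-ip=10.0.0.11")

# 2. Define the logical topology in the northbound DB: one switch and
#    one port for the VM, with its MAC and IP.
run("ovn-nbctl", "ls-add", "vm-net")
run("ovn-nbctl", "lsp-add", "vm-net", "vm1-port")
run("ovn-nbctl", "lsp-set-addresses", "vm1-port",
    "52:54:00:00:00:01 192.168.100.10")

# 3. Bind the VM's tap interface on the integration bridge to that
#    logical port so OVN knows where the VM currently lives.
run("ovs-vsctl", "set", "Interface", "vnet0",
    "external-ids:iface-id=vm1-port")
```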
When my virtual machines were on the same host as the egress bridge, everything worked. If the VM was on a different host within the OVN cluster, ICMP would work, but TCP would not.
Turns out that I had not set the MTU of my physical network correctly. Small ICMP packets still fit inside the tunnel, but full-size TCP segments pushed past the physical MTU once encapsulated and were dropped. I’ve been slowly updating the networking configuration on all of my nodes to use jumbo packets. As soon as I did that, my cross-node traffic started working!
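To keep the nodes honest while I work through them, a tiny script that flags any interface still below the jumbo MTU does the job. The interface-name prefixes here are an assumption about naming; adjust them for your own hardware.

```python
#!/usr/bin/env python3
"""Flag any interface on this node that is not yet at the jumbo MTU.
The prefix filter is a guess at NIC/bridge naming -- adjust to taste."""
from pathlib import Path

TARGET_MTU = 9000
PREFIXES = ("en", "eth", "br")   # physical NICs and bridges; skips lo and taps

for dev in sorted(Path("/sys/class/net").iterdir()):
    if not dev.name.startswith(PREFIXES):
        continue
    mtu = int((dev / "mtu").read_text())
    status = "ok" if mtu >= TARGET_MTU else "NEEDS JUMBO"
    print(f"{dev.name:12} mtu={mtu:5} {status}")
```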
Happy, happy, joy, joy.
There is more testing to perform. This might also be the fix for the firewall glitch of a few weeks ago. Once I have a couple more nodes on the OVN cluster, I can proceed with designing and testing a redundant network, with failover.
It was a good day. Oh, I brought another 12 TB of disk online as well.