Nerd Babel


Bad Hardware Design

I have had good luck with picking up discarded computers, upgrading them, and making them functional members of the computer or services farm.

A computer consists of persistent storage (disk drives and SSDs), dynamic storage (memory), a processor (CPU), and I/O devices.

Data is read from disk into memory, the processor then either executes it or processes it, and the results are sent to an output device. I/O devices accept input from disks, keyboards, networks, and other devices. They also send output to video devices, networks, printers, and storage devices.

The thing that defines how a computer can be configured is the motherboard. The motherboard accepts one or more processors, one or more memory devices, and one or more I/O devices.

Some motherboards come with built-in I/O devices. For example, a motherboard will come with built-in disk controllers, sound cards, video controllers, USB controllers, PS/2 keyboard and mouse ports, serial ports, and many more. These are the connectors that you see on the back of your computer or elsewhere on the case.

Many of these controllers lead to a connector or a socket. If your motherboard has SATA disk controllers, there will be SATA connectors on the motherboard. If your motherboard has built-in video, the back will have a VGA video connector and/or an HDMI connector. It might have a DVI connector as well.

That covers most of what you find on the motherboard. The rest are the important sockets.

There will normally be expansion slots. These are where you would plug in extra I/O devices, such as network cards, disk controllers, or video cards. There will normally be memory slots. Depending on the amount of memory supported by the CPU and motherboard, this could be two, four, eight, or even more. Finally, there is normally a socket for the CPU.

For me, I have found that the cheapest way to upgrade a computer is to give it more memory. Most software is memory intensive. If you exceed the amount of memory in your machine, your machine has to make space for the program you want to run. Then it has to read into memory, from disk, the program or its data before it can continue.

The more memory, the less “paging” needs to happen.

Upgrading the CPU is another possibility. This is normally a fairly reasonable thing to do. Consider an AMD Ryzen 7 3700X, which is the CPU in one of my machines. It runs $150 on Amazon today. I purchased it for $310 a few years ago.

Today, I can upgrade from a Ryzen 7 3700X to a Ryzen 9 5950X for $350.

Buying the latest and greatest CPU is expensive. Buying second-tier, older CPUs is much more cost-effective.

The motherboard in this particular server is nearing its end of life. It has an AM4 socket, which has been replaced with the AM5 socket. This means it is unlikely that any “new” CPUs will be released for the AM4.

Bad Design

The first place I see bad computer designs is in the actual case. This is not as bad as it used to be. It used to be that opening an HP case was sure to get you sliced up. Every edge was razor sharp.

The next major “bad design” is a case and motherboard combination which is non-standard. The only motherboard that will ever fit in that case is a motherboard from that company. Likely the only place to get such a motherboard is eBay.

The next issue is when there are not enough memory slots, or worse, not enough memory addressing lines. Apple was actually famous for this.

In the old days, Apple used a 68020 class CPU. The CPU that they were using had a 32-bit address register. This is 4 Gigabytes of addressing. More than enough for the time period. Except…

Apple didn’t use all 32 bits, they only used 24 bits, leaving 8 bits unused. This gives 16 Megabytes of addressable memory. More than enough in a time period where people still remembered Billy saying “Nobody will ever need more than 640 Kilobytes of memory”.
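For anybody who wants to check the arithmetic, here is a minimal Python sketch. The function name is my own, picked for illustration, and I am counting in binary units (MiB and GiB).

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with this many address bits."""
    return 2 ** address_bits

MIB = 2 ** 20
GIB = 2 ** 30

full_32 = addressable_bytes(32)   # every bit of a 32-bit register used as address
only_24 = addressable_bytes(24)   # only 24 of the 32 bits used as address

print(full_32 // GIB, "GiB")  # 4 GiB
print(only_24 // MIB, "MiB")  # 16 MiB
```

Every address bit you give up halves the addressable memory, so dropping 8 bits costs a factor of 256.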

Apple made use of the extra 8 bits in the address register for “Handles”. Not important.

Most CPUs today use 64-bit address registers. I don’t know of a CPU that uses all 64 bits for addressing.

Which takes us to bad designs, again. Some motherboards only bring enough address lines to the memory slots to handle what is the “largest” memory card currently available. This means that you can have slots that support 16 Gigabyte DIMMs, but the motherboard only supports 4 Gigabyte DIMMs.

Often, it is worse. Cheaper motherboards will only have 2 DIMM slots. There is nothing more frustrating than having a machine with 8 GB of memory and finding out that it isn’t one 8 GB DIMM leaving room for another 8 GB, but instead two 4 GB DIMMs. Which means that when you receive that 8 GB DIMM you have 12 GB total instead of the goal of 16 GB, and you have a 4 GB DIMM that isn’t good for anything.

Sub Conclusion

If you want to be able to upgrade your computer, buy a motherboard with the latest socket design. AMD or Intel. Buy one that has enough DIMM slots to handle 4 times the amount of memory you think you are going to need. Buy a CPU that is at 1/4 to 1/3 the price of the top-tier CPU. Depending on the release date, maybe even less than that.

Make sure it has a slot for your video card AND still has one PCIe x16 slot open. You might never use it, but if you need it, you will be very frustrated that you saved yourself $10.

Source of the rant

My wife is using an employer supplied laptop for her work. All of her personal work has to be done on her phone. With the kids off to university, their old HP AIO computer is available.

The only problem is that word “OLD”. A quick online search shows that I should be able to upgrade the memory from 4 GB to 16 GB and the CPU from an old Intel to an i7 CPU. This means that I can bring this shell back to life for my wife to use.

At the same time, I intend to replace a noisy fan.

Looking online, the cost of a replacement CPU will be $25. The cost of the memory, another $25. Plus $25 for a new keyboard and mouse combination. $75 for a renewed computer. Happiness exists.

Before I order anything, I boot into my Linux “rescue/install” USB thumb drive. I run lscpu and it spits out the CPU type. Which is AMD. AMD sockets do NOT support i7 CPUs. This means that my online research does not match what my software is saying. I trust the software more than the research.

Turns out that there are two versions of this particular All In One model. One is AMD-based, the other is Intel-based. The Intel-based version has a socketed CPU. The AMD version has the CPU soldered into place. It cannot be upgraded.

These maroons have rendered this machine locked in the past. With no way to upgrade the CPU, it is too slow for today’s needs. Even with maximum memory.

Conclusion

An old computer is sometimes garbage. Put it out of your misery. Use it for target practice or take it to the dump.


Two Factor Authentication

There are two parts to access control: the first is authentication, the second is authorization.

Authentication is the process of proving you are who you claim to be.

There are three ways to prove you are who you say you are: something you know, something you have, or something about you.

When you hand your driver’s license to the police officer at a traffic stop, you are authenticating yourself. You are using two-factor authentication. The first part is that you have that particular physical license in your possession. The second is that the picture on the ID matches you.

After the officer matches you to the ID you provided, he then proceeds to authenticate the ID. Does it have all the security markings? Does the picture on the DL match the picture that his in-car computer provides to him? Does the description on the DL match the image on the card?

He will then determine if you are authorized to drive. He does this by checking with a trusted source that the ID that he holds is not suspended.

People Are Stupid

While you are brilliant, all those other people are stupid.

So consider this scenario. Somebody claims that they can read your palm and figure things out about you. “Your favorite uncle on your mother’s side of the family is Bill Jones.” You laugh and reply, “You got that wrong, James Fillmore is my favorite uncle.”

So, one of the more common security questions to recover a password is “What is your mother’s maiden name?” Do you think that the person who just guessed your favorite uncle incorrectly might do better at guessing your mother’s maiden name?

It was assumed that only you know that information. The fact is that the information is out there, it just takes a bit of digging.

The HR person at a client that I used to work for liked to announce people’s birthdays, to make them feel good.

She announced my birthday over the group chat. I went into her office and explained that she had just violated my privacy.

The next time you are at the doctor’s office, consider what they use to authenticate you. “What is your name and date of birth?”

I lie every time some website asks for my date of birth, unless it is required for official reasons.

Finally, people like to pick PINs and codes that they can remember. And they use things that match what they remember. What is a four-digit number that is easy for most people to remember? The year of their birth.

You do not want to know how many people use their year of birth for their ATM PIN.

In addition, it is easy to fool people into giving you their password. We call that phishing today. But it is the case that many people will read that their account has been compromised and rush to fix it. Often by clicking on the link in the provided e-mail.

A few years back, I was dealing with a creditor. They have a requirement to not give out information. It was a blind call asking me to authenticate myself to them. I refused. I made them give me the name of their company as well as their extension and employee number.

I then looked up the company on the web. Verified that the site had been in existence for multiple years. Verified with multiple sources what their main number was. Then called the main number and asked to be connected to the representative.

Did this properly authenticate her? Not really, but it did allow us to move forward until we had cross authenticated each other.

Biometrics

If you have watched NCIS, they have a magic gizmo on the outside of the secure room. To gain access, the cop looks into the retina scanner. The scanner compares the pattern it scans with what is on record and, if you are authorized, unlocks the door.

Older shows and movies used palm scanners or fingerprint scanners. The number of movies in which the MacGuffin is somebody taking a body part, or a whole person, to bypass biometric scanners is in the 1000s, if not higher.

So let’s say that you are using a biometric to unlock your phone. Be it a face scan or a fingerprint scan.

The bad guys (or the cops) have you and your phone. While they cannot force you to give up your password, they can certainly hold the phone up to your face to unlock it. Or forcibly use your finger to unlock it.

Biometrics are not at the point where I would trust them. Certainly, not cheap biometric scanners.

It Doesn’t Look Good

We need to protect people from themselves. We can’t trust biometrics. That leaves “something they have”.

When you go to unlock your car, you might use a key fob. Press the button and the car unlocks. That is something you have, and it is what is used to authenticate you. Your car knows that when you authenticate with your key fob, you are authorized to request that the doors be unlocked.

If you are old school, and still use a physical key to unlock your home, the lock in your door uses an inverse pattern to authenticate the key that you possess. It knows that anybody who has that key is authorized to unlock the door.

Since people might bypass the lock or make an unauthorized duplicate of your key, you might add two-factor authentication. Not only do they have to have something in their possession, they must also know the secret code for the alarm.

Two-Factor Authentication

Two-Factor authentication is about providing you with something that only you possess. You need to be able to prove that you have control of that object and that the answer cannot be replayed.

Consider you are coming back from patrol. You reach the gate and the sentry calls out “thunder”. You are supposed to reply with “dance”. You have now authenticated and can proceed.

The bad guy now walks up. The sentry calls out “thunder”. The bad guy repeats what you said, “dance”. And the bad guy now walks through the gate.

This is a “replay” attack. Any time a bad guy can repeat back something that he intercepted to gain authentication, you have weak authentication.
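One common way to defeat a replay is to make the challenge different every time and have the reply depend on both the challenge and a shared secret. Here is a minimal sketch of that idea in Python using an HMAC. The function names and the secret are my own inventions for illustration, not anything a real sentry uses.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """Sentry side: a fresh random challenge that is never reused."""
    return secrets.token_bytes(16)

def respond(shared_secret: bytes, challenge: bytes) -> bytes:
    """Patrol side: prove knowledge of the secret without saying it out loud."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()

def verify(shared_secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Sentry side: recompute the expected reply and compare in constant time."""
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

secret = b"hypothetical shared secret"

# You come back from patrol and answer the first challenge.
first_challenge = make_challenge()
overheard = respond(secret, first_challenge)
you_get_in = verify(secret, first_challenge, overheard)

# The bad guy repeats what he overheard, but the challenge has changed.
second_challenge = make_challenge()
replay_accepted = verify(secret, second_challenge, overheard)

print(you_get_in, replay_accepted)
```

The overheard reply is only good for the one challenge it answered, so repeating it later gets the bad guy nowhere.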

The first authenticator that I used was a chip on a card. It was the size of a credit card, and you were expected to carry it with you. When you tried to log in, you were prompted for a number from the card. The card had a numeric keypad. You input your PIN. The card displayed a number. That number was only good for a short time.

You entered that number as your password, and you were authenticated.

There were no magic radios. Bluetooth didn’t exist. Wi-Fi was still years in the future. And it worked even if you were 100s of miles away, logging in over a telnet session or a dial-up modem.

How?

Each card had a unique serial number and a very accurate clock. The time of day was combined with the serial number and your PIN to create a number. The computer also knew the time, accurately. When you provided the number, it could run a magic algorithm and verify that the number came from the card with that serial number.
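I don’t know the exact algorithm that card used. As a stand-in, here is a minimal sketch of the modern, published version of the same idea: a time-based one-time code in the style of RFC 6238. The per-card secret, the 30-second window, and the function names are assumptions for illustration, and the PIN step is left out.

```python
import hashlib
import hmac
import struct

def one_time_code(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Derive a short numeric code from a shared secret and the current time."""
    counter = int(unix_time // step)             # which time window we are in
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)

def server_accepts(secret: bytes, submitted: str, unix_time: float) -> bool:
    """The server knows the secret and the time, so it recomputes and compares."""
    return hmac.compare_digest(one_time_code(secret, unix_time), submitted)

card_secret = b"12345678901234567890"   # hypothetical per-card secret
code = one_time_code(card_secret, 59)

print(code)                                        # 287082
print(server_accepts(card_secret, code, 59))       # same window: accepted
print(server_accepts(card_secret, code, 59 + 30))  # next window: rejected
```

Both sides only need the shared secret and a clock that agrees, which is why no radio or network link to the card is required.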

One of the keys to computer security is that we don’t store keys in a recoverable format. Instead, we store cryptographic hashes of your password. We apply the same hash to the password/pass phrase you provided us and then compare that to the stored hash. If they match, the password is correct. There are no known methods for going from the hash to the plaintext password.
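A minimal sketch of that store-the-hash, compare-the-hash flow, using scrypt from Python’s standard library. The function names and the cost parameters are my own choices for illustration. A real system should follow current guidance for whichever password hashing function it picks.

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)   # random salt so equal passwords hash differently
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=2**14, r=8, p=1, dklen=32)
    return salt, digest

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the candidate the same way and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                               n=2**14, r=8, p=1, dklen=32)
    return hmac.compare_digest(candidate, stored)

salt, stored = hash_password("hypothetical pass phrase")
print(check_password("hypothetical pass phrase", salt, stored))
print(check_password("wrong guess", salt, stored))
```

The salt is not secret. It is stored next to the hash so the same computation can be repeated at login.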

That security card had some other features. It could be programmed to have a self-destruct PIN, or an alert PIN, or a self-destruct after too many PIN entries in a given amount of time.

When it self-destructed, it just changed an internal number, so the numbers generated would never again be correct. If the alert PIN was set up, using the generated number would inform the computer that the PIN was given under duress. The security policies would determine what happened next.

Today, we see simple two-factor authentication. “We sent a text to your phone, enter the number you received.” “We emailed the account on record, read and click on the link.”

These depend on you having control of your email account or your phone. And that nobody is capable of intercepting the SMS text.

A slightly more sophisticated method is a push alert to an app on your phone. This method requires radio communications with your phone app. The site requesting you to authenticate transmits a code to your phone app. Your phone app then gives you a code to give to the site. Thus, authenticating you.

There are other pieces of magic involved in these. It isn’t a simple number, there is a bunch of math/cryptology involved.

Another method is using your phone to replace the card described above.

I authenticate to my phone to prove I’m authorized to run the authenticator application. There is a 6-digit number I have to transcribe to the website within 10 seconds. After 10 seconds, a new number appears.

I’ve not looked into all the options available, it just works.

The cool thing about that authenticator is that it works even if all the radios in my phone are off.

Finally, there are security keys. This is what I prefer.

I need to put the key into the USB port. The key and the website exchange information. I press the button on the security key, and I’m authenticated.

Another version requires me to type a passphrase to unlock the key before it will authenticate to the remote site.

Conclusion

If you have an option, set up two-factor authentication. Be it an authenticator app on your phone or a Yubico security key. It will help protect you from stupids.


Data Security

Data security is the protection of your data throughout its lifecycle.

Let’s pretend you have a naughty image of yourself that you don’t want anybody else to see.

The most secure way of protecting that image is to have never taken that image in the first place. It is too late now.

If you put that image on a portable USB drive, then somebody can walk off with that USB drive. The protection on that image is only as good as the physical security of that device.

Dick, the kiddy diddler, who is in the special prison for the rest of his life, kept his kiddy porn on USB thumb drives. They were stored around his bed. Once the cops served their warrant, all of those USB drives were available to be examined.

They were examined. Dick was evil and stupid.

The next best way is to encrypt the image using a good encryption tool.

To put this in perspective, the old Unix crypt program implemented an improved version of the German Enigma machine. It was improved because it could encrypt/decrypt a 256 character alphabet rather than the original 26 characters.

Using the Crypt Breaker’s Workbench, a novice can crack a document encrypted with the Unix crypt command in about 30 minutes.

At the time, crypt was the only “good” encryption available at the command line. The only other was a rot-13 style obfuscation tool.

In our modern times, we have access to real cryptography. Some of it superb. We will consider using AES-256, the Advanced Encryption Standard. This is currently considered secure into the 2050s at current compute power increases.

AES-256 uses a 256-bit key. You are not going to remember a 256-bit number. That is a hex number 64 characters long. So you use something like PGP/GnuPG. PGP stands for Pretty Good Privacy.

In its simplest form, you provide a passphrase to the tool, and it converts that into a 256-bit number, which is used to encrypt the file. Now make sure you don’t forget the pass phrase and also that you delete (for real) the original image.
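To make the pass phrase to 256-bit number step concrete, here is a minimal sketch using PBKDF2 from Python’s standard library. This is a stand-in I picked for illustration. It is not the exact scheme GnuPG uses, and the function name, salt size, and iteration count are assumptions.

```python
import hashlib
import os

def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a pass phrase into a 32-byte (256-bit) key."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                               salt, iterations, dklen=32)

salt = os.urandom(16)   # stored alongside the encrypted file, not secret
key = derive_key("a hypothetical pass phrase", salt)

print(len(key) * 8, "bits")              # 256 bits
print(len(key.hex()), "hex characters")  # 64 hex characters
```

The same pass phrase and salt always produce the same key, which is what lets you decrypt later. Forget the pass phrase and the key is gone with it.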

Now, if you want to view that image, why I don’t know, you have to reverse the process. You will again have the decrypted file on your disk while you examine the image. Don’t forget to remove it when you are done looking.

We can take this to a different level by using the public key capabilities of PGP. In this process, you generate two large prime numbers. With some manipulation, these become a Public Key and a Private Key. The public key can decrypt files encrypted with the private key. The private key can decrypt files encrypted with the public key.

The computer now uses a good random number generator to create a 256-bit key. That key is used to encrypt your plaintext file. The key is then encrypted with your “Public Key” and attached to the file.

Now you can decrypt the file using your “Private Key”.

This means that your private key is now the most valuable thing. So you encrypt that with a pass phrase.

Now you need to provide the pass phrase to the PGP program to enable it to decrypt your private key, which you can then use to decrypt or encrypt files. All great stuff.

I went a step further. My PGP key requires a security fob to decrypt. This means it requires something I know, a pass phrase, plus something I have, the security fob.

This means that there are two valuable items you have, the private key and your pass phrase. Let’s just say that those need both physical and mental protection. You need to make sure that nobody can see you type in your pass phrase, plus your pass phrase has to be something you can remember, plus it has to be long enough that your fingers can’t be read as you type it.

And, don’t ever type it on a wireless keyboard. You would have to trust that nobody is intercepting the transmission from the keyboard to the computer system.

In addition to that, most keyboards are electronically noisy. This means that the electrical interference that is given off by your keyboard can be read and used to guess at key sequences.

Finally, you need to make sure that nobody has installed a keylogger to capture every key you type. These can go inside your keyboard, or just plug into the end of your USB cable.

All of this is painful to do. And you need to go through the decryption phase every time you want to look at your secret document.

So we can use disk encryption.

The idea here is similar to PGP. You generate a large block of random bits. This will be your encryption/decryption key. This block of random bits is then encrypted with a pass phrase. When you mount your disk drive, you need to “unlock” the decryption key. Once that is done, the data on that disk is accessible in plain format.

You can tell your computer to forget the key and then none of the data is available. You can unmount the file system and the data is protected. You can turn off your computer and the data is now unavailable and protected.

Of course, they might have your pass phrase, in which case they will just use it to decrypt your key.

But there is a neat thing that you can do, you can wipe the decryption key. If this is done, then even with your pass phrase, there is nothing that can be done.

The government has strict requirements on how to erase magnetic media, disk drives, magnetic tapes, and the like. For magnetic tape, they use a machine that has a strong magnetic field. This field will scramble any data on the tape if used correctly.

This is not good enough for disk drives, though. The “short” version of erasing a magnetic disk is to write all zeros, then write all ones, then write random numbers. This will make it difficult to recover the data. The longer version, “Gutmann”, requires 35 passes.

Sounds good, let’s do it on a test drive. Here is a 12 TB drive that is 75% full. The 75% doesn’t help us. We still need to erase every sector.

Our SATA 3, 6 Gbit I/O channel is not our bottleneck, it is the time to write the data. That is 210 Mbit/second. So more than five days, per pass.
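Here is that arithmetic as a minimal Python sketch, using the figures above: a 12 TB drive (decimal terabytes, my assumption) and a sustained write rate of 210 Mbit/second. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400

def seconds_per_pass(capacity_bytes: int, write_bits_per_second: float) -> float:
    """Time to overwrite every sector once at a steady write rate."""
    return capacity_bytes * 8 / write_bits_per_second

capacity = 12 * 10**12   # 12 TB, decimal
rate = 210 * 10**6       # 210 Mbit/second

one_pass_days = seconds_per_pass(capacity, rate) / SECONDS_PER_DAY
print(round(one_pass_days, 2))   # 5.29

# The three-pass "short" wipe and the 35-pass version scale linearly.
print(round(3 * one_pass_days, 1))
print(round(35 * one_pass_days, 1))
```

How full the drive is never enters the calculation, because every sector gets written either way.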

If we have encrypted the drive, we only have to wipe a few sectors. That can be done in far less than a second.

But, it gets better. You can buy “secure” drives. These drives have the encryption built in. You send a magic command to the drive, and it wipes its key and makes the entire disk just random bits, nearly instantly.

This key on disk method is what Ceph uses, under the hood.

Of course, that is only part of the solution, the next part is on the wire encryption. This requires still more.

Conclusion

The biggest issue facing people who are trying to create secure environments is that they need to make sure that they have identified who the black hat is.

  • Will they be able to physically access your equipment? Assume yes.
  • Will they be able to tap into your network? Assume yes.
  • Will they be able to physically compromise your keyboard? Maybe?
  • Will they be able to take your stuff?
  • Will they be able to force you to give your pass phrase?
  • Will they be able to access your computer without a password?
  • Will you be able to boot your network from total outage without having to visit each node?

Network Nerding

You might have heard the phrase, “He’s forgotten more than you will ever know.” When dealing with somebody who is quietly competent, that is almost always the case.

I was there at the start of the Internet. I watched our campus get X.25 networking. Later, BITNET. I watched email get dumped into the UUCP queues and saw magic happen as email dropped into a black hole and reappeared where it was supposed to. The magic of ARPANET, later to be The Internet.

I was part of the team that transitioned the Internet from host tables to the Domain Name System. I watched as we moved from vampire taps on 10Base5 to BNC bayonet connectors on 10Base2. Having to explain over and over that you can’t plug the cable into your computer, you plug the cable into a T and terminate the T. The T then connects to your computer.

The magic of 10BaseT with low-cost hubs instead of what “real” network switches cost.

Listening to the stories of Ethernet cards costing “only” 10K because they had bought so many of them.

Today I installed another new NIC into one of my nodes. This NIC cost me $33. The SFP+ module was another $15, call it $50 total. This gives me a multimode fiber (MMF) connection, good for up to 300 meters at 10 gigabits per second.

This makes three nodes connected at 10 Gbit and one node at 2.5 Gbit. The rest are still at 1.0 Gbit. When I have finished upgrading the nodes, each will have a 10 Gbit NIC. They will have either MMF LC fiber connectors or 10 Gbit RJ45 copper connectors.

The only reason for the RJ45 copper is that I need to add some more SFP+ routers with extra ports.

What I Forgot

When we installed our 100BaseT NICs, we did some testing to see what the throughput was and how it affected the host computer.

What we found was that the interrupt count went through the roof, bogging the computer down. At full speed, more than 75% of the CPU was dedicated to network traffic.

The cure for this was to increase the packet size. At the time, this was a big issue. Most networking devices only accepted 1500-byte Ethernet packets. If your input packet is larger than the MTU of the egress port, then the packet becomes fragmented. There are issues with IP fragments.

A newly introduced change in the specification allowed Jumbo packets. The normal size of a Jumbo packet is 9000 bytes.

Once we did the upgrade, everything got faster. We actually had network attached drives which were faster than the physically attached drives.

When setting up a VPN, you need to set the packet size going into the VPN to be smaller than the MTU of the physical network. The VPN will encapsulate packets before they are transmitted. This makes the packet larger. If you are sending a packet through the VPN with a size of 1500, and it is going on to a physical network with an MTU of 1500, every packet of 1500 bytes will be fragmented.
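A minimal sketch of that sizing rule in Python. The 60 bytes of encapsulation overhead is a made-up figure for illustration. The real number depends on the VPN or tunnel protocol in use, and the function names are mine.

```python
def inner_mtu(physical_mtu: int, tunnel_overhead: int) -> int:
    """Largest packet that can enter the tunnel and still fit on the wire."""
    return physical_mtu - tunnel_overhead

def will_fragment(packet_size: int, tunnel_overhead: int, physical_mtu: int) -> bool:
    """True if the encapsulated packet no longer fits in one physical frame."""
    return packet_size + tunnel_overhead > physical_mtu

OVERHEAD = 60  # hypothetical encapsulation overhead in bytes

print(will_fragment(1500, OVERHEAD, 1500))  # True: 1560 bytes on a 1500 MTU
print(inner_mtu(1500, OVERHEAD))            # 1440
print(will_fragment(1440, OVERHEAD, 1500))  # False: fits exactly
print(will_fragment(1500, OVERHEAD, 9000))  # False: jumbo frames leave room
```

There are two ways out: shrink the MTU inside the tunnel, or raise the MTU of the physical network so that full-size inner packets still fit.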

I have been slowly bringing up an OVN/Open vSwitch configuration. This allows a virtual machine or a container to move from host to host, maintaining the same IP address and routing path.

I’ve done a couple of live migrations now. The perceived downtime is less than 15 seconds. There were no dropped packets during the migration. Just amazing.

The OVN setup is complex because there are many options that need to be set up, and there are tools to do all of it for you. Unfortunately, the overhead of OpenStack and learning it is something I’m not ready to do. So I’m doing each step by hand.

When my virtual machines were on the same host as the egress bridge, everything worked. If the VM was on a different host within the OVN cluster, ICMP would work, but TCP would not.

Turns out that I had not set the MTU of my physical network correctly. I’ve been slowly updating the networking configuration on all of my nodes to use jumbo packets. As soon as I did that, my cross node networking traffic started working!

Happy, happy, joy, joy.

There is more testing to perform. This might also be a fix for the firewall glitch of a few weeks ago. Once I have a couple more nodes on the OVN cluster, I can proceed with designing and testing a redundant network design, with failover.

It was a good day. Oh, I brought another 12 TB of disk online as well.


WYSIAYG vs WYSIWYG

I started my computer career with the command line, or as it is known today, the CLI.

Almost everything I do is done via CLI.

I’ve had clients that had hosts in China, Ukraine, and London. They all look the same to me because they are just another window next to the other windows on my desktop.

When programming, my hands seldom leave the keyboard. I don’t need to use the mouse to program. It is mostly done with control key sequences.

When I need to configure something, I use a text editor and edit the configuration file. If the configuration file is JSON, I use JSON CLI tools to manipulate the file. Everything is done via the command line.

Even this post is done from “the command line.” I am typing raw HTML into a simple text editor. So an aside is written as:

<div class="aside">This is the aside</div>

Which renders as

This is the aside

The editor also has a visual editor. What we call a “What You See Is What You Get” or WYSIWYG.

In the WYSIWYG, you type, and it is formatted close to what it will look like when presented in a web browser.

You have likely used a word processor like Microsoft Word, Apple’s old MacWrite, or the modern LibreOffice. If you’ve used Google Docs, you have used an online word processor of the same kind.

The idea is that you can look at what you type in these WYSIWYG editors and that is what it will look like when printed.

We have another term for Graphical User Interfaces, WYSIAYG, or What You See Is All You Get.

What do I mean by that? Well, if you have a GUI that performs configuration options, then only the options that are implemented in the GUI are available to you through the GUI.

The new layer 3 managed switch has a web GUI. It is rather nice. You can see the status of each port. There are pleasant drop-downs to give you choices.

One of the issues I needed to deal with was to get DHCP running on it, rather than the old DHCP server.

After fumble fingering my way through the interface, I had a working configuration.

The other day, I wanted to set up network booting. I am installing Linux on so many machines, both virtual and bare-metal, that I wanted a simple method to use. Network booting seemed like the answer.

This requires setting the “next-server” and “bootfile” options in the DHCP configuration file.

There is NO place in the web GUI to do so. It is available through the CLI. Undocumented, of course.

WYSIAYG. I muddled through, I got it working. I can now do a network install anytime I want. And I can provide multiple options.

Which leads me to the root cause of this rant.

They are now building CLI tools that require CLI tools to configure them. And the CLI tools that do the configuration are not well documented because you should use the CLI management tool!

I needed to install incus on a system to configure a working OVN network! I am so frustrated right now that I could scream.

Filler

I’m exhausted. I’ve been pulling fiber for the last two days. All part of an infrastructure upgrade.

Normally, pulling cable in a modern datacenter is pretty easy. This is not a modern datacenter.

The original cable runs were CAT6 with RJ45 connectors. When the cables were installed, the installation had to be nondestructive. No holes in walls, no holes in floors. Hide the cables as best you can.

One of the cables we removed was to a defunct workstation. It had been run across the floor and then covered with a protective layer to keep it from getting cut or snagged. The outer insulation had been ripped away. There was bare copper showing. Fortunately, that particular workstation hasn’t been in place for a few years.

The backbone switch was mounted in the basement. Not a real issue. The people who pulled some of the last cable didn’t bother to put in any cable hangers. So it had loops just dangling.

There were drops that could not be identified. Those are now disconnected, and nobody has complained, so apparently nothing in use was taken offline.

I’ve found a new favorite cable organizer.

Cable Management Wire Organizer

These are reusable. They open fully and will hold many cat6 and even more fiber. They have the 3M foam double-sided tape on them. This works great against smooth, clean surfaces.

The place where they shine is that they also have a hole designed for a #6 screw. In places where there were no smooth surfaces, much less clean ones, the adhesive held them in place long enough to drive a screw.

There are no more dangling cables.

My only hope is that there are no more configuration issues with the new switch. *cough*DHCP*cough*

Networking, interrelationships

Part of the task of making a High Availability system is to make sure there is no single point of failure.

To this end, everything is supposed to be redundant.

So let’s take the office infrastructure as a starting point. We need to have multiple compute nodes and multiple data storage systems.

Every compute node needs access to the same data storage as all the other compute nodes.

We start with a small Ceph storage cluster. There are currently a total of 5 nodes in three different rooms on three different switches. Unfortunately, they are not split out evenly. We should have 9 nodes, 3 in each room.

The nodes currently break out as 15 TB, 8 TB, 24 TB, 11 TB, and 11 TB. There are two more nodes ready to go into production, each with 11 TB of storage.
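To put numbers on that, here is a quick sketch of the raw capacity now and after the two extra nodes join. The usable figure assumes 3-way replication, which is a common Ceph pool default but is my assumption here, and it ignores full-ratio headroom and uneven node sizes.

```python
# Per-node raw capacity in TB, taken from the figures above.
current_nodes_tb = [15, 8, 24, 11, 11]
pending_nodes_tb = [11, 11]

# Assumption: 3 copies of every object. The actual pool settings are not stated.
REPLICAS = 3


def raw_and_usable(nodes_tb, replicas=REPLICAS):
    """Return (raw TB, rough usable TB) for a list of node capacities."""
    raw = sum(nodes_tb)
    return raw, raw / replicas


raw_now, usable_now = raw_and_usable(current_nodes_tb)
raw_later, usable_later = raw_and_usable(current_nodes_tb + pending_nodes_tb)

print(f"now: {raw_now} TB raw, about {usable_now:.1f} TB usable")
print(f"with two more nodes: {raw_later} TB raw, about {usable_later:.1f} TB usable")
```

That works out to 69 TB raw today and 91 TB raw with the two new nodes.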

It is currently possible to power off any of the storage nodes without affecting the storage cluster. Having more nodes would make the system more redundant.

Unfortunately, today, an entire room went down. What was the failure mode?

DHCP didn’t work. All the nodes in room 3 had been moved to a new 10Gbit switch. Strictly speaking, it is a 4×2.5Gbit plus 2×10Gbit switch. The four 2.5Gbit ports were used to connect three nodes and one access point. One of the 10Gbit SFP+ ports was used as an uplink to the main switch.

When the DHCP leases expired, all four machines lost their IP addresses. This did not cause me to lose my network connection to them because they had static addresses on a VLAN.

What did happen is they lost the ability to talk to the LDAP server on the primary network. Because they had lost that primary network connection, no LDAP, no ability to log in.

The first order of repair was to reboot the primary router. This router serves as our DHCP server. This did not fix the issue.

Next I power cycled the three nodes. This did not fix the issue.

Next I replaced the switch with the old 1Gbit switch (4x1Gbit, 4x1Gbit with PoE). This brought everything back to life.

My current best guess is that the cat6 cable from room 3 to the main switch is questionable. The strain relief is absent and it feels floppy.

More equipment shows up soon. I’ll be pulling my first fiber in 25 years. The new switch will replace the current main switch. This is temporary.

There will be three small switches, one for each room. Then there will be a larger switch to replace the current main switch. The main switch will be linked with 10Gbit fiber to the switches in the three server rooms. The other long cables will continue to use copper.

Still, a lesson in testing.

The final configuration will be a 10Gbit backbone over OM4 fiber. The nodes will be upgraded to 10Gbit NICs, which will attach to the room switches via DAC cables. There will then be a 2.5Gbit copper network. The copper network will be the default network used by devices.

The 10Gbit network will be for Ceph and Swarm traffic.

I’m looking forward to having this all done.

Server room data center with rows of server racks. 3d illustration

Docker Swarm?

There is this interesting point where you realize that you own a data center.

My data center doesn’t look like that beautiful server farm in the picture, but I do have one.

I have multiple servers, each with reasonable amounts of memory. I have independent nodes, capable of performing as Ceph nodes and as Docker nodes.

Which took me to a step up from K8S.

High Availability Services

People get very upset when they go to visit Amazon, Netflix, or just their favorite gun blog and the site is down.

This happens when a site is not configured with high availability in mind.

The gist is that we do not want to have a single point of failure, anywhere in the system.

To take a simple example, you have purchased a full network connection to your local office. This means that there is no shared IP address. You have a full /24 (256 addresses, 254 of them usable for hosts) to work with.
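If you want to check the size of a block like that, Python's standard `ipaddress` module will count it for you. The network below is a documentation range standing in for whatever your provider actually assigns.

```python
import ipaddress

# 203.0.113.0/24 is a documentation range (RFC 5737), used here as a stand-in.
office_net = ipaddress.ip_network("203.0.113.0/24")

# Every address in the block, including the network and broadcast addresses.
total_addresses = office_net.num_addresses

# hosts() skips the network and broadcast addresses for a /24.
usable_hosts = len(list(office_net.hosts()))

print(f"{office_net}: {total_addresses} addresses, {usable_hosts} usable for hosts")
```

A /24 holds 256 addresses. Two are reserved for the network and broadcast addresses, which leaves 254 for routers, servers, and everything else.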

This means that there is a wire that comes into your office from your provider. This attaches to a router. The router attaches to a switch. Servers connect to the server room switch which connects to the office switch.

All good.

You are running a Windows Server on bare metal with a 3 TB drive.

Now we start to analyze failure points. What if that cable is cut?

This happened to a military installation in the 90s. They had two cables coming to the site. There was one from the south gate and another from the north gate. If one cable was cut, all the traffic could be carried by the other cable.

This was great, except that somebody wasn’t thinking when they ran the last 50 feet into the building. They ran both cables through the same conduit. And when there was some street work a year or so later, the conduit was cut, severing both cables.

The site went down.
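The same check can be done on paper, or in a few lines of Python: list what each supposedly redundant path depends on and look for anything they all share. This is a minimal sketch, and the component names are made up to match the story above.

```python
# Each path lists the physical components it depends on (illustrative names).
paths = {
    "south": ["south_gate_cable", "building_conduit"],
    "north": ["north_gate_cable", "building_conduit"],
}


def single_points_of_failure(paths):
    """Return the components that every path depends on.

    If a component appears in all paths, losing it takes down all of them,
    no matter how many paths there are.
    """
    component_sets = [set(components) for components in paths.values()]
    if not component_sets:
        return set()
    return set.intersection(*component_sets)


spof = single_points_of_failure(paths)
print(sorted(spof))
```

Two cables, two gates, and the shared conduit still shows up as the one thing both paths need.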

For Lack of (nerd post)

Oh what a tangled web we weave when first we practice to deceive, er, be a system admin

I’ve been deep into a learning curve for the last couple of months, broken by required trips to see dad before he passes.

The issue at hand is that I need to reduce our infrastructure costs. They are out of hand.

My original thought, a couple of years ago, was to move to K8S. With K8S, I would be able to deploy sites and supporting architecture with ease. One control file to rule them all.

This mostly works. I have a Helm deployment for each of the standard types of sites I deploy. Which works well for me.

The problem is how people build containers.

My old method of building out a system was to create a configuration file for an HTTP/HTTPS server that then served individual websites. I would put this on a stable OS. We would then do a major OS upgrade every four years on an OS that had a 6-year support tail for LTS (Long-Term Support) releases.

This doesn’t work for the new class of developers and software deployments.

Containers are the current answer to all our infrastructure ills.
