# It’s the little things: Why 4Gbit fiber channel = 400MB/sec

I’m trying different things nowadays to see if I can’t be informative *and* fun! Well, at least informative. And maybe blog more regularly, too.

We’ll start it off with something that had been bugging me for a couple of years, for which I just found the explanation.

Computer work engenders some bizarre practices and tenets, but typically one could always rely upon the math. Since computer science departments grew out of math departments at universities, there usually isn’t a lot of wiggle room when it comes to the numbers — except for when hard drive manufacturers let their marketing departments call a 645GB hard drive a ‘750GB hard drive’. But other than the boil-brained, you could pretty much count on the fact that there are 8 bits in a byte, and that a 1 gigabit network connection would (theoretically) let you pass 1024 / 8 = 128 megabytes per second. The numbers were never perfect metric matches (like 1 meter = 100 centimeters), but they were consistent.

So it surprised me to find everyone who talked about fiber channel connection speeds saying that a 4 gigabit fiber channel connection can push 400 megabytes of data, a 2 gigabit connection 200 megabytes of data, and so on. “They must be taking a shortcut,” I thought, and didn’t give it much thought beyond some wondering as to why 4 gigabit fiber channel doesn’t pass the same amount of data that, say, theoretical 4 gigabit ethernet would.

It turns out I was partially dread-bolted clack-dish wrong, though, and discovered recently that there’s a good reason why fiber channel went metric. It’s still not as exact as I’d like, but the approximations were a bit more accurate than I’d thought.

Fiber channel uses a line code called 8b/10b encoding to carry data on the fiber (and copper, when used). It maps 8-bit symbols to 10-bit symbols in such a way that there won’t be too many 1s or too many 0s sent down the wire; they’ll roughly balance out. And much to my surprise, it’s not just because the designer had OCD and needed to make things come out even. Depending on how the signal is carried, if there are too many 1s in a row, capacitors in the circuit that might be used for filtering and such can get charged up to a point that they’d interfere with the signal. So with 8b/10b encoding, most of the 8-bit values (think ASCII table) have two possible 10-bit encodings, one with more 1s and one with more 0s. When you send an ‘A’ down the wire, for example, the transmitter picks whichever one offsets the imbalance left by the bits it has already sent.
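Here's a rough Python sketch of that balancing act. It is not a real 8b/10b encoder: it only knows about the letter ‘A’, and the two 10-bit codewords are what I believe the standard tables give for 0x41, so treat them as illustrative. The names `CODEWORDS_FOR_A`, `disparity`, and `send_repeated_a` are made up for this example.

```python
# Assumed codewords for 'A' (0x41); illustrative, not checked against the spec.
# The key is the running disparity before sending: -1 means "more 0s so far".
CODEWORDS_FOR_A = {
    -1: "0111010101",  # six 1s, four 0s: pulls the balance back up
    +1: "1000100101",  # four 1s, six 0s: pulls the balance back down
}


def disparity(bits: str) -> int:
    """Number of 1s minus number of 0s in a bit string."""
    return bits.count("1") - bits.count("0")


def send_repeated_a(count: int, running_disparity: int = -1):
    """Send 'A' `count` times, choosing the codeword that offsets the imbalance."""
    sent = []
    for _ in range(count):
        word = CODEWORDS_FOR_A[running_disparity]
        sent.append(word)
        # An unbalanced codeword flips the running disparity; a balanced one
        # (not present in this tiny table) would leave it alone.
        if disparity(word) != 0:
            running_disparity = -running_disparity
    return sent, running_disparity


if __name__ == "__main__":
    words, rd = send_repeated_a(4)
    print(words)
    print("overall 1s minus 0s:", disparity("".join(words)))
```

Sending the same letter over and over alternates between the two codewords, so the 1s and 0s on the wire even out over time.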

8b/10b encoding also provides some extra slots for control codes, like the loop initialization code used on arbitrated loops, so that you don’t have to worry about escaping real data in case your control sequence shows up in a live data transfer.

What this means is that every 8-bit byte you send down the fiber gets converted into a 10-bit symbol, so 10 bits go across the line for every 8 bits of real data you mean to send.

So when you do the math, instead of 1 gigabit fiber channel being 1024 / 8, it’s actually 1024 / 10 = 102 megabytes/sec, roughly. The line rates aren’t quite exactly on the gigabit mark, but the math is close enough to estimate. Remember that the gigabit/sec unit refers to the line rate — how many bits can go across the fiber in one second — and megabytes/sec refers to the amount of real data (8-bit data coming from or going to a disk) that can be passed in one second. So:

• 1 gigabit fiber = 1062.5 megabits/sec on the line = 106.25 megabytes/sec of real data (not 128 MB/sec)
• 2 gigabit fiber = 2125 megabits/sec on the line = 212.5 megabytes/sec of real data (not 256 MB/sec)
• 4 gigabit fiber = 4250 megabits/sec on the line = 425 megabytes/sec of real data (not 512 MB/sec)
• 8 gigabit fiber = 8500 megabits/sec on the line = 850 megabytes/sec of real data (not 1024 MB/sec)

(Per a reminder from MC: these are the theoretical top-end speeds you’ll see, and not what you’ll see in the real world. At least with ethernet, if you get 80% of the theoretical limit, you’re having a good day. Thanks!)
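If you'd rather let a script do the arithmetic, here's a minimal Python sketch. It assumes the commonly quoted fiber channel signalling rates of 1.0625, 2.125, 4.25, and 8.5 gigabaud, and decimal prefixes throughout (1 gigabaud = 1000 megabaud), so its results depend on those assumptions. The names `LINE_RATE_MBAUD` and `payload_mb_per_sec` are just made up for the example.

```python
# Assumed line rates in megabaud for each nominal speed (decimal prefixes).
LINE_RATE_MBAUD = {1: 1062.5, 2: 2125.0, 4: 4250.0, 8: 8500.0}


def payload_mb_per_sec(line_rate_mbaud: float,
                       data_bits: int = 8,
                       line_bits: int = 10) -> float:
    """Theoretical payload rate in megabytes/sec for an 8b/10b-style line code.

    Only data_bits out of every line_bits on the wire carry real data, and
    there are 8 data bits in each byte of payload.
    """
    payload_mbit = line_rate_mbaud * data_bits / line_bits
    return payload_mbit / 8


if __name__ == "__main__":
    for nominal, rate in LINE_RATE_MBAUD.items():
        print(f"{nominal} gigabit fiber: {payload_mb_per_sec(rate):.2f} MB/sec of real data")
```

With the default 8-in-10 ratio this is the same as dividing the line rate by 10, which is the whole trick. These are still theoretical ceilings, before framing overhead and everything else the real world throws at you.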

I hope this was at least a little useful, and perhaps even thought provoking. Okay, maybe not. But it’s a nugget of possibly useful information that I discovered recently, and thought I’d try to pass along.


# fiber channel (fibre channel)

One of the things I want to start posting about is fiber channel (sometimes spelled fibre channel). It’s something I was first exposed to back in 1998, and have been dabbling with off and on since then (entirely at work, since it’s pricey stuff). In the past year I’ve been using it much more, and have learned quite a bit about it.

The biggest issue I have is that whenever I google about an issue, I either run into someone who’s using it in a database application, or someone who’s using it at a very low end in a small video configuration. This is a problem because neither of these scenarios fits our situation at work, and I have to experiment to figure out the solution to a problem, and more often than not it’s a shot in the dark — but it gives me an opportunity to really learn it and with luck impart more information about it.

The three-sentence description of fiber channel is basically this. Attaching a disk to a computer happens via some kind of storage interface — usually IDE/ATA (two disks max), SCSI (7 or 15 disks max), or SATA (one disk per port, more with port multipliers). USB and FireWire don’t count, since those convert one of IDE or SATA to USB or FireWire and then attach. Fiber channel is basically the result of someone saying “Hey Beavis, let’s take the disks out of computers, put them somewhere else, and then tie all the computer-to-disk connections together!”

The result of this is you get some of the great features of networks — with the right hardware, you can attach thousands of disks to a single computer. With the right hardware, you can transfer data screamingly fast. With the right software and hardware, you can add multiple links together and increase the speed of the connection. And with the right software, you can share a common set of disks between multiple computers.

The problem is that it brings with it the bad features of disk controllers. Most operating systems will scan for disks when they start up, and whatever they find, that’s what they expect to keep. They don’t like having new disks presented to them after bootup, and they *really* don’t like losing a disk that they found at bootup. Windows is a lot worse — it actually assumes that it owns any disk it sees, and writes a little tag to the beginning of each disk it finds if it doesn’t recognize the existing disk label.

I’ve spent the past year trying to wring performance and reliability out of several different fiber channel configurations at work, for video, database, shared storage, and other configurations, and have mostly succeeded, with lots of help from colleagues and vendors. I’ve learned a lot, and so has everyone involved; and I’ve not found a lot of references to most of the things we’ve learned, so I want to try and share it.

Next post on this topic — an introduction to our fiber channel switches.