An open letter to Gigabyte, Asus, Asrock…

An open letter to motherboard manufacturers, specifically Gigabyte because they’re my favorite, and I have very little hope for Asus after their Z77 and first-wave Z87 boards…

Why do you hate the micro-ATX format so much? And why do you hate thunderbolt?

Gigabyte released three Z77 boards with thunderbolt (and zero at launch for Z87, though a couple have been promised) and the only mATX board was utter trash, and never available in the USA. Why?

(Oh the irony after my mac pro posts, right? Thunderbolt is still a great technology for home users, I just think it’s incredibly presumptuous, condescending and downright stupid for Apple to push it as the only expansion option on a “Pro” class workstation. Thunderbolt is not a viable alternative to internal pci-express expansion.)

Anyway, back to my open letter. Motherboard makers, how hard is it (be honest now!) to make a micro-ATX motherboard with the following:

  1. Decent VRM (all-digital, at least 8 true phases for CPU, preferably IR3550 60A powIRstages.) Honestly Gigabyte is already an industry leader here, just don’t skimp on the VRM because the board is mATX.
  2. Dual thunderbolt ports (falcon ridge/TB 2.0 would be ideal, but 1.0 is fine for most users today.)
  3. Usable pcie slot layout (top to bottom: x16, x1, x16, x1) with a PLX 8747 chip for the x16 slots. I’m aware that the PLX chip is overkill; it’s here to future-proof the board for upcoming/next-gen GPUs (e.g. AMD 8000 and Nvidia 800 series) in case they finally saturate pcie 3.0 x8 link speeds, in which case I’ll take the latency hit from the PLX chip. Huge bonus if you can implement auto-sensing to disable the PLX chip when only one GPU is installed, but it must be done without adding a dead/unusable pcie slot. Micro-ATX boards can’t afford to lose pcie slots to the kind of switching implementation used on the Z77X-UP7 or new Z87X-OC-Force boards.
  4. Decent quality, isolated audio design (e.g. Gigabyte AmpUp!, etc.) but please just use Realtek ALC898 or ALC1150 codecs. Seriously, go back to the implementation used on the GA-Z77X-UP5-TH; that was probably the best on-board audio ever, so add a headphone amp with an op-amp socket to it.
  5. Intel brand Ethernet/NIC only. Killer might make for great marketing to the “gamer” audience, but for 99% of users, it’s not any better than intel, and can cause compatibility problems with Linux and OSX.
  6. No extra third-party USB 3.0 controllers. Instead, use USB 3.0 hubs like you did on the GA-Z77X-UP5-TH; this again is to ensure easy compatibility with Linux, OSX, etc.
  7. No extra third-party SATA controllers, and no eSATA either (that’s why we have thunderbolt!) Six native SATA 6Gbps ports from the Z87 chipset is fine, don’t waste money on extra SATA, instead spend that on…
  8. Better fan/cooling support. I want to see only PWM fan headers, and all with independent BIOS control. Be sure to include at least 5 headers, though 6 or 7 would be ideal. This is huge for anyone that runs multiple Operating Systems since the included software fan control utilities are almost never released for Linux or OSX, and third-party tools don’t always work, leaving only BIOS control.
  9. For back panel I/O, please don’t include a VGA port; nobody should be using that anymore, so don’t encourage it. Instead, include 1x HDMI, 1x DVI, and the two thunderbolt ports. A HUGE bonus would be some kind of DisplayPort input (two ports to match the dual thunderbolt) to feed discrete GPU signals to the thunderbolt ports, but I understand this might cause problems with intel’s certification. (Apparently only apple is allowed to route discrete GPU signals over thunderbolt, yay double-standards!)
  10. Also on the back panel, please include clear CMOS, dual-bios-switch, and boot-to-bios buttons. These are handy for a lot of reasons, and when you suddenly need them, it’s a royal pain to have to open your case for board-mounted switches. This is even worse for mATX users, since the downside to most small cases is cramped internals, where the switches may be blocked by GPUs or other expansion cards, drives, etc.
  11. Please include some kind of USB BIOS flashing/updating method akin to Asus’s “USB Bios Flashback.”
  12. This one is really a personal preference (arguably with limited market) but why not include pcie x1 riser cables in the box so people can actually use the slots that would be blocked by most GPUs today?
  13. And please for the love of God, do this all on a board with a non-garish color scheme. For Gigabyte, how about an mATX board with the new GA-Z87X-UD4H color scheme?

The board doesn’t need to be cheap either, don’t skimp on features or quality to keep cost down. There’s clearly a market for high-end, feature rich mATX boards, and a board with the above features could retail in the $250+ USD price bracket, and still sell, especially if they managed to implement some of the obscure suggestions like PLX auto-sensing, or displayport input for discrete GPU signals over thunderbolt. I’d buy it, and I’m sure a lot of others would too.

Hell, put the board on Kickstarter if you really need to gauge market interest; I’ll personally pledge right now to back it up to $300 USD.

The GA-Z87X-UD4H board I mentioned above. Imagine an mATX version of that, but with dual-thunderbolt ports, better audio, better fan control/cooling support, and a PLX bridge to better future-proof the board for forthcoming next-gen GPUs in SLI/Crossfire.

EDIT: I’d even take the Asus z87 Gryphon board if their “ThunderboltEX/Dual” card wasn’t turning into vaporware (likely due to intel denying them official certification.) So come on Asus, give us a Gryphon with Dual thunderbolt ports, but don’t just trash the bottom slot completely when you route the pcie lanes to the thunderbolt controller; find a way to at least give us an x1 slot there.

And lastly, an ASRock option. I don’t care for the color, but trash the on-board mini-pcie/msata/ngff slots, add dual-thunderbolt, and find a way to keep at least an x1 slot on the bottom, and I’d take it.

Really my point is that there are plenty of companies today making highly, HIGHLY specialized micro-ATX motherboards, but they all hate thunderbolt for some reason.

new mac pro, part two

So after countless mind-boggling threads over the last two days, I’ve realized most people have no clue how important underlying technologies work, and that I left quite a bit out of my critique of the new Mac Pro, so here’s part two!

To alleviate fears of zero expansion options, Apple is going to ship the new Mac Pro with six Thunderbolt 2.0 ports, so let’s recap a little on thunderbolt. Thunderbolt is basically an external pcie interconnect, and it runs at pcie 2.0 x4 link-speeds. Thunderbolt controllers can present one or two ports, depending on the controller used, though there’s little to no technical data on how the controller is “doubling” bandwidth to present two ports; my best guess is built-in, intel-designed pcie multiplexing akin to the popular PLX 8747 chips. Thunderbolt 2.0 and 1.0 run at the same pcie link speeds, though 2.0 combines the previously separate data and display channels into a single bi-directional channel, effectively doubling bandwidth from 10Gbps (or ~1GB/second) to 20Gbps (~2GB/second). You can expect that 2.0 will require all new cables and all new peripherals, else the ports will almost certainly drop back down to 1.0 compatibility modes, or simply fail to work with 1.0 peripherals at all.

Now, an Intel socket 2011 CPU exposes 40 pcie 3.0 lanes to the host system. We have to assume that Apple is connecting both AMD GPUs via dedicated pcie 3.0 x16 links (I sure hope they are at least! EDIT: this is an important point to clarify, as many “enthusiast” or “gaming” computers will run dual GPUs at pcie 3.0 x8/x8 links, and have no issues with gaming or daily use. However, research has shown that, with AMD GCN/79XX/W9000 at least, dropping to x8 links lowers performance significantly for GPGPU/compute-heavy workloads. This would be flat out unacceptable for a “Pro” or “Workstation” computer.) That leaves us with 8 pcie 3.0 lanes to distribute to the rest of the system. Based on the reported “1250MB/second” speeds of the pcie SSD Apple is using, we have to assume that 2 pcie lanes are being routed to the SSD (a single pcie 3.0 lane can’t break 1000MB/second.) That leaves 6 pcie lanes left over for thunderbolt controllers. (I’m ignoring anything that might be exposed via the X79/C606 PCH for three reasons: first, PCH lanes operate at pcie 2.0 or 1.0 speeds; second, it seems clear Apple is leaving the PCH mostly intact with dual 1Gbe NICs, USB 3.0, on-board audio, etc. and simply disabling the SATA/SAS controllers; and third, using all native pcie 3.0 lanes, directly connected to the CPU, is the best possible scenario for performance.)

While there’s no technical data available for falcon-ridge (thunderbolt 2.0) we can assume that intel has both single-port and dual-port variants like they did with previous generations. So now we need to figure out if Apple is using 1-port or 2-port controllers for the new Mac Pro, and it’s immediately clear that Apple is most likely using dual-port controllers. That’s because we have 6 pcie 3.0 lanes (equivalent in bandwidth to 12 pcie 2.0 lanes) left over, exactly the number required to run three dual-port controllers at pcie 2.0 x4 each (based on previous generation requirements.) So we have to expect that all six thunderbolt ports will have added latency from whatever internal multiplexing intel does on dual-port controllers.
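
The lane arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model under the same assumptions as the text (two x16 GPUs, two lanes for the SSD, one pcie 2.0 x4 uplink per thunderbolt controller). The function and constant names are made up for illustration, and treating one pcie 3.0 lane as two pcie 2.0 lanes is a bandwidth equivalence, not a claim about how the lanes are physically wired.

```python
# Back-of-the-envelope pcie lane budget, using the assumptions from the text.
CPU_LANES = 40                 # pcie 3.0 lanes exposed by a socket 2011 CPU
LANES_PER_GPU = 16             # assumed: each GPU gets a dedicated x16 link
SSD_LANES = 2                  # assumed from the reported 1250MB/second SSD
GEN2_LANES_PER_CONTROLLER = 4  # assumed: pcie 2.0 x4 uplink per controller


def leftover_gen3_lanes(gpu_count=2):
    """Lanes left for thunderbolt after the GPUs and SSD take theirs."""
    return CPU_LANES - gpu_count * LANES_PER_GPU - SSD_LANES


def controllers_by_bandwidth(gen3_lanes):
    """Controllers that fit if one gen3 lane ~ two gen2 lanes of bandwidth."""
    return (gen3_lanes * 2) // GEN2_LANES_PER_CONTROLLER


lanes = leftover_gen3_lanes()
print(lanes, "lanes left,", controllers_by_bandwidth(lanes), "controllers")
```

With the defaults this gives 6 leftover lanes and 3 controllers, which is where the "three dual-port controllers" guess comes from.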

Ok, all of that out of the way, let’s actually get back to some of Apple’s new reality-distortion-field marketing! Apple claims that the Mac Pro can support three 4K displays at the same time. Ok, well, I’m going to assume they mean Ultra HD (3840 x 2160) resolution and not Cinema 4K (4096 x 2160). Even so, the bandwidth required to drive a single Ultra HD display, at only 60Hz, is ~1.5GB/second. In other words, if you want to use a 4K monitor with the new Mac Pro, you won’t want to daisy chain any device between the computer and the display, else you might starve the display of the bandwidth it needs, especially since there’s already added latency from the dual-port controller.

EDIT: To add to the above, this assumes that the “4k” displays are all thunderbolt 2.0 compliant, since thunderbolt 1.0 lacks the bandwidth necessary to push 4k @ 60hz.
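
For anyone who wants to check the ~1.5GB/second figure, here is a rough sketch of the math. It assumes 24 bits per pixel and ignores blanking intervals and protocol overhead (so a real link needs a bit more). `display_gb_per_s` is just an illustrative helper name, and GB here means 10^9 bytes.

```python
def display_gb_per_s(width, height, refresh_hz, bits_per_pixel=24):
    """Raw pixel data rate in GB/second (10^9 bytes), no blanking or overhead."""
    bits_per_second = width * height * refresh_hz * bits_per_pixel
    return bits_per_second / 8 / 1e9


resolutions = {"Ultra HD": (3840, 2160), "Cinema 4K": (4096, 2160)}
for name, (w, h) in resolutions.items():
    print(f"{name} @ 60Hz: {display_gb_per_s(w, h, 60):.2f} GB/second")
```

Both results land above the ~1GB/second of thunderbolt 1.0 and below the ~2GB/second of thunderbolt 2.0.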

Now if we take that “three 4K displays” claim a bit further, we see that of the six thunderbolt ports, we now have only three left for actual expansion. So far this seems reasonable enough, though it’s questionable if even dual FirePro W9000 series GPUs could truly drive three 4K displays without problems, especially while trying to perform GPU heavy workloads concurrently.

Next, assume that you need a thunderbolt video capture unit; the best I can find is the Blackmagic Design UltraStudio 4K. This is also only thunderbolt 1.0, and thus another device you won’t want to daisy-chain, since uncompressed Ultra HD @ 24p is ~900MB/second (Cinema 4K @ 24p is ~955MB/second) and thunderbolt 1.0 has max transfer speeds of 1GB/second. That takes us down to 2 ports. Lastly, let’s assume that you need 10Gbe network connectivity to deal with all the uncompressed 4K footage you’re editing that you can’t store locally. This is another thunderbolt 1.0 device, and thus another you can’t daisy-chain, since a 10Gbe adapter requires all the bandwidth of a thunderbolt 1.0 port. And that leaves us with a single thunderbolt port for all other connectivity (better hope you don’t have legacy firewire 800 drives with your music, photos, etc., else there goes your last port for an adapter, or you need new USB 3.0 or thunderbolt drives.)
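
A quick sketch of why the capture unit leaves no room for a daisy-chained neighbor. It assumes 12-bit 4-4-4 color (36 bits per pixel) for the uncompressed stream and uses the ~1GB/second thunderbolt 1.0 figure from above; the names are illustrative only.

```python
TB1_GB_PER_S = 1.0  # approximate thunderbolt 1.0 throughput used in the text


def stream_gb_per_s(width, height, fps, bits_per_pixel=36):
    """Uncompressed video data rate in GB/second (10^9 bytes)."""
    return width * height * fps * bits_per_pixel / 8 / 1e9


def link_share(rate_gb_per_s, link_gb_per_s=TB1_GB_PER_S):
    """Fraction of the link a single device would consume."""
    return rate_gb_per_s / link_gb_per_s


uhd = stream_gb_per_s(3840, 2160, 24)
c4k = stream_gb_per_s(4096, 2160, 24)
print(f"Ultra HD 24p: {uhd * 1000:.0f} MB/second, {link_share(uhd):.0%} of the link")
print(f"Cinema 4K 24p: {c4k * 1000:.0f} MB/second, {link_share(c4k):.0%} of the link")
```

Either stream eats roughly nine tenths or more of the port by itself, before any other device on the chain asks for bandwidth.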

Now, here’s the really interesting part. Go back to the fact that all six thunderbolt ports most likely run over only six shared pcie lanes. Assume all 5 of those ports in use are operating at near capacity: that’s 1.5GB/second for each display (so 4.5GB/second total), another 1GB/second for the capture card, and another 1GB/second for the 10Gbe NIC. That’s already 6.5GB/second. Now assume you actually have all 2.0 compliant peripherals, and all six ports can operate at the full 2GB/second; that would be 12GB/second, all running over 6 pcie 3.0 lanes, with a hard bandwidth limit of ~6GB/second. Yep, absolutely no way of avoiding that latency and potential bottlenecking, both very valid (and worrying!) concerns for would-be workstation class computers.
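
To put numbers on the oversubscription, here is a rough sketch. It assumes ~0.985GB/second of usable bandwidth per pcie 3.0 lane (8GT/s with 128b/130b encoding, before protocol overhead) and the per-device figures quoted above; the device names in the dictionaries are illustrative labels.

```python
PCIE3_LANE_GB_PER_S = 8 * (128 / 130) / 8  # ~0.985 GB/second per lane
SHARED_LANES = 6                           # assumed lanes feeding all six ports

capacity = SHARED_LANES * PCIE3_LANE_GB_PER_S

# Scenario from the text: three displays, one capture unit, one 10Gbe adapter.
mixed_load = {"display 1": 1.5, "display 2": 1.5, "display 3": 1.5,
              "capture": 1.0, "10Gbe": 1.0}
# Scenario with six thunderbolt 2.0 devices, each at the full ~2GB/second.
all_tb2_load = {f"port {n}": 2.0 for n in range(1, 7)}


def oversubscription(load, capacity_gb_per_s=capacity):
    """Requested bandwidth divided by available bandwidth (>1 means bottleneck)."""
    return sum(load.values()) / capacity_gb_per_s


for label, load in (("mixed", mixed_load), ("all 2.0", all_tb2_load)):
    print(f"{label}: {sum(load.values()):.1f} GB/second requested, "
          f"{oversubscription(load):.2f}x the ~{capacity:.1f} GB/second available")
```

Anything above 1.0x means the ports are asking for more than the shared lanes can deliver.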

Anyway, I just hope this causes someone to see through Apple’s reality-distortion-marketing and look at the Mac Pro objectively. It’s an “epeen” toy for rich people, or people that have absolutely no choice but to use Apple hardware due to contractual obligations, corporate investments, etc.

the new mac pro…

So yesterday was the opening day (and keynote presentation) of Apple’s “WWDC” 2013. Among other things (like trying to spin OSX playing catch-up to Windows as new features; e.g. GPU scaling, functional multi-monitor support, and new low-level CPU optimizations) they offered a sneak peek at the forthcoming Mac Pro refresh.

I suppose I shouldn’t have expected much, this is the “New Apple” after all, but it seems like they’re just ignoring the “Pro” market entirely.

The new Mac Pro:

Now, I can appreciate the technical feat of cramming an intel socket 2011 CPU and dual GPUs into such a small form factor (9.9″ tall x 6″ diameter) but I’m at a loss trying to find a market for this other than “People with more money than sense.”

First, the entire system looks to be 100% custom PCB designs, so you will never be able to upgrade your CPU or GPUs; you might not even be able to upgrade the SSD if Apple pulls the same crap they’ve pulled before and designs their own connector instead of using the M.2/NGFF standard. And don’t forget that there are zero expansion slots for any auxiliary pcie devices like raid controllers, network cards, video capture cards, audio mixing units, etc. And well, you better hope your apps/workflow don’t rely on Nvidia CUDA tech, because I seriously doubt that Apple convinced both AMD and Nvidia to bend over backwards and design custom boards for a single new computer with limited market potential, and quite frankly I don’t think Apple has the technical expertise/experience or the desire/motivation to design custom GPU boards from scratch. And lastly the CPU. I applaud Apple for finally moving to the intel Socket 2011 platform, but I’m dumbfounded by their choice to use a 12-core chip instead of two 6-core or 8-core chips. Anyone that understands CPUs knows that in order to ensure those 12 cores on the same die don’t fry each other, the chip will run at very low clock speeds, likely 1.8GHz-2.4GHz per core, whereas a dual processor system with two 6/8-core chips could see 12/16 cores (24/32 threads) all running at 3.1-3.8GHz. (See the intel Xeon E5-2687W for reference.)

So who is the system for then? Audio engineers? Why would they need dual FirePro GPUs? Mechanical engineers? Well, maybe for any apps that don’t rely on CUDA. Maybe 3D artists/designers/animators? Again, maybe for apps that don’t rely on CUDA (no 3ds Max, Blender etc.) Maybe professional video editors? Apple has pushed the Mac Pro for the editing market for a long time, so let’s see!

Straight away we see that for professional editing the machine is flat out lacking the storage necessary to work on uncompressed video. Let’s look at some common video formats and how much disk space is required to store 5 minutes of footage in each format:

  • 1080p/24p (1920×1080, 24fps, 4-4-4, 12-bit color) – 63GiB
  • 1080p/60p (1920×1080, 60fps, 4-4-4, 12-bit color) – 156GiB
  • 1080p/3D (1920×1080, 72fps, 4-4-4, 12-bit color) – 188GiB
  • Ultra HD (3840×2160, 24fps, 4-4-4, 12-bit color) – 250GiB
  • Cinema 4K (4096×2160, 24fps, 4-4-4, 12-bit color) – 267GiB
  • RED ONE (4520×2540, 60fps, RAW bayer mask color) – 289GiB
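
These sizes are easy to sanity-check. The sketch below assumes 36 bits per pixel for 12-bit 4-4-4 footage and, for the RED ONE row, 12 bits per photosite of uncompressed Bayer data (an assumption, since the list doesn’t state a bit depth). `footage_gib` is an illustrative helper, and the result is in GiB (2^30 bytes).

```python
def footage_gib(width, height, fps, bits_per_pixel, seconds=300):
    """Uncompressed size of a clip in GiB (2**30 bytes); default is 5 minutes."""
    total_bytes = width * height * bits_per_pixel / 8 * fps * seconds
    return total_bytes / 2**30


formats = {
    "Ultra HD 24p": (3840, 2160, 24, 36),
    "Cinema 4K 24p": (4096, 2160, 24, 36),
    "RED ONE 60p (assumed 12-bit raw)": (4520, 2540, 60, 12),
}
for name, params in formats.items():
    print(f"{name}: {footage_gib(*params):.0f} GiB per 5 minutes")
```

Swapping in other resolutions, frame rates, or clip lengths is just a matter of changing the arguments.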

Assuming that Apple allows up to ~1TB of SSD storage in the new Mac Pro, that’s ~931GiB (GiB = gibibyte, the unit used to represent base-2 computer storage, vs. the base-10 units used in most marketing.) Let’s assume that ~70GiB goes to the OS, apps, library files, temporary files, etc., and that leaves ~860GiB of “scratch” space to work on files.

That’s only ~17 minutes of uncompressed 4K/24p video before the disk is completely full, and that ignores the fact that a full SSD is considerably slower than its rated speeds (a general guideline is to always leave ~20% free space on your SSD to ensure consistent performance), which will in turn negatively impact overall system performance.
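
The “minutes until full” math is simple enough to sketch. The helper below takes the scratch space and the size of a 5-minute clip as inputs, and optionally keeps a fraction of the drive free per the ~20% guideline. The name `minutes_of_footage` and the `keep_free` parameter are illustrative, not from any real tool.

```python
def minutes_of_footage(scratch_gib, gib_per_5_min, keep_free=0.0):
    """Minutes of footage that fit in the scratch space.

    keep_free is the fraction of the scratch space left empty (e.g. 0.2).
    """
    usable_gib = scratch_gib * (1.0 - keep_free)
    return usable_gib / (gib_per_5_min / 5.0)


# Ultra HD at 250GiB per 5 minutes, for a few hypothetical scratch sizes.
for scratch in (500, 750, 1000):
    full = minutes_of_footage(scratch, 250)
    safe = minutes_of_footage(scratch, 250, keep_free=0.2)
    print(f"{scratch}GiB scratch: {full:.0f} min to full, "
          f"{safe:.0f} min keeping 20% free")
```

Honoring the 20% guideline cuts the usable recording time by a fifth, whatever the drive size.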

I know what you’re going to say now though, that’s where thunderbolt comes in!

Well, yes and no. It’s true that thunderbolt is almost an external pcie interface, but each port (even version 2.0) is limited to the bandwidth of a single pcie 2.0 x4 link, meaning a real-world throughput of ~2GB/second. Compare that with an internal pcie 3.0 x8 link (the interface used by most high-end RAID controllers) at ~8GB/second; four times the bandwidth of thunderbolt 2.0. So for truly high-speed storage (like you’d want and need to comfortably edit 4K video) thunderbolt just can’t match a local/internal RAID array.
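
The link speeds quoted here follow from the per-lane signalling rates. As a sketch (theoretical one-direction figures before protocol overhead, so real-world throughput is somewhat lower; the table and function names are illustrative):

```python
# (transfer rate in GT/s, encoding efficiency) per pcie generation
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_gb_per_s(generation, lanes):
    """Theoretical one-direction bandwidth of a pcie link in GB/second."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    return gigatransfers * efficiency / 8 * lanes


tb_uplink = link_gb_per_s(2, 4)   # the link a thunderbolt port sits behind
raid_link = link_gb_per_s(3, 8)   # typical high-end RAID controller slot
print(f"pcie 2.0 x4: {tb_uplink:.2f} GB/second")
print(f"pcie 3.0 x8: {raid_link:.2f} GB/second ({raid_link / tb_uplink:.1f}x)")
```

The ratio comes out just under four, which matches the "four times the bandwidth" comparison.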

And let’s not forget that thunderbolt 2.0 peripherals aren’t even close to market yet, and when they’re released, you can expect a disk-less RAID array to be ~$1500-2000+ for the first couple of years. Ok, so how about using a network storage solution? Surely $2000 could build a capable, high-performance NAS/file-server, even with a 10Gbe connection. True, it most certainly could, except there’s no internal expansion in the new Mac Pro, and it doesn’t have a native 10Gbe connection. And 1Gbe can’t handle the bandwidth to do real-time editing of even 1080p uncompressed video. So back to thunderbolt: yes, it can handle a 10Gbe card from a bandwidth standpoint, but the cheapest one I can find is $1000 (compare with internal 10Gbe cards as low as $275 now), and being an external breakout box, it will need its own desk space, cable, etc. And of course, since it’s only designed for thunderbolt 1.0, you won’t want to daisy-chain the 10Gbe adapter, because you don’t want the added latency, or potential bottlenecking from sharing bandwidth.

And lastly, this all ignores the fact that there are very few thunderbolt capture cards on the market, even fewer that support 4K, and those that do (for professional level work) need an external rack to mount them, limiting any semblance of portability you may have gained with your tiny new Mac Pro. Wow, this new setup is so much better than dropping a couple pcie x1/x4/x8 cards into expansion slots.

So I just don’t see a market for this computer, I truly don’t. It will be another toy for rich people to flaunt their money, but it won’t find traction in real professional environments.