AMD Threadripper 2 2950x - 2x SLI 2080 TI FE - Fractal Design Define C SFF Mid ATX ( Fall 2018 )
Goal: Machine Learning / Workstation / VR / Gaming Machine
Target: 120Hz @ 4K w/ Acer Predator X27 as the display
Size: <37L - Small Form Factor / SFF or Small Mid Tower
I built a top-of-the-line Fall 2018 machine. I tried not to spare any expense, but I also needed to be realistic, so I spent money where it would help my ML research and saved money where it made sense. The machine wasn’t cheap and took a long time to receive parts (after several delays), but I wanted to build an amazing machine that I could do bleeding-edge work on. More on this later.
The hardest part was fitting so much power into such a small form factor. I wanted something smaller than my previous builds; a full mid tower felt too large these days, and I really wanted something I could move around easily if needed. The completed build weighs 37.2 lbs and is just under 37L at 36.87L, for a density of about 1 lb/L. As an SFF PC, it’s no Dan Case at 7.2L, or Cerberus X at 19.5L, but this case was easier to work in and fits a 1500W PSU, two GPUs, and a crazy beefy Threadripper CPU. Between the CPU and the GPUs there is 700W+ to dissipate, and more if I overclock any of the parts.
I think moving to a Cerberus X might be possible, but you would need blower-style GPUs, and they would thermally throttle at 260W+ each. Also, forget liquid cooling them with an AIO; the Cerberus X doesn’t have the room for AIOs.
Enter the Fractal Design Define C. It’s the perfect case for this build: good airflow, enough space for multiple AIOs if needed, and it fits the Platinum 1500W PSU I chose. Did I mention that it’s super quiet too, with sound-dampening panels throughout the case and indirect air intakes?
On Oct 5th, 2018, I scored 11104 in 3DMark Time Spy Extreme, which placed me 19th in the Hall of Fame, but by today, Oct 6th, I’ve already slid to 22nd. I expect to be out of the top 100 before long as the professional overclockers take over. For now, though, it’s a fun moment to be in the top 25.
Intel vs. AMD
In previous builds I’ve used both AMD and Intel, and my last few have been Intel, but this time around AMD’s Threadripper seemed like an exceptional processor. I did plenty of research, and for me the deciding factors were the stability of the TR4 socket and the 64 PCIe lanes. A stable socket leaves options for upgrading CPUs down the road, encourages third-party support, and improves maintainability. Intel’s 7980XE is certainly a favorite on the 3DMark Hall of Fame, but it’s also 2x the cost of the AMD 2950X and has 20 fewer PCIe lanes. As the build stands today, I’m not using those lanes, but for machine learning I might need a really fast set of drives, or more GPUs, down the road. Having the full 64 PCIe lanes means I won’t have to replace as many parts when I need the extra bandwidth. It was a close call, but with AMD I got a more extensible platform and it was cheaper; win-win!
Taichi X399 (motherboard)
As this is a smaller ATX case, the motherboard was really important, and I decided to go with the Taichi X399 from ASRock. The board seemed nearly identical to the ASRock Fatal1ty X399 sans the 10 Gigabit networking card and the Killer Networking software, neither of which I need. The Taichi X399 also has an 11-phase VRM, which seemed sufficient for the TR2 and should accommodate a higher TDP if I need it later. For its size, it seemed like the only board that could one day support 4x SLI (even though I’m targeting 2x SLI right now) without being EATX.
Silverstone Tek ST1500-Ti (PSU)
1500W may seem like a lot, or even overkill, for a system that’s rated at 900W, but it’s really not. A smaller 1000W or 1100W PSU might have worked, but the largest SFX-L PSU is only 800W, and if you want any headroom for overclocking, hard drives, or additional GPUs you really need the extra wattage. Here is a list of PSUs I looked at.
The Silverstone Tek ST1500-Ti was selected for its size and Platinum efficiency rating. The Define C has a PSU depth limit of 175mm, but 180mm does fit; the extra 5mm just makes the cable area really cramped. From what I can see the ST1500-Ti is just fine, but I’ve read that some 180mm PSUs with a fan grill might not work. (You might be able to push a 200mm PSU into the Define C, but you would have to remove the drive cage, and you might not be able to mount a 3.5” HDD in the bottom fittings either.) I could have gone for a less efficient PSU, but that would have meant higher energy bills, less usable wattage, and more heat to dissipate. With electricity in SF, CA costing between $0.12/kWh and $0.44/kWh at peak, spending a bit more on an efficient PSU can really save money over, say, 3 years. Tech companies assume that a third of the total cost of a PC will be its power consumption, so it’s not abnormal to consider this at all.
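To put rough numbers on the efficiency argument, here is a back-of-the-envelope sketch. The duty cycle (6 hours/day under load), the 700W DC load, and the efficiency figures (~92% for Platinum, ~89% for a lesser unit at that load) are my assumptions for illustration; only the SF electricity rates come from above.

```python
# Rough sketch of PSU efficiency savings over 3 years.
# Assumed figures: 700W DC load, 6 hours/day under load,
# ~92% efficiency (Platinum) vs ~89% (a lesser unit),
# SF rates from $0.12 to $0.44 per kWh.

def annual_cost(dc_watts, efficiency, hours_per_day, rate_per_kwh):
    """Yearly electricity cost for a given DC load and PSU efficiency."""
    wall_watts = dc_watts / efficiency  # the PSU draws more at the wall
    kwh_per_year = wall_watts * hours_per_day * 365 / 1000
    return kwh_per_year * rate_per_kwh

load, hours = 700, 6
for rate in (0.12, 0.44):
    platinum = annual_cost(load, 0.92, hours, rate)
    lesser = annual_cost(load, 0.89, hours, rate)
    print(f"${rate}/kWh: Platinum ${platinum:.0f}/yr vs ${lesser:.0f}/yr, "
          f"3-year savings ${(lesser - platinum) * 3:.0f}")
```

The absolute savings are modest at these assumptions, but they scale directly with load hours and the local rate, which is why the ML-training duty cycle matters.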
Machine Learning, SLI, and NVLink
In the last few years it has fallen out of vogue to build an SLI machine, as application and game support has been lacking. SLI was difficult to program for in games and offered little or no advantage for applications. Some friends of mine who worked in machine learning built SLI systems only to find it hard to get libraries to run one application or model across two cards. SLI just wasn’t built for it.
NVLink might change all of that. On the Quadro RTX cards NVLink is much more powerful, with a 6x link compared to the 2x link on the 2080 Ti, but a 2x link might get me what I want without the additional expense of $6,300 to $9,000 per card. Sure, the Quadro cards might be faster or better, but I wanted to see if the 2080 Ti(s) could work first, at a fraction of the price.
On the 2080 Ti(s), NVLink provides 50GB/s x2, which is substantially faster than PCIe, drastically improves scaling performance in games, and might even let applications use both GPUs as if they were one high-performance virtual GPU. Since the 2080 Ti has had a limited release until now, only time will tell whether applications latch onto this.
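For a sense of why that bandwidth matters, here is a quick comparison against the path two GPUs would otherwise use. I'm reading the "50GB/s x2" above as 50GB/s per direction, and the ~0.985GB/s-per-lane PCIe 3.0 figure is an assumption on my part.

```python
# Back-of-the-envelope GPU-to-GPU bandwidth comparison.
# Assumed figures: 50GB/s per direction over the 2080 Ti NVLink bridge,
# ~0.985GB/s per PCIe 3.0 lane for peer copies without a bridge.
nvlink_per_direction_gbs = 50.0
pcie3_x16_gbs = 0.985 * 16  # ~15.8GB/s for a full x16 slot
ratio = nvlink_per_direction_gbs / pcie3_x16_gbs
print(f"NVLink is roughly {ratio:.1f}x a PCIe 3.0 x16 link")
```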
As for machine learning, there are a few reasons to have two cards even if you can’t virtualize them. Running your models on one card and an application or simulation on the other is a really good one. One of the most challenging parts of ML is getting good enough data to train your model. NVidia used simulation data to build their DLAA/DLSS (Deep Learning Anti-Aliasing / Deep Learning Super Sampling) and RTX ray-tracing models. If you can use high-quality simulated data, you can feed a lot more data into your models to tune them and make them smarter. NVidia is making big claims based on the performance of these models, so I can only assume this is the future of ML.
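As a concrete sketch of that model-on-one-card, simulator-on-the-other pattern, here is a hypothetical PyTorch loop. The toy model, the `simulate_batch` stand-in, and the CPU fallback are all my own assumptions for illustration, not part of this build.

```python
# Sketch: pin a training model to one GPU and a data-generating
# simulator to the other, so they don't compete for one card.
import torch

# Fall back to CPU so the sketch also runs without two GPUs.
n = torch.cuda.device_count()
train_dev = torch.device("cuda:0" if n >= 1 else "cpu")
sim_dev = torch.device("cuda:1" if n >= 2 else "cpu")

model = torch.nn.Linear(16, 1).to(train_dev)

def simulate_batch(batch_size=32):
    """Stand-in for a simulator: random features, noisy linear target."""
    x = torch.randn(batch_size, 16, device=sim_dev)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(batch_size, 1, device=sim_dev)
    return x, y

opt = torch.optim.SGD(model.parameters(), lr=0.05)
for _ in range(100):
    x, y = simulate_batch()
    # Move the simulator's output onto the training GPU
    # (this hop rides NVLink when the bridge is present).
    x, y = x.to(train_dev), y.to(train_dev)
    loss = torch.nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.3f}")
```

In a real setup the simulator would be something like a game engine or physics sim producing labeled frames, and you would overlap generation and training rather than alternating them as this toy loop does.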
The Threadripper CPU and the 2080 TIs are beasts and put off a ton of heat under load. I have an NZXT Kraken X62 280mm AIO on the CPU and the stock coolers on the Founders Edition cards. Under a light load (like writing this), the TR 2950X runs at 62C, and the two GPUs run at 38C and 45C. All of them will throttle up to 80-85C under heavy load. They just run hotter than I’m used to, and part of that is the fans trying to run quiet. I’m also wondering if I’ll have to upgrade my cooling. The Enermax LiqTech TR4 280 and LiqTech TR4 II 280 have a larger CPU block that covers the whole TR4 heat spreader, whereas the X62’s block is a cylinder that doesn’t cover the whole plate. I also think the Enermax AIO has a higher flow rate; still, I ended up getting the X62 because it looks so much better in photos :-P. I think the Enermax would be necessary for better overclocking headroom. I’m also curious whether I can fit a small blower fan over the top of the GPUs to vent their heat directly. I found a few online, but I’m not sure quite how I would mount them yet.
In general I love the new system, and I’m totally head over heels with G-SYNC 4K@120Hz on the desktop and in games. Everything is butter smooth on full settings. Applications feel snappy and lightweight. It might be 2-4 months before I have any deep learning work to show off, but I intend to replicate a few existing projects in the meantime. (I’m open to suggestions.)
I’d love to hear your comments and feedback, and I’m curious what others are doing in the 30-40L case SLI range for cooling.
I have two non-obvious customizations: a remote power control, and an upgraded WiFi module for the Taichi X399. The remote is so that my 18-month-old daughter can’t easily turn the computer on and off with the front panel.