Building a rig for AI workloads requires a different approach than traditional mining setups. While mining rigs could be optimized with basic hardware and minimal power management, AI systems demand robust CPUs, ample memory, and high-speed connectivity to maximize GPU performance. The key to success lies in carefully selecting components that ensure scalability and efficiency, particularly in high-performance, multi-GPU configurations.
Differences
When building a mining rig, the requirements were fairly straightforward. High-end CPUs, large amounts of memory, or extensive storage weren't necessary. All you needed was enough power, a basic system that could boot, and the ability to connect multiple GPUs via PCIe Gen 1/2/3 x1 slots. However, building a system for AI workloads involves entirely different considerations.
For AI tasks, you need a powerful CPU with enough PCIe lanes to support multiple GPUs running at PCIe Gen 4 x16 speeds. This demand often leads builders to server-grade hardware, with AMD EPYC processors being a popular choice. AMD EPYC CPUs typically offer 128 PCIe lanes, enough for configurations of up to eight GPUs, each operating at full PCIe Gen 4 (or even Gen 5) x16 bandwidth (8 × 16 = 128 lanes). Once GPU connectivity is addressed, select the most powerful CPU your budget allows to ensure optimal performance.
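The lane budget works out exactly, as a quick check shows. This is a minimal sketch assuming all 128 lanes go to GPUs; real boards route some lanes to NVMe storage and onboard NICs, so treat the numbers as illustrative:

```python
# PCIe lane budget for a hypothetical single-CPU EPYC build.
# Assumption: all 128 lanes are available to GPUs; in practice
# some lanes are reserved for NVMe drives and networking.

lanes_total = 128    # lanes exposed by one EPYC CPU
lanes_per_gpu = 16   # full Gen 4/5 x16 link per GPU

max_gpus = lanes_total // lanes_per_gpu
print(max_gpus)  # 8
```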
When it comes to memory, a general rule of thumb is to have at least twice the system memory as the combined VRAM of your GPUs, though more memory is always beneficial. For storage, a similar rule applies: allocate roughly 1TB of storage per GPU, but again, increasing this capacity can improve flexibility and performance in many scenarios.
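These rules of thumb translate into a quick sizing calculation. The sketch below is illustrative only; the GPU count and VRAM figure are example values, not recommendations:

```python
# Rough sizing based on the rules of thumb above:
# system RAM >= 2x combined GPU VRAM, and ~1 TB of storage per GPU.

def rig_requirements(gpu_count: int, vram_per_gpu_gb: int):
    """Return (minimum RAM in GB, minimum storage in TB)."""
    total_vram_gb = gpu_count * vram_per_gpu_gb
    min_ram_gb = 2 * total_vram_gb     # twice the combined VRAM
    min_storage_tb = gpu_count * 1     # ~1 TB per GPU
    return min_ram_gb, min_storage_tb

# Example: eight GPUs with 24 GB of VRAM each
ram_gb, storage_tb = rig_requirements(8, 24)
print(ram_gb, storage_tb)  # 384 8
```

Treat these as floors, not targets; as noted above, more memory and storage improve flexibility.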
You'll also need to design the power and wiring setup of your rig with the GPUs' full TDP (Thermal Design Power) in mind. Unlike mining, where overclocking and power limiting are common practices to optimize efficiency, AI workloads demand maximum performance. If you plan to rent out the rig for AI tasks, you'll need to ensure it can operate at full capacity without power restrictions, as any throttling could impact the quality of service. This means investing in a robust power supply and properly rated wiring to handle the increased load safely and reliably.
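To size the power supply and wiring, budget for every GPU at full TDP plus the rest of the system, with headroom on top. A rough sketch, where the TDP, overhead, and headroom figures are assumptions to replace with your own hardware's numbers:

```python
# Power budget sketch: GPUs at full TDP plus CPU/system overhead,
# with extra PSU headroom. All figures are illustrative assumptions.

def psu_watts(gpu_count, gpu_tdp_w, system_overhead_w=600, headroom=1.2):
    """Estimate required PSU capacity in watts.

    system_overhead_w covers CPUs, RAM, drives, and fans (assumed);
    headroom adds a 20% safety margin for sustained full load.
    """
    load_w = gpu_count * gpu_tdp_w + system_overhead_w
    return load_w * headroom

# Example: eight 350 W GPUs -> roughly 4.1 kW of PSU capacity
print(psu_watts(8, 350))
```

Whatever figure you land on, make sure the wall circuit and cabling are rated for it as well, not just the PSU.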
Hardware
We strongly recommend investing in proper hardware from the beginning rather than trying to cut costs and risking regret later. Building a system capable of supporting eight GPUs with full PCIe x16 lanes requires careful selection of components, particularly the motherboard.
For a PCIe Gen 4 system, we suggest the ASRock Rack ROME2D32GM-2T, which offers the necessary bandwidth and connectivity for high-performance AI workloads. If you're aiming for a PCIe Gen 5 setup to future-proof your system, the ASRock Rack GENOA2D24G-2L is an excellent choice, providing the latest standards in PCIe speed and compatibility. Both motherboards are well-suited to handle the demands of multi-GPU configurations with server-grade reliability.
ROME2D32GM-2T
The ASRock Rack ROME2D32GM-2T is an excellent choice for PCIe Gen 4 systems. It features dual CPU sockets compatible with AMD EPYC 2nd and 3rd generation processors (7002 and 7003 series), providing robust performance and scalability. With support for up to 32 DIMM slots, this motherboard can accommodate all the memory you'll need for demanding AI workloads.
For GPU connections, the motherboard uses SlimSAS ports, with two ports required per GPU to achieve full PCIe x16 bandwidth. If you use only one port, the bandwidth will be limited to PCIe x8. Based on our experience, we highly recommend using 10Gtek cables and C-Payne device adapters, which have proven reliable in multi-GPU setups.
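The cabling adds up quickly in an eight-GPU build. A quick tally, assuming one device adapter per GPU (check your adapter model's actual port count):

```python
# SlimSAS cabling tally for a hypothetical 8-GPU build on this board.
# Two ports per GPU give a full x16 link; one port drops to x8.

gpus = 8
cables_per_gpu = 2     # two SlimSAS cables per GPU for x16
adapters_per_gpu = 1   # assumption: one device adapter per GPU

cables = gpus * cables_per_gpu
adapters = gpus * adapters_per_gpu
print(cables, adapters)  # 16 8
```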
Additionally, the board comes with dual 10 Gbps NICs, a valuable feature for high-speed networking and data transfer in server-grade systems. This combination of features makes the ROME2D32GM-2T a solid and versatile choice for AI-focused builds.
GENOA2D24G-2L
The ASRock Rack GENOA2D24G-2L represents the next evolution, moving from a Gen 4 to a Gen 5-based system. This motherboard also features dual CPU sockets but upgrades to the SP5 socket, supporting AMD EPYC 4th and 5th generation processors (9004 and 9005 series), delivering cutting-edge performance for demanding AI workloads.
With 24 DIMM slots, it provides ample memory capacity, but it's essential to consider the cost implications of DDR5 memory, which is significantly more expensive than DDR4.
For GPU connectivity, the GENOA2D24G-2L transitions from SlimSAS to MCIO connections. As with SlimSAS, achieving full PCIe x16 bandwidth requires two cables per GPU, while a single cable operates at PCIe x8. Again, we recommend 10Gtek cables and C-Payne device adapters, which have been highly reliable in multi-GPU setups.
One drawback of this motherboard is its networking capabilities. Unlike its predecessor, it only comes with onboard 1 Gbps NICs. If you require higher-speed networking, you'll need to invest in external NICs to achieve the desired performance. Despite this limitation, the GENOA2D24G-2L is an excellent choice for building a future-ready Gen 5 system.
Assembly
Conclusion
Designing an AI rig involves balancing performance, scalability, and future-proofing. Investing in server-grade motherboards like the ASRock Rack ROME2D32GM-2T for PCIe Gen 4 systems or the GENOA2D24G-2L for PCIe Gen 5 ensures the connectivity and processing power required for demanding workloads. Proper planning around memory, storage, and power ensures the system delivers consistent, full-capacity performance, making it a reliable investment for AI applications.
AI was used to help create this content.
Written by Marius L
The creator and owner of Hashrate.no goes by the alias r0ver2 and has long experience with GPU mining and mining in general. After starting with home mining in 2017 and slowly building up the operation while gaining experience and knowledge, he joined SimpleMining's support team in 2020. He has also been an active supporter of mmpOS since 2021 and part of the testing team for lolMiner since mid-2021.