America's AI Action Plan Overlooks a Huge Problem: Building AI Servers Isn’t Easy
By Malathi Nayak
The White House unveiled Winning the AI Race: America’s AI Action Plan in July. It’s a blueprint to speed up data center permitting, expand skilled labor, and modernize the electric grid. But the plan sidesteps a critical issue: the difficulty of manufacturing AI servers at scale.
Current projections indicate that the deployment of fully utilized server racks in the U.S. must triple in 2025, with further growth required in subsequent years. Amid a pressing AI infrastructure crisis, server makers are already struggling to meet surging demand for fully loaded, hand-assembled server racks. There’s momentum in the U.S. tech industry and government to boost server production, yet significant technical challenges remain.
Complexity in Manufacturing
Product complexity is one issue: state-of-the-art AI racks typically contain nearly 30 individual server or switching trays, each with multiple circuit boards and hundreds of individually mated internal connections. The designs are getting denser, with more components per square inch, and as power requirements increase, complex liquid cooling systems are becoming the norm. Once a tray is assembled, its functional testing can take 8-12 hours. Only then can trays be assembled into a rack and put through an additional battery of system-level tests.
A key bottleneck is low first-pass yield (FPY), says Marcy Alstott, a managing partner at OnTap Consulting, a firm that advises manufacturers on operations and supply chain issues.
FPY is a performance metric that measures the percentage of units that can be built without repair and retest. Server FPYs are typically 60–75% but can drop to 0–20% on very complicated systems or new models.
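As a rough illustration of that definition, the Python sketch below computes FPY from line counts and estimates what retesting costs in test time. The function names are hypothetical, and the retest model is a simplifying assumption (every failing unit is repaired and passes on exactly one retest), not a description of how any server line actually operates.

```python
def first_pass_yield(units_started: int, units_passed_first_time: int) -> float:
    """Fraction of units that passed all tests without repair or retest."""
    if units_started <= 0:
        raise ValueError("units_started must be positive")
    if not 0 <= units_passed_first_time <= units_started:
        raise ValueError("units_passed_first_time must be between 0 and units_started")
    return units_passed_first_time / units_started


def expected_test_hours(fpy: float, hours_per_test: float) -> float:
    """Average test hours per unit.

    Assumption (illustrative only): each failing unit is repaired and then
    passes on a single retest, so it goes through the test exactly twice.
    """
    if not 0.0 <= fpy <= 1.0:
        raise ValueError("fpy must be between 0 and 1")
    return hours_per_test * (1.0 + (1.0 - fpy))


if __name__ == "__main__":
    # Hypothetical counts: 200 trays started, 150 passed on the first attempt.
    fpy = first_pass_yield(200, 150)
    print(f"FPY: {fpy:.0%}")
    # 8 hours is the low end of the per-tray test time cited in the article.
    print(f"Average test hours per tray: {expected_test_hours(fpy, 8.0):.1f}")
```

Under these assumptions, a line at 75% FPY with an 8-hour functional test spends an average of 10 test hours per tray. Real lines may need several repair loops, so this figure is better read as a lower bound.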
The more complicated a product is, the more likely it won’t work at the end of the assembly line, Alstott explains. She believes improvements upstream in testing and quality will help drive speed and cost-efficiency.
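One way to see why complexity hurts yield is to treat each connection or assembly step as an independent chance of a defect. The sketch below is a simplified model under that independence assumption; the function names and the example figures (300 connections, 99.9% per-connection success) are hypothetical and not taken from Alstott or Instrumental.

```python
def all_steps_yield(per_step_success: float, steps: int) -> float:
    """Probability that every one of `steps` independent steps succeeds."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


def per_step_success_needed(target_yield: float, steps: int) -> float:
    """Per-step success rate required to reach `target_yield` over `steps` steps."""
    if not 0.0 < target_yield <= 1.0:
        raise ValueError("target_yield must be in (0, 1]")
    if steps <= 0:
        raise ValueError("steps must be positive")
    return target_yield ** (1.0 / steps)


if __name__ == "__main__":
    # Hypothetical tray: 300 mated connections, each correct 99.9% of the time.
    print(f"Yield across 300 connections: {all_steps_yield(0.999, 300):.2f}")
    # Per-connection success rate needed for a 95% yield on the same tray.
    print(f"Needed per connection: {per_step_success_needed(0.95, 300):.5f}")
```

In this model, a tray with 300 connections that are each right 99.9% of the time comes out defect-free only about 74% of the time, and reaching 95% would require each connection to succeed roughly 99.98% of the time. Small per-step error rates compound quickly as the number of connections grows.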
Building in Quality
Instrumental, a manufacturing technology company whose software improves FPY on tray and server lines, has found that top-of-the-line AI servers require hundreds of visual inspections. Simpler models might require fewer. Based on the company’s work, the top five technical defects causing low FPY in AI server assembly are:
- Unplugged or loose cables. Many servers have hundreds of individually mated cables that connect power and data. Loose connections can lead to intermittent overheating and communication failures – or eventually become disconnected after leaving the assembly line, resulting in failures out in the data center.
- Damaged connector pins. Pins can be damaged during cable installation: a line operator can inadvertently damage a socket by installing a cable backwards, misaligning it, or applying too much force while positioning it.
- Thermal management issues. AI servers consume lots of power and run hot – many of the top models use liquid cooling instead of air cooling to eke out more performance. To keep the server from overheating, it’s important that conductive pads are assembled correctly, liquid cooling connections are made, and any leak sensors are aligned properly.
- Missing or damaged board components. The printed circuit boards that go inside AI servers are large, heavy, and floppy. They have lots of small components that are easy to damage during handling and assembly.
- Issues with pick-and-place caps. Plastic or metal pick-and-place caps on connectors must be manually removed. When an operator forgets to remove caps, they can be pushed down into the connector, causing damage or hindering the testing process.
These defects offer useful clues about where server makers must drive improvements, says Scott Jewler, a semiconductor industry executive who specializes in manufacturing business and operations. To meet the rising demand, server production should be “idiot-proof,” leaving no room for technical defects, Jewler explains.
The challenge is that many of today’s top-of-the-line systems are competing on performance metrics like inference speeds and power draw – design for assembly is a distant secondary concern.
To gain an advantage in the race for dominance in the AI server market, Alstott says server makers must get quality right and improve FPY. “It's been proven that if you push your failures upstream and catch them early—so you’re sure all your solder joints are solid, your process is dialed in, and you've got Six Sigma control—then the chance that it’s going to fail later is much lower,” she said.
Ideally, “you're building in quality,” according to Alstott. “That’s the way forward.”