A shorter version of this article was originally published on Forbes.com.
When the fourth generation iPod Touch shipped, I was giddy. It was the first product I had ever worked on, and the team had poured their heart and soul into making it the best product we could. I was excited to hear about how the product we designed was doing in the field, and volunteered to do early field failure analysis to help understand returns from the field and potentially make improvements to the product. For a few weeks, samplings of potential mechanical failures were shipped to my desk, and it was as fascinating as it was bizarre. I got a “sticky button” return that arrived still covered in the ketchup that likely caused the issue. I got “water damage” units with dried remnants of Sprite and Coke inside. There was one unit that looked like it had been run over by a car. As an engineer, it was awesome to see “how we did”, and more importantly, how we could do better.
This visceral experience early in my engineering career led me to believe that customer returns are a priceless commodity for companies that want to make a real impact on the bottom line, both for the current product and for future ones. Units that are returned for any reason, usually through an RMA (return merchandise authorization), accumulate into a key manufacturing metric known as TWR, or total warranty returns. TWR on its own can be a tricky number: by definition it includes all returns, including buyer’s remorse, which for a typical program can account for between 70% and 95% of the pile. In every pile of returned units, however, there is also some portion that was preventable, caused by either a software glitch or a hardware issue. That portion of the pile is worth its weight in gold. Savvy companies can act quickly and use these early field failures to course-correct a product that is already shipping, achieving margin improvements in the short term. They can also use this analysis to identify opportunities for improvement in future generations of the same or similar products. Since most R&D engineers start working on the next generation product very soon after mass production starts, the early field failure data is not only pertinent, but timely.
Instrumental recently conducted a survey of quality and engineering leaders across the electronics, medical, and appliance industries. As I reviewed the results, I was appalled that many companies lack a formalized program to learn from products that recently started shipping, unless the product has a huge flaw that is only discovered in the field. While most leaders cited TWR targets between 1.2% and 2%, when asked why they did not have formalized programs to do early field failure analysis, the main response was that their organizations weren’t evaluated on TWR as a key metric. A VP of Operations at a publicly traded consumer electronics company told us, “We get back 4% of everything we ship. It’s not due to quality because we track yields at the factory.” When we asked what percentage of the 4% were preventable defects that had escaped the factory, he said, “We don’t track that.” While 1.2% to 2% seems to be the norm, nearly every leader had a horror story. One leader shared that a particular coffee maker that is currently shipping has a replacement rate of 15%. This is astronomical, and gets at the heart of why TWR is so important: when distributed across all units, returned units can account for dollars of the margin, in a business where people are hired to negotiate out fractions of cents. A $100 coffee machine selling at 50% margins costs $50 to build, so a 15% replacement rate means that every product has $7.50 (15% of $50) worth of cost added to make up for the defective ones. Even small improvements to that number will make a big difference to the bottom line at volume. In this case, engineers were working on a fix, but in a lot of cases, engineers are immediately pulled onto the next development program, and TWR becomes a metric that a quality lead monitors and tries to push on, usually without effective tools or resources.
In our coffee maker example, even a TWR of 2%, which might be considered meeting the target, is still adding $1.00 to the cost of every unit to cover the defective ones. This is a big missed opportunity.
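The arithmetic behind both figures can be sketched in a few lines of Python. This is only a rough model: it assumes, as the example implies, that each replaced unit costs the company one unit's build cost (price times one minus margin), and it ignores shipping, support, and refurbishment. The function name and parameters are illustrative, not part of any real tool.

```python
def warranty_cost_per_unit(price, margin, replacement_rate):
    """Replacement cost spread across every unit shipped.

    Assumes each replacement costs exactly one unit's build cost
    (no shipping, support, or refurbishment), which is a simplification.
    """
    unit_cost = price * (1 - margin)
    return unit_cost * replacement_rate


# The $100 coffee maker at 50% margin, at the two rates discussed above.
for rate in (0.15, 0.02):
    cost = warranty_cost_per_unit(price=100.0, margin=0.50, replacement_rate=rate)
    print(f"{rate:.0%} replacement rate adds ${cost:.2f} per unit")
```

Plugging in a program's own price, margin, and return rate gives a quick first estimate of what TWR is costing per unit before any deeper analysis.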
In its research, Instrumental identified two critical gaps that explain the disconnect between executives and TWR. The good news is that it’s possible to fill both gaps and put best practices in place to leverage existing product data to reduce defects, increase margins, and keep customers happy.
The first gap is knowing what problems exist in the pile of returned units. Specifically, engineers need access to granular engineering data. Executives won’t spend engineering time on fixing theoretical issues after a product has shipped, and data can make those issues real. Getting accurate and granular data takes two steps. First, returned units with potential software and mechanical failures need to be separated by serial number from the buyer’s remorse units. This should be easy: just ask the customer why they are returning the product. Unfortunately, most products are sold through sales channels like Best Buy or Walmart, and not every channel can collect this data. In that case, looking at which units were replaced rather than refunded might give a clue that the customer wanted the product to work. If some portion of the product is sold direct to the consumer and communication takes place through company-operated support channels, it’s possible to get this information and to start to figure out which units are worthwhile to investigate. Getting to this point is good, but don’t stop here. In order to actually close the feedback loop, the specific failure symptoms need to be carefully and accurately categorized and reported on (this is step two). While there could be multiple potential root causes for “will not power on”, “button issue”, “no wifi”, or “enclosure damage”, having well-structured categories enables engineers to identify the biggest problems. Once there’s a Pareto chart of the top symptoms, it’s possible to capture targeted samples for engineers to analyze what went wrong. This second step isn’t that difficult, but most companies do not do it well or proactively, instead jumping into panic mode when return rates or customer sentiment reach fever pitch. Getting engineering-resolution data from returns is straightforward, but does require some thoughtful up-front effort.
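As a rough illustration of step two, the tally behind a Pareto chart can be built from categorized returns with nothing more than a counter. The records below are invented for the example (the serial numbers and counts are hypothetical; only the category names come from the text), and the `pareto` helper is a sketch rather than a prescribed method.

```python
from collections import Counter

# Hypothetical return records: (serial number, symptom category).
# The data is made up purely to show the shape of the analysis.
returns = [
    ("SN001", "will not power on"), ("SN002", "button issue"),
    ("SN003", "will not power on"), ("SN004", "no wifi"),
    ("SN005", "will not power on"), ("SN006", "button issue"),
    ("SN007", "enclosure damage"), ("SN008", "will not power on"),
    ("SN009", "no wifi"), ("SN010", "button issue"),
]


def pareto(records):
    """Return (symptom, count, cumulative share) rows, most frequent first."""
    counts = Counter(symptom for _, symptom in records)
    total = sum(counts.values())
    rows, running = [], 0
    for symptom, count in counts.most_common():
        running += count
        rows.append((symptom, count, running / total))
    return rows


for symptom, count, cumulative in pareto(returns):
    print(f"{symptom:<20} {count:>3} {cumulative:>6.0%}")
```

The cumulative column is what makes the ranking actionable: it shows how much of the preventable pile would be covered by pulling samples for only the top one or two symptoms.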
In order to get units into the right categories, engineers and customer support professionals need to work together to build detailed troubleshooting trees. These lists of questions, which start broad and get more and more specific based on how the customer answers, enable minimally trained support staff to leverage customers to correctly classify failure symptoms. Once collected, this data can then be aggregated and fed back to engineers to digest and to act on.
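One simple way to represent such a troubleshooting tree is as nested question nodes whose leaves are the failure categories. The sketch below is hypothetical throughout: the questions, answers, and category labels are invented for illustration, and a real tree would be written jointly by engineering and support.

```python
# A hypothetical troubleshooting tree. Inner nodes ask a question and map each
# answer to the next node; leaves are the failure category to record.
TREE = {
    "question": "Does the device power on?",
    "answers": {
        "no": {
            "question": "Does the charging light come on when plugged in?",
            "answers": {
                "no": "will not power on: no charge indication",
                "yes": "will not power on: charges but does not boot",
            },
        },
        "yes": {
            "question": "Do all buttons respond when pressed?",
            "answers": {
                "no": "button issue",
                "yes": "other: escalate to engineering",
            },
        },
    },
}


def classify(tree, answers):
    """Walk the tree using the customer's answers, in order, and return the category."""
    node = tree
    for answer in answers:
        if isinstance(node, str):
            break  # already reached a category; ignore any extra answers
        node = node["answers"][answer]
    if not isinstance(node, str):
        raise ValueError("more answers needed; next question: " + node["question"])
    return node


print(classify(TREE, ["no", "yes"]))
```

Because every path ends in a fixed category string, the support agent never has to choose a category directly, which keeps the resulting data consistent enough to aggregate.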
The first gap was one of information and data; the second is a gap in incentives, which, depending on the organization, can be either easy or difficult to fix. The reason many engineering leaders are nonchalant about an extra dollar when they are fighting for cents is that TWR isn’t one of their costs. Based on survey data, it is common that the cost for the replacement units and returns is charged to a group far removed from the engineering and production organizations that can actually effect change on those numbers (some respondents didn’t even know who paid). While it might be a different department, the returns still cost the company money and affect margins, customer satisfaction, and in some cases even future sales. To eliminate misaligned incentives, the best practice is to put the TWR estimate as a cost on the bill of materials (BOM) for the product. The total BOM cost is a metric that engineering, operations, and production are all responsible for hitting. When returns are charged to another department, engineering only gets involved when there is an unusually high TWR or a safety issue, but when TWR is on the BOM, engineering gets involved every time, creating the opportunity to improve margins for this product cycle and to learn something that enables a better product the next time around.
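To make the idea of putting TWR on the BOM concrete, here is a minimal sketch using the coffee maker from earlier. The part names and costs are invented and chosen only so the build cost adds up to the $50 implied by the example; the allowance is computed as build cost times the TWR estimate, which is the same simplification used in the article's arithmetic.

```python
# Hypothetical BOM for the $100 coffee maker; part names and costs are invented
# and sum to the $50 build cost implied by a 50% margin.
bom = {
    "heating element": 14.00,
    "pump and tubing": 9.50,
    "enclosure": 11.00,
    "control board": 8.50,
    "carafe and packaging": 7.00,
}


def bom_with_twr(bom, twr_estimate):
    """Return a copy of the BOM with the expected warranty cost as its own line."""
    build_cost = sum(bom.values())
    loaded = dict(bom)
    loaded["TWR allowance"] = round(build_cost * twr_estimate, 2)
    return loaded


loaded = bom_with_twr(bom, twr_estimate=0.02)
print(f"TWR allowance: ${loaded['TWR allowance']:.2f}")
print(f"Total BOM cost: ${sum(loaded.values()):.2f}")
```

Once the allowance sits alongside the heating element and the control board, reducing it becomes one more way for the team to hit its BOM target.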
Having a process to close the loop on field performance data is a clear competitive advantage for a product company: in addition to having fewer returns over time and therefore higher margins, fewer customers will have issues. Getting the data takes discipline and investment, but the return on investment for a 0.5% reduction in TWR is not only measurable, it’s likely to be huge. This is the kind of problem that really gets us fired up at Instrumental: how can we help our customers make better use of the data they already have (or can reasonably collect) to close the loop on their products and processes, so they can build better? It’s not just theoretical: supporting early field failure analysis has been a core use case for Instrumental customers for over three years. After customers have used our technology to improve their product quality and yields during development and rolled into mass production (MP), the 100% traceability record becomes a buoy against downside. When Pearl Auto discovered that a portion of their units may have been built with bad parts, they used Instrumental to reduce the size and duration of quarantine while they determined a new specification. Other customers used it to isolate production process issues, quickly react to customer complaints, and validate that issues seen in development didn’t come back. The manufacturing world is abuzz about Industry 4.0, but there’s no reason to wait to solve these problems: it’s possible to put best practices in place to learn from field failures and make improvements that result in significant financial savings for companies today.