Update Your Product Reliability Targets
Following up on my column on Durability, Derating, and Circularity, I spoke with Fred Schenkelberg, an expert reliability consultant I have known and respected for many years. I wanted his insights and advice for engineers who will be tasked with designing products that must last longer than earlier designs were intended to. We focused on derating in particular, because I believe it is an important and often unimplemented process in electronic product design. But it’s only part of the solution.
Fred had recently produced a webinar on “Deliberate Reliability Testing” and made the point that just because your customer wants to see 1,000 hours of life test with zero failures, that should not be the end of what you do as a reliability engineer: “if you’re not seeing failures, you’re not learning anything,” he said. This is true: zero failures means you don’t know how close the product is to the wear-out part of the bathtub curve, or what other issues may yet be uncovered. This becomes increasingly important as your customer, and your customer after that customer, continues to use the product long beyond whatever that 1,000 hours may represent in terms of real-life use.
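To put a number on what a zero-failure test does and does not tell you, here is a minimal Python sketch. It assumes a constant failure rate (the flat part of the bathtub curve) and ignores any acceleration factor, which is exactly the kind of assumption that hides wear-out. The sample size, test duration, and confidence level are hypothetical values chosen for illustration, not figures from Fred's webinar.

```python
import math


def zero_failure_rate_bound(units: int, hours: float, confidence: float) -> float:
    """Upper confidence bound on a constant failure rate (failures per hour)
    after a test in which `units` parts each ran `hours` with zero failures.
    Assumes exponentially distributed lifetimes (no wear-out)."""
    device_hours = units * hours
    return -math.log(1.0 - confidence) / device_hours


def demonstrated_reliability(units: int, confidence: float) -> float:
    """Success-run bound: the lowest reliability, at the test duration,
    that is consistent with `units` parts all surviving."""
    return (1.0 - confidence) ** (1.0 / units)


if __name__ == "__main__":
    # Hypothetical test plan: 77 units, 1,000 hours, 60% confidence.
    units, hours, confidence = 77, 1000.0, 0.60
    lam = zero_failure_rate_bound(units, hours, confidence)
    rel = demonstrated_reliability(units, confidence)
    print(f"Failure-rate upper bound: {lam * 1e9:.0f} FIT")
    print(f"Demonstrated reliability at {hours:.0f} h: {rel:.4f}")
```

Whatever numbers come out, they only bound the failure rate over the hours actually tested. They say nothing about when wear-out begins, which is the information a longer-lived product needs.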
When I started as a quality/reliability engineer (“QRE”) at Intel several decades ago, we ran reliability tests on product prototypes from the corners of the semiconductor manufacturing process in order to validate that the specific design did not inadvertently violate design rules and to confirm its robustness. Since then, robustness has been designed into the semiconductor process itself and is validated at that level, while design rule checking is baked into EDA tools so you’re less likely to ship an unreliable product that is based on an otherwise reliable semiconductor process. However, use-life assumptions are still made at the process level and implemented in the design rules.
For instance, while maximum current density is specified for a given trace in a given conductor layer within an integrated circuit (IC), the calculation that led to determining what that maximum value should be is based on, among other things, assumptions about the number of active use hours the IC containing that trace would see. If that assumption is now too low due to new use models driven by product reuse, then perhaps the IC will not be appropriate for your application or the application will have to run the IC at a lower voltage and slower speed to meet the required failure rate.
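One commonly used way to model this trade-off is Black's equation for electromigration, MTTF = A · J^(-n) · exp(Ea / kT), where J is current density and T is absolute temperature. I am not claiming any particular manufacturer uses this model or these numbers; the sketch below simply shows, under assumed values (current-density exponent n = 2, hypothetical 40,000-hour and 100,000-hour use lives), how a longer use-life assumption pushes the allowable current density down.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mttf(a: float, j: float, n: float, ea_ev: float, temp_k: float) -> float:
    """Black's equation: median time to failure for electromigration.
    `a` is a process-dependent constant, `j` the current density,
    `n` the current-density exponent, `ea_ev` the activation energy in eV."""
    return a * j ** (-n) * math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


def scaled_current_density(j_old: float, life_old_h: float,
                           life_new_h: float, n: float = 2.0) -> float:
    """Current density that yields `life_new_h` instead of `life_old_h`
    at the same temperature and failure criterion (from MTTF ~ J^-n)."""
    return j_old * (life_old_h / life_new_h) ** (1.0 / n)


if __name__ == "__main__":
    # Hypothetical: the design rule assumed 40,000 active hours,
    # but reuse extends the requirement to 100,000 active hours.
    j_limit = 1.0  # normalized to the original maximum current density
    j_new = scaled_current_density(j_limit, 40_000, 100_000, n=2.0)
    print(f"Allowable current density falls to {j_new:.2f}x the original limit")
```

In practice the lever available to the system designer is usually operating voltage, frequency, or temperature rather than current density directly, which is why a longer life requirement can translate into running the IC slower or cooler.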
Every manufacturer of components, active or passive, has — or should have — done these types of reliability assessments to ensure that the products they are selling will meet or exceed the reliability needs and expectations of the OEMs incorporating them into their products.
Part of the OEM or system-level engineer’s task now is to determine what parameters of their product change, and what the resulting new requirements are, due to its new use case in a circular economy. These changes may well affect the actual use lifetime of the product and its components and materials.
As product use lifetimes are extended due to the circular economy and reuse, durability (and its analog, reliability) becomes even more important. Products must be designed to last longer and to be more repairable. In our discussion, Fred pointed out that “not everything fails because it’s not derated”; he is right that derating is no panacea. There are numerous other factors, he said, including design failures and counterfeit parts. For example, he said that a third of field failures are design-related: manufacturers failed to anticipate how customers would use a product!
Regarding design failures, Fred noted that mechanical engineers get taught to design margin into products (e.g., model, simulate, and address, as necessary, known common failure mechanisms like creep and fatigue), but electrical engineers generally do not. This is where derating comes in: Fred says it adds resilience and robustness to a design. However, when a design is outsourced (which is increasingly the case these days), unless the Product Requirements Document, or PRD, specifies margin analysis and derating, you may not get as robust a product as needed.
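At its simplest, a derating requirement in a PRD can be expressed as a stress-ratio check: applied stress divided by rated stress must stay at or below an agreed limit. The Python sketch below illustrates the idea. The part references, ratings, and derating limits are hypothetical; real limits come from your own derating guidelines or the component manufacturer.

```python
from dataclasses import dataclass


@dataclass
class PartStress:
    ref: str               # reference designator (hypothetical)
    parameter: str         # stressed parameter, e.g. "voltage" or "power"
    applied: float         # worst-case applied stress in the design
    rated: float           # manufacturer's rated maximum
    derating_limit: float  # maximum allowed applied/rated ratio

    @property
    def stress_ratio(self) -> float:
        return self.applied / self.rated

    def passes(self) -> bool:
        return self.stress_ratio <= self.derating_limit


def derating_violations(parts: list[PartStress]) -> list[PartStress]:
    """Return the parts whose stress ratio exceeds their derating limit."""
    return [p for p in parts if not p.passes()]


if __name__ == "__main__":
    bom = [
        PartStress("C12", "voltage", applied=12.0, rated=16.0, derating_limit=0.5),
        PartStress("R3", "power", applied=0.05, rated=0.125, derating_limit=0.6),
    ]
    for part in derating_violations(bom):
        print(f"{part.ref}: {part.parameter} stress ratio "
              f"{part.stress_ratio:.2f} exceeds limit {part.derating_limit:.2f}")
```

A check like this is only as good as the worst-case applied values fed into it, so it complements margin analysis rather than replacing it.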
Placing a stage, or “meta-layer”, of design robustness research and planning on top of, or in front of, the existing product development process will enforce this assessment. Teamwork is critical: design engineering must work with procurement/commodity management, component engineering, reliability engineering, marketing/program management, and others, with support from executive management, to initially define the product durability/reliability requirements based on product goals. These have to be translated into design- and commodity-specific requirements, then implemented in the design and supply base selection phases of the project.
That means that if, in the past, you were asking for, say, 1,000 hours of life test with zero failures for ICs, now would be a good time to have more in-depth discussions with your component suppliers. Ask how the reliability testing they do relates to the operational and parametric lifetime of their products, and how that in turn relates to the longer use life of your new modular, reusable, and repairable product. This additional level of discussion with the supply base will help you identify adequately robust devices, and suppliers who will support you and build those devices, for your longer-lived product.