
Bench Talk for Design Engineers | The Official Blog of Mouser Electronics

Humans Must Keep AI, Machines in Check Roman Yampolskiy

Unpredictability, Unexplainability, and Incomprehensibility

(Source: Production Perig/Shutterstock.com)


AI researchers are beginning to identify key challenges in engineering AI Safety, such as addressing the value-alignment problem, a primary cause of AI failures over the past 60 years. Another reality that’s come to light: The more intelligent the machine, the less humans are able to predict, explain, and comprehend its impacts.


AI unpredictability refers to our inability to precisely and consistently predict the specific actions a system will take to achieve its objectives. If we imagine an intelligent chess-playing AI, we can predict that it will win, if that’s its goal, but we cannot predict the exact moves it will make to get there. The consequences in this case aren’t significant, but unpredictability scales as intelligence and the complexity of goals increase. As mentioned in Part 2, suppose an AI were tasked with curing cancer; it could theoretically do so by eradicating humans.

Those interim steps depend on a few factors, including the AI’s interactions throughout a process. Microsoft’s Tay(bot), featured in Part 3, started flaming others with inappropriate comments based on its interactions with people online. What’s more, lower-intelligence systems cannot learn to predict the decisions that higher-intelligence systems make. Whereas an advanced AI could theorize all possible options, decisions, or strategies, humans don’t have that capacity. This could similarly be the case for narrow systems that have higher intelligence than humans in a certain domain, even if the system were less competent overall.


Unexplainability refers to the impossibility of explaining decisions made by an intelligent system in a way that is both comprehensible and accurate. An AI used to approve or deny mortgage loans, for example, might use millions or even billions of weighted factors to make decisions. But when an applicant is rejected, the explanation would identify a factor or two such as “bad credit” or “inadequate salary.” That explanation is at best a simplification of how the decision was made. This is akin to lossy image compression, where data is lost in the reduction process even though the resulting image is largely representative of the original. In a similar way, explaining that a mortgage rejection was based on “bad credit” ignores the impact other factors might have had. The resulting explanation is incomplete and therefore not 100 percent accurate.
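The compression analogy can be made concrete with a toy model. The following is a minimal sketch, not how any real lending system works: it assumes a simple weighted-sum score, and the factor names, weights, and the helpers `top_k_explanation` and `explained_fraction` are all hypothetical. It shows how a two-factor explanation can name the largest individual factors while accounting for only a fraction of the total decision.

```python
def top_k_explanation(names, weights, features, k=2):
    """Return the k factors with the largest absolute contribution (weight * value)."""
    contributions = [(n, w * x) for n, w, x in zip(names, weights, features)]
    contributions.sort(key=lambda item: abs(item[1]), reverse=True)
    return contributions[:k]


def explained_fraction(names, weights, features, k=2):
    """Share of the total absolute contribution covered by the top-k factors."""
    total = sum(abs(w * x) for w, x in zip(weights, features))
    top = sum(abs(c) for _, c in top_k_explanation(names, weights, features, k))
    return top / total if total else 1.0


# Hypothetical applicant: two prominent factors plus 1,000 minor ones.
# All names and numbers are invented for illustration.
names = ["credit_score", "salary"] + [f"factor_{i}" for i in range(1000)]
weights = [-3.0, -2.0] + [-0.01] * 1000
features = [1.0] * len(names)

explanation = top_k_explanation(names, weights, features, k=2)
fraction = explained_fraction(names, weights, features, k=2)

print([name for name, _ in explanation])
print(f"Top-2 factors cover {fraction:.0%} of the total contribution")
```

In this made-up case the two reported factors are individually the largest, yet the many small factors together outweigh them, so the short explanation is representative but incomplete.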

Are the other factors necessarily important to explain? They can be. In the US, for example, decisions involving lending, housing, healthcare, and similar areas cannot be made based on protected classes. An AI used to approve or deny mortgage loans cannot use factors such as age or sex in the decision-making process, yet such data can still become factors. For example, if the mortgage company has historically denied loans to Latino women ages 18-25 without a college degree living in San Francisco, an AI might learn that applicants matching those criteria are at higher risk for defaulting on a loan, regardless of other favorable criteria. Here, unpredictability rears its head as well, but it’s a good example of why being able to explain decisions accurately and completely is important.
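One way protected data can become a factor without ever being an input is through a correlated feature. The sketch below is a toy illustration under stated assumptions: the records, the `neighborhood` proxy, the group labels "A" and "B", and both helper functions are hypothetical, and the "model" is just a majority-vote rule per neighborhood rather than any real lending algorithm.

```python
from collections import defaultdict

# Hypothetical historical decisions: (neighborhood, protected_group, approved).
history = (
    [("north", "A", True)] * 80
    + [("north", "A", False)] * 10
    + [("north", "B", True)] * 10
    + [("south", "A", False)] * 20
    + [("south", "B", True)] * 10
    + [("south", "B", False)] * 70
)


def fit_by_neighborhood(records):
    """Learn one rule per neighborhood: approve if most past applicants were approved.

    The protected group is deliberately ignored during fitting.
    """
    counts = defaultdict(lambda: [0, 0])  # neighborhood -> [approved, total]
    for neighborhood, _group, approved in records:
        counts[neighborhood][0] += int(approved)
        counts[neighborhood][1] += 1
    return {n: a / t >= 0.5 for n, (a, t) in counts.items()}


def approval_rate_by_group(model, records):
    """Apply the neighborhood-only model and measure approval rates per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for neighborhood, group, _approved in records:
        counts[group][0] += int(model[neighborhood])
        counts[group][1] += 1
    return {g: a / t for g, (a, t) in counts.items()}


model = fit_by_neighborhood(history)
rates = approval_rate_by_group(model, history)
print(model)
print(rates)
```

Even though the group label never reaches the model, the approval rates it produces differ sharply between the two groups, because neighborhood stands in for group membership in this invented data. An explanation that mentions only "neighborhood" would hide that effect.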


If the mortgage rejection were completely and accurately explained, would the explanation be comprehensible? Comprehensibility is somewhat relative to the individual; a person with a degree in finance or years of experience in the mortgage loan industry would comprehend an accurate and complete explanation more fully (or more easily) than someone without similar domain knowledge. That said, a detailed response from a system that considers a million differently weighted factors would be incomprehensible to humans because we don’t have the storage capacity, memory, and ability to understand that many interconnected variables.

Implications for Safe AI

Unpredictability, unexplainability, and incomprehensibility make achieving 100 percent safe AI impossible because even established standards, laws, and tools could not adequately encourage desired effects or deter unwanted ones. Even if we were capable of predicting AI behavior, we couldn’t effectively control that behavior without limiting the value of the intelligence or systems. And, of course, evaluating and debugging AI failures require comprehensible explanations, which become less and less possible as machine intelligence increases. Up next, Part 5 explores ways that AI Safety will impact the field of engineering.


Dr. Roman V. Yampolskiy is a Tenured Associate Professor in the Department of Computer Science and Engineering at the University of Louisville. He is the founding and current director of the Cyber Security Lab and the author of many books, including Artificial Superintelligence: A Futuristic Approach.
