First published: 2017/11/20

Abstract: Transparency, user trust, and human comprehension are popular ethical
motivations for interpretable machine learning. In support of these goals,
researchers evaluate model explanation performance using humans and real-world
applications. Such evaluation alone presents a challenge in many areas of artificial
intelligence. In this position paper, we propose a distinction between
intelligence. In this position paper, we propose a distinction between
descriptive and persuasive explanations. We discuss reasoning suggesting that
functional interpretability may be correlated with cognitive function and user
preferences. If this is indeed the case, evaluation and optimization using
functional metrics could perpetuate implicit cognitive bias in explanations
that threatens transparency. Finally, we propose two potential research
directions to disambiguate cognitive function and explanation models while
retaining control over the tradeoff between accuracy and interpretability.