SUT‑XR: An External Framework for Evaluating and Improving AI Explanations

Posted at 2026-04-08


Semantic Understanding Theory – External Rating Model

Even when AI is asked to “explain clearly,” common problems arise:

Explanations are overly long
They deviate from the intended meaning
They are redundant
The intended rationale is not conveyed

To address this, I developed SUT‑XR, an external evaluation framework for AI explanations.

This is not a method for improving the AI itself, but a framework for managing the quality of its explanations.

1. Why an “External Frame”?

Even when an AI is given extensive rules, problems remain:

Rules can break midway
The AI may mimic form without genuine understanding
Consistency can be lost

To address these limitations, we reverse the perspective:

Establish a layer outside the AI to evaluate its explanations.

Advantages include:

No additional computational burden on the AI
Human control over explanation quality
Ability to measure improvements via before/after comparisons

2. CISA: Evaluating Explanations Along Four Axes

An explanation can be represented as the following causal flow:

Context → Intent → Structure → Action

Each axis is scored from 0 to 1.

Context

Are the situation and assumptions clearly stated?

Intent

Is the purpose or rationale explicit?

Structure

Are concepts, causality, and flow well-organized?

Action

Are the steps concise, clear, and unambiguous?
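As a minimal sketch of how a rater might record the four axis scores, the container below enforces the 0-to-1 range. The class name `CisaScores` and the helper `weakest_axis` are illustrative assumptions, not part of SUT‑XR itself.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CisaScores:
    """One rating per CISA axis, each in the closed range [0, 1]."""
    context: float
    intent: float
    structure: float
    action: float

    def __post_init__(self) -> None:
        # Reject ratings outside the 0-to-1 scale used by the framework.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be in [0, 1], got {value}")

    def weakest_axis(self) -> str:
        """Name of the lowest-scoring axis (the first one wins on ties)."""
        return min(fields(self), key=lambda f: getattr(self, f.name)).name


# Example rating of a single explanation (values are made up).
rating = CisaScores(context=0.9, intent=0.4, structure=0.7, action=0.8)
```

Looking up the weakest axis is one simple way to decide which part of an explanation to revise first.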

3. Failure Modes: Eight Categories of Explanation Failures

Explanation failures fall into eight categories:

Basic Four

Context_missing
Intent_missing
Structure_missing
Action_missing

Procedural Issue

Procedure_confusion

Qualitative Failures

Inconsistency (contradictions)
Redundancy
Misalignment (misfit with user expectations)

Each failure is assigned a severity: Critical or Minor.
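The eight categories and two severities map naturally onto enumerations. The sketch below is one possible encoding. The numeric penalty per severity is a hypothetical choice, since the framework does not specify how severity translates into a number.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    # Basic Four
    CONTEXT_MISSING = "Context_missing"
    INTENT_MISSING = "Intent_missing"
    STRUCTURE_MISSING = "Structure_missing"
    ACTION_MISSING = "Action_missing"
    # Procedural Issue
    PROCEDURE_CONFUSION = "Procedure_confusion"
    # Qualitative Failures
    INCONSISTENCY = "Inconsistency"
    REDUNDANCY = "Redundancy"
    MISALIGNMENT = "Misalignment"


class Severity(Enum):
    CRITICAL = "Critical"
    MINOR = "Minor"


@dataclass(frozen=True)
class Failure:
    mode: FailureMode
    severity: Severity


# Assumption: illustrative penalty values, not defined by the framework.
SEVERITY_PENALTY = {Severity.CRITICAL: 0.30, Severity.MINOR: 0.05}


def failure_penalty(failures: list) -> float:
    """Sum the assumed penalty of every observed failure."""
    return sum(SEVERITY_PENALTY[f.severity] for f in failures)


observed = [
    Failure(FailureMode.INTENT_MISSING, Severity.CRITICAL),
    Failure(FailureMode.REDUNDANCY, Severity.MINOR),
]
penalty = failure_penalty(observed)
```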

4. UserModel: Estimating the Type of User

Explanation effectiveness depends on user characteristics.

The framework estimates users along three dimensions:

KnowledgeLevel (Beginner → Expert)
GoalUrgency (Need to understand / Immediate solution / Fastest completion)
CognitiveStyle (Intuitive / Analytical)

CISA weights (wC, wI, wS, wA) are dynamically adjusted based on the UserModel.

Examples, by task type:

QuickTask → Action is prioritized
Learning → Structure is prioritized
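One way to picture the weight adjustment: start from equal weights, boost the prioritized axis, and renormalize so the weights sum to 1. The function name `cisa_weights` and the `boost` factor are assumptions made for illustration. The framework only states which axis is prioritized, not by how much.

```python
def cisa_weights(task_type: str, boost: float = 2.0) -> dict:
    """Return weights for the C, I, S, A axes that sum to 1.

    Assumption: the prioritized axis gets `boost` times the base weight.
    Unknown task types fall back to equal weights.
    """
    raw = {"C": 1.0, "I": 1.0, "S": 1.0, "A": 1.0}
    prioritized = {"QuickTask": "A", "Learning": "S"}.get(task_type)
    if prioritized is not None:
        raw[prioritized] *= boost
    total = sum(raw.values())
    return {axis: weight / total for axis, weight in raw.items()}


quick = cisa_weights("QuickTask")   # Action carries the largest weight
learn = cisa_weights("Learning")    # Structure carries the largest weight
```

Normalizing keeps the weighted sum on the same 0-to-1 scale as the individual axis scores, which makes scores comparable across user types.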

5. Evidence: Estimating Understanding from User Reactions

User reactions during interaction are quantified:

ActionSuccess = successful steps / total steps
ErrorRate = mistakes / total steps
ClarificationDepth = depth of re-explanation requests
QuestionRate = questions / total conversation turns

These metrics are combined into Evidence_t.
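The framework does not specify how the four metrics are combined, so the sketch below makes an explicit assumption: each metric is turned into a "higher is better" signal in [0, 1] and the signals are averaged. The parameter `max_depth`, used to normalize ClarificationDepth, is also hypothetical.

```python
def evidence(successful_steps: int, mistakes: int, total_steps: int,
             questions: int, turns: int,
             clarification_depth: int, max_depth: int = 3) -> float:
    """Combine user-reaction metrics into a single Evidence_t in [0, 1].

    Assumption: an unweighted mean of four normalized signals.
    """
    if total_steps <= 0 or turns <= 0 or max_depth <= 0:
        raise ValueError("total_steps, turns and max_depth must be positive")

    action_success = successful_steps / total_steps
    error_rate = mistakes / total_steps
    question_rate = questions / turns
    # Cap the normalized depth at 1 so deep clarification chains
    # cannot push the signal below 0.
    depth_norm = min(clarification_depth / max_depth, 1.0)

    # Invert the "lower is better" metrics before averaging.
    signals = [
        action_success,
        1.0 - error_rate,
        1.0 - question_rate,
        1.0 - depth_norm,
    ]
    return sum(signals) / len(signals)


# 4 of 5 steps succeeded, 1 mistake, 2 questions in 10 turns,
# one level of re-explanation out of an assumed maximum of 2.
e_t = evidence(4, 1, 5, questions=2, turns=10,
               clarification_depth=1, max_depth=2)
```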

6. UnderstandingScore: Overall Explanation Quality

Overall explanation quality is evaluated as follows:

UnderstandingScore =
  wC*C + wI*I + wS*S + wA*A
  - FailurePenalty
  - CognitiveCost

Weights w are derived from the UserModel.
Relative changes are more informative than absolute values.
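The formula translates directly into a few lines of code. In this sketch FailurePenalty and CognitiveCost are passed in as plain numbers, because the framework does not define how they are computed. The dictionary keys "C", "I", "S", "A" and the example values are illustrative.

```python
AXES = ("C", "I", "S", "A")


def understanding_score(scores: dict, weights: dict,
                        failure_penalty: float = 0.0,
                        cognitive_cost: float = 0.0) -> float:
    """wC*C + wI*I + wS*S + wA*A - FailurePenalty - CognitiveCost."""
    weighted = sum(weights[axis] * scores[axis] for axis in AXES)
    return weighted - failure_penalty - cognitive_cost


weights = {"C": 0.25, "I": 0.25, "S": 0.25, "A": 0.25}

# Made-up ratings for the same explanation before and after a revision.
before = understanding_score(
    {"C": 0.6, "I": 0.4, "S": 0.6, "A": 0.8}, weights,
    failure_penalty=0.30, cognitive_cost=0.10)
after = understanding_score(
    {"C": 0.8, "I": 0.8, "S": 0.8, "A": 0.8}, weights,
    failure_penalty=0.05, cognitive_cost=0.05)

improvement = after - before
```

Because the penalty and cost terms are subtracted without clamping, the score can go negative, which is one more reason to read it as a relative measure.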

7. Dynamic Adaptation (Feedback Loop)

Evidence is used to update the running estimate of the user’s understanding:

Understanding_t =
  α * Understanding_{t-1}
+ β * Evidence_t

QuickTask → β is higher
Learning → α is higher

Parameters are adjusted according to task type.
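The update rule is a weighted blend of the previous estimate and the newest evidence. The sketch below applies it over a sequence of evidence values. The concrete (α, β) pairs are hypothetical, and choosing them so that α + β = 1 is an added assumption that keeps the estimate within [0, 1] when the inputs are.

```python
# Assumption: illustrative (alpha, beta) pairs per task type.
TASK_PARAMS = {
    "QuickTask": (0.3, 0.7),  # recent evidence dominates
    "Learning": (0.7, 0.3),   # accumulated history dominates
}


def update_understanding(previous: float, evidence: float,
                         alpha: float, beta: float) -> float:
    """Understanding_t = alpha * Understanding_{t-1} + beta * Evidence_t."""
    return alpha * previous + beta * evidence


def run_loop(initial: float, evidence_series: list, task_type: str) -> list:
    """Apply the update once per evidence value and return every estimate."""
    alpha, beta = TASK_PARAMS[task_type]
    estimates = []
    current = initial
    for e in evidence_series:
        current = update_understanding(current, e, alpha, beta)
        estimates.append(current)
    return estimates


quick = run_loop(0.5, [1.0, 1.0], "QuickTask")
learn = run_loop(0.5, [1.0, 1.0], "Learning")
```

With the same evidence, the QuickTask estimate moves toward the new values faster than the Learning estimate, which matches the intent of a higher β.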

8. Positioning of this Theory

SUT‑XR is not an internal AI algorithm, but a layer for externally evaluating and improving AI explanations.

It sits at the intersection of:

Human–Computer Interaction (HCI)
Explainable AI
Interaction Design

9. Empirical Verification

The framework can be empirically validated in three steps:

Collect before/after versions of an explanation
Score each version using CISA and the failure categories
Compare the resulting scores
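A before/after comparison can be as simple as a per-axis difference plus a diff of the observed failure categories. The function name `compare_ratings` and the sample ratings below are illustrative assumptions, not measured results.

```python
def compare_ratings(before: dict, after: dict,
                    failures_before: set, failures_after: set) -> dict:
    """Report per-axis score changes and which failures were fixed or added."""
    deltas = {axis: round(after[axis] - before[axis], 3) for axis in before}
    return {
        "deltas": deltas,
        "resolved": failures_before - failures_after,
        "introduced": failures_after - failures_before,
    }


# Made-up ratings of one explanation before and after revision.
report = compare_ratings(
    before={"C": 0.5, "I": 0.4, "S": 0.6, "A": 0.7},
    after={"C": 0.8, "I": 0.4, "S": 0.7, "A": 0.6},
    failures_before={"Context_missing", "Redundancy"},
    failures_after={"Redundancy"},
)
```

Keeping the per-axis deltas separate from the overall score makes regressions visible even when the total improves.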

Summary

SUT‑XR is an external evaluation framework for AI explanations, enabling users to:

Measure explanation quality
Improve explanations
Compare before/after results

It is particularly useful for those who find AI explanations confusing or misaligned, providing a structured methodology for improvement.
