Research Track Session RT17: Interpretability
Learning Interpretable Metric between Graphs: Convex Formulation and Computation with Graph Mining
Tomoki Yoshida (Nagoya Institute of Technology); Ichiro Takeuchi (Nagoya Institute of Technology & National Institute for Material Science & RIKEN Center for Advanced Intelligence Project); Masayuki Karasuyama (Nagoya Institute of Technology & National Institute for Material Science & Japan Science and Technology Agency)
Intro
Metric on Graph
Machine learning classifiers implicitly/explicitly require a distance function
e.g., nearest-neighbor classifiers, kernel methods, ...
Existing approaches
- graph kernels, deep nets
Proposed: interpretable graph metric learning (IGML)
- Subgraph Representation
- Graph can be seen as a set of subgraphs
disadvantages
- computationally intractable
- not optimal for classification
proposed: supervised learning of a metric on graphs with a sparsity penalty
Optimizing the distance function (see the sketch below):
- parameterized subgraph-based distance function
- pairwise loss function
- pairwise loss + sparsity penalty
- convex problem
- identifies discriminative subgraphs
- but a huge number of variables need to be optimized
- subset-based optimization
- select a small number of subgraphs
- optimize with respect to M_H only for a subset H ⊂ F
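A minimal sketch of the overall formulation (my own toy reconstruction, not the authors' exact objective): graphs become indicator vectors over candidate subgraphs, the metric is a non-negative weight per subgraph, and a convex pairwise hinge loss plus an L1 penalty is minimized by projected subgradient descent, yielding a sparse set of discriminative subgraphs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_graphs, n_subgraphs = 40, 100
X = (rng.random((n_graphs, n_subgraphs)) < 0.1).astype(float)  # subgraph indicators
X[:20, 0], X[20:, 0] = 1.0, 0.0        # plant one truly discriminative subgraph
y = X[:, 0].astype(int)                # graph labels

def pairwise_objective(m, X, y, lam, m_same=0.5, m_diff=2.0):
    """Hinge-style pairwise loss + L1 penalty on the subgraph weights m >= 0."""
    loss, grad = 0.0, np.zeros_like(m)
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            diff2 = (X[i] - X[j]) ** 2          # squared feature differences
            d = m @ diff2                        # weighted squared distance
            if y[i] == y[j] and d > m_same:      # pull same-label pairs together
                loss, grad = loss + d - m_same, grad + diff2
            elif y[i] != y[j] and d < m_diff:    # push different-label pairs apart
                loss, grad = loss + m_diff - d, grad - diff2
    return loss + lam * m.sum(), grad + lam      # L1 subgradient is lam for m >= 0

m = np.zeros(n_subgraphs)
for step in range(100):
    _, g = pairwise_objective(m, X, y, lam=0.5)
    m = np.maximum(m - 1e-3 * g, 0.0)            # projected subgradient step

print("non-zero subgraph weights:", int((m > 1e-8).sum()), "of", n_subgraphs)
```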
Pruning on the graph mining tree
Tree traversal with pruning
On the identified subgraph space
- observe the estimated subgraph weights
- apply some ML algorithm such as a decision tree
conclusion
- IGML: learning a metric from labeled data
- discriminative
- convex
- sparse
- identify discriminative subgraphs
Axiomatic Interpretability for Multiclass Additive Models
Xuezhou Zhang (University of Wisconsin-Madison); Sarah Tan (Cornell University); Paul Koch (Microsoft Research); Yin Lou (Ant Financial); Urszula Chajewska (Microsoft); Rich Caruana (Microsoft Research)
is binary logistic regression interpretable?
YES, given that the features themselves are interpretable.
Now, what happens with 3 classes?
Example: diabetes diagnosis (Type 1, Type 2, normal)
remedy the inconsistency in how a model is interpreted
model: Generalized additive model (GAM)
three classes
one feature
the generalized linear model (GLM)
generalized additive model (GAM)
Different representations of the same model
Are the functions informative?
the axiom of monotonicity
the problem always has a unique globally optimal solution
the construction process takes almost zero computational cost (see the numeric illustration below)
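A small numeric illustration (mine, not the paper's exact construction) of why a canonical representation is needed: adding the same function g(x) to every class's additive score leaves the softmax probabilities unchanged, so many shape-function sets represent the same multiclass model; centering the functions across classes picks one unique representative at essentially no cost.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

x = np.linspace(-2, 2, 5)
f = np.stack([0.5 * x, x ** 2, -x])        # shape functions for 3 classes, 1 feature
g = np.sin(3 * x)                          # arbitrary shift shared by all classes

p_original = softmax(f.T)                  # class probabilities from original functions
p_shifted = softmax((f + g).T)             # ...and after the shared shift
print(np.allclose(p_original, p_shifted))  # True: same model, different representation

f_canonical = f - f.mean(axis=0)           # center across classes: unique representative
print(np.allclose(softmax(f_canonical.T), p_original))  # still the same model
```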
Interpretability in Action on Infant Mortality
"The risk of all causes of death increases as birthweight decreases": wrong
InterpretML Demo: Global interpretation
InterpretML Demo: local interpretation
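A rough sketch of what the demo does, assuming InterpretML's documented glassbox API; the data here is a synthetic stand-in, not the infant-mortality data from the talk.

```python
import numpy as np
import pandas as pd
from interpret import show
from interpret.glassbox import ExplainableBoostingClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({"birthweight": rng.normal(3.2, 0.6, 500),
                  "gestation_weeks": rng.normal(39, 2, 500)})
y = (rng.random(500) < 0.1).astype(int)     # placeholder labels

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)
show(ebm.explain_global(name="EBM (global)"))              # per-feature shape functions
show(ebm.explain_local(X[:5], y[:5], name="EBM (local)"))  # per-prediction breakdown
```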
Incorporating Interpretability into Latent Factor Models via Fast Influence Analysis
Weiyu Cheng (Shanghai Jiao Tong University); Yanyan Shen (Shanghai Jiao Tong University); Linpeng Huang (Shanghai Jiao Tong University); Yanmin Zhu (Shanghai Jiao Tong University)
background
Latent factor models (LFMs) are SOTA for collaborative filtering (CF).
uncertainty of the model
Influence functions
- the goal of influence functions is to estimate the influence of training points on a model's predictions
Interpreting LFMs with influence functions
influence analysis for MF
fast influence analysis for MF
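A minimal sketch of the influence-function idea specialized to matrix factorization; restricting the Hessian to the shared user's factor block is my simplification to convey the "fast" restriction, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_items = 8, 50
p_u = rng.normal(size=k)                     # trained latent factors of one user
Q = rng.normal(size=(n_items, k))            # trained item factors
rated = rng.choice(n_items, 20, replace=False)
r = Q[rated] @ p_u + rng.normal(0, 0.1, 20)  # that user's observed ratings
lam = 0.1

# Hessian of this user's training loss w.r.t. p_u: sum_i q_i q_i^T + lam * I
H = Q[rated].T @ Q[rated] + lam * np.eye(k)

def influence(train_idx, test_item):
    """Estimated effect of upweighting one training rating on a test prediction."""
    j = rated[train_idx]
    g_train = (p_u @ Q[j] - r[train_idx]) * Q[j]   # grad of that rating's loss w.r.t. p_u
    g_test = Q[test_item]                          # grad of the test prediction w.r.t. p_u
    return -g_test @ np.linalg.solve(H, g_train)

test_item = np.setdiff1d(np.arange(n_items), rated)[0]   # an item the user has not rated
scores = [influence(t, test_item) for t in range(len(rated))]
print("most influential training rating:", int(np.argmax(np.abs(scores))))
```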
data: Yelp and MovieLens
experiments - effectiveness
running time of IA and FIA for MF
conclusion
- explanation method based on influence function
- ...
Improving the Quality of Explanations with Local Embedding Perturbations
Yunzhe Jia (University of Melbourne); James Bailey (University of Melbourne); Kotagiri Ramamohanarao (University of Melbourne); Christopher Leckie (University of Melbourne); Michael E. Houle (National Institute of Informatics)
importance of generating an appropriate neighborhood for local explanation methods
the logical reasoning behind a model's prediction
Global explanation
local explanation for the prediction
Current framework (LIME) (see the sketch below)
step 1: generate a synthetic neighborhood of x
step 2: label the synthetic data
step 3: generate the local explanation from the synthetic neighborhood
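A generic reimplementation of that three-step loop (for illustration only, not the LIME library itself): perturb around x, query the black box, and fit a proximity-weighted linear surrogate whose coefficients serve as the local explanation.

```python
import numpy as np

def local_explanation(black_box, x, n_samples=500, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    # step 1: generate a synthetic neighborhood of x
    Z = x + rng.normal(0.0, sigma, size=(n_samples, x.shape[0]))
    # step 2: label the synthetic points with the black-box model
    yz = black_box(Z)
    # step 3: fit a proximity-weighted linear surrogate on the neighborhood
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * sigma ** 2))
    sw = np.sqrt(w)
    A = np.hstack([Z, np.ones((n_samples, 1))])          # features + intercept
    coef, *_ = np.linalg.lstsq(A * sw[:, None], yz * sw, rcond=None)
    return coef[:-1]                                      # per-feature local importance

black_box = lambda Z: (Z[:, 0] * Z[:, 1] > 0).astype(float)   # toy model
print(local_explanation(black_box, np.array([1.0, 1.0])))
```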
"good" neighbors vs "bad" neighbors
LID: a possible measure of neighbor quality
LID assesses the intrinsic dimensionality of the local data submanifold around a given instance
Local intrinsic dimensionality (LID): intuition
- neighbors locally uniform on a line: LID ≈ 1
- neighbors locally uniform on a plane: LID ≈ 2
- chaotic, far-apart neighbors: LID ≈ 7
what is the local intrinsic dimensionality (LID) at a point x?
Estimation of Local intrinsic dimensionality
Local intrinsic dimensionality (LID) of different synthetic sets.
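A sketch of the standard maximum-likelihood k-NN estimator of LID (the choice of estimator is an assumption on my part), checked on synthetic locally 1-D and locally 2-D neighborhoods.

```python
import numpy as np

def lid_mle(x, data, k=20):
    """Estimate local intrinsic dimensionality at x from its k nearest neighbors."""
    d = np.linalg.norm(data - x, axis=1)
    d = np.sort(d[d > 0])[:k]                  # k smallest positive distances
    return -1.0 / np.mean(np.log(d / d[-1]))   # -(1/k * sum log(r_i / r_k))^-1

rng = np.random.default_rng(0)
line = np.column_stack([rng.random(2000), np.zeros(2000)])   # points on a line
plane = rng.random((2000, 2))                                 # points on a plane
print(lid_mle(np.array([0.5, 0.0]), line))    # close to 1
print(lid_mle(np.array([0.5, 0.5]), plane))   # close to 2
```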
local explanation extraction framework
LEAP: local embedding aided perturbation
experiment
- explanation quality
- explanation fidelity
- methodology
Comparison of the quality of explanations generated by different perturbation methods
locality analysis
Summary:
- synthetic neighborhood generation is a crucial step for local explanation extraction
- local intrinsic dimensionality can be used as a basis for generating better-quality neighborhoods
- our proposed method can be easily integrated with existing explanation frameworks such as LIME
Log2Intent: Towards Interpretable User Modeling via Recurrent Semantics Memory Unit
Zhiqiang Tao (Northeastern University); Sheng Li (University of Georgia); Zhaowen Wang (Adobe Research); Chen Fang (ByteDance AI Lab); Longqi Yang (Cornell University); Handong Zhao (Adobe Research); Yun Fu (Northeastern University)
background
- User modeling learns generic user representations for downstream tasks
- Log-trace data
3 goals
- understand user behavior
- learn user representations
- provide personalized assistance
challenges
- how to mine users' temporal characteristics (usage habits) from long log histories
- how to regularize and further interpret the unstructured user log data
tutorial dataset
- collected 2034 tutorials with various ...
Insights from tutorials
- structure inside tutorials
- key: the same software action corresponds to different sentence annotations due to different temporal contexts
session-to-session modeling
- temporal encoder
- semantics encoder + memory network
- user action decoder
structured and unstructured user data
recurrent semantics memory unit (RSMU)
- each action <> one memory unit
- memory parameterization
- recurrent memory addressing & reading
semantic attention mechanism
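A generic sketch of attention-based memory addressing and reading like that described above; the dot-product scoring, slot contents, and dimensions are illustrative assumptions rather than the paper's exact RSMU design.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
n_actions, d = 30, 16
memory_keys = rng.normal(size=(n_actions, d))     # one slot per software action
memory_vals = rng.normal(size=(n_actions, d))     # semantic content of each slot

def read_memory(h_t):
    """Address the memory with the current hidden state and read a weighted value."""
    attn = softmax(memory_keys @ h_t)              # addressing: similarity to each slot
    return memory_vals.T @ attn, attn              # reading: attention-weighted sum

h_t = rng.normal(size=d)                           # hidden state from the temporal encoder
read_vec, attn = read_memory(h_t)
print("most attended action slot:", int(attn.argmax()))
```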
datasets:
- user log dataset (from Photoshop)
- tutorial dataset
- Behance dataset
- validation criteria
our approach effectively annotates tutorial log sequences via the attention mechanism
log interpretation
user interest detection
task: User action Prediction
summary
- user log-trace data enriches user representations
- long-term temporal dependencies capture user habits
- user log modeling enables personalization
- recurrent semantic attention over the tutorial memory provides interpretability
Interpretable and Steerable Sequence Learning via Prototypes
Yao Ming (Hong Kong University of Science and Technology); Panpan Xu (Bosch Research North America); Huamin Qu (Hong Kong University of Science and Technology); Liu Ren (Bosch Research North America)
motivation
- interpretability
- steerability
case-based reasoning
predicts a new case by consulting similar cases in the past
prototype learning
Prototype Sequence Network (ProSeNet)
ProSeNet: explainable prediction
learning readable prototypes
- jointly training a sequence decoder
prototype projection (see the sketch below)
- project each prototype to its closest embedding among the training sequences
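A small sketch of the projection step: snap each learned prototype vector to the embedding of its closest training sequence so it can be shown to users as a real example (the encoder outputs here are random placeholders, not a trained model).

```python
import numpy as np

rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(500, 32))   # encoder outputs for training sequences
prototypes = rng.normal(size=(10, 32))          # learned prototype vectors

def project_prototypes(prototypes, train_embeddings):
    """Replace each prototype by its nearest training-sequence embedding."""
    dists = np.linalg.norm(prototypes[:, None, :] - train_embeddings[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)              # index of the closest training sequence
    return train_embeddings[nearest], nearest   # projected prototypes + which sequences

projected, idx = project_prototypes(prototypes, train_embeddings)
print("each prototype now equals training sequence:", idx)
```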
incorporating expert knowledge.
inductive biases for interpretability.
- diversity
- sparsity
- simplicity
challenges:
- duplicate prototypes
- diversity & sparsity
- verbosity -> simplicity
beam-search
experiments
- ECG heartbeat classification (time series)
- predictive diagnosis of vehicle faults
- Protein Sequence Classification
user experiments
conclusions
- a sequence learning technique that produces interpretable and steerable prototypes
- experiments and case studies on various real-world sequence classification tasks
- a user experiment demonstrating the benefits of human involvement in interpretability